Geoscience Foundation Models

What are Geoscience Foundation Models

Geoscience foundation models (GFMs) are large-scale, general-purpose artificial intelligence models pre-trained on vast, multi-modal datasets related to Earth systems. They learn a broad base of knowledge about the planet's dynamics and can be adapted (fine-tuned) for a wide range of downstream geoscience tasks, such as weather forecasting, climate modeling, and remote sensing applications.

They are now available on Anvil, Gilbreth, and Gautschi.

Deployed GFMs

Prithvi-EO-2.0

Prithvi-EO-2.0 is the second-generation Earth observation (EO) foundation model jointly developed by IBM, NASA, and Jülich Supercomputing Centre.

The models were pre-trained at the Jülich Supercomputing Centre with NASA's HLS V2 product (30m granularity) using 4.2M samples with six bands in the following order: Blue, Green, Red, Narrow NIR, SWIR, SWIR 2.
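Because pre-training fixed the band order, inputs for fine-tuning or inference generally need the same six bands in the same sequence. The following is a minimal sketch of reordering named bands, assuming they are held in a plain dictionary keyed by the band names above. The `order_bands` helper and the dictionary layout are illustrative and not part of the Prithvi tooling.

```python
# Band order stated for Prithvi-EO-2.0 pre-training (HLS V2, six bands).
PRITHVI_BAND_ORDER = ["Blue", "Green", "Red", "Narrow NIR", "SWIR", "SWIR 2"]


def order_bands(bands):
    """Return band arrays as a list in the pre-training band order.

    `bands` is a hypothetical dict mapping band name -> 2D array-like.
    Raises KeyError if any required band is missing.
    """
    missing = [name for name in PRITHVI_BAND_ORDER if name not in bands]
    if missing:
        raise KeyError(f"missing bands: {missing}")
    return [bands[name] for name in PRITHVI_BAND_ORDER]
```

The returned list can then be stacked into a single array with whichever array library the downstream code uses.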

There are four models (300M, 300M-TL, 600M, and 600M-TL), which differ in the number of parameters and in whether they include temporal and location embeddings.

Check their Model Card here

Available models can be searched as below:


liu4201@login02.anvil:[~] $ module spider Prithvi-EO-2.0

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Prithvi-EO-2.0:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Description:
      Pretrained 600M parameter model with temporal and location embeddings.

     Versions:
        Prithvi-EO-2.0/300M-TL-2025-03-24
        Prithvi-EO-2.0/300M-2025-03-24
        Prithvi-EO-2.0/600M-TL-2025-03-24
        Prithvi-EO-2.0/600M-2025-03-24
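The version strings in this listing appear to encode the parameter count, an optional TL suffix (temporal and location embeddings), and a release date. The sketch below parses them under the assumption that this naming pattern holds. The `parse_version` helper is hypothetical and not part of the module system.

```python
import re

# Assumed pattern: <name>/<size>[-TL]-<YYYY-MM-DD>, inferred from the listing.
_VERSION_RE = re.compile(
    r"^(?P<name>[^/]+)/(?P<size>\d+M)(?P<tl>-TL)?-(?P<date>\d{4}-\d{2}-\d{2})$"
)


def parse_version(module_name):
    """Split a module version string into its apparent components."""
    match = _VERSION_RE.match(module_name)
    if match is None:
        raise ValueError(f"unrecognised module name: {module_name!r}")
    return {
        "name": match.group("name"),
        "size": match.group("size"),
        # True when the "-TL" suffix is present in the version string.
        "temporal_location": match.group("tl") is not None,
        "date": match.group("date"),
    }
```

This can be handy in job scripts that choose a variant programmatically and want to record which size and embedding option was used.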

Any available model above can be loaded and used as below:

module load gfms
module load Prithvi-EO-2.0/600M-2025-03-24

In Python, the model can then be accessed via $MODEL_DIR, as shown below:

>>> import os 
>>> model_path = os.getenv("MODEL_DIR") 
>>> model_path 
'/apps/gfms/Prithvi-EO-2.0/Prithvi-EO-2.0-600M'
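`os.getenv` returns `None` when no model module has been loaded, which can surface later as a confusing error. The sketch below shows one way to fail early, assuming only that the module sets `MODEL_DIR` as shown above. The `resolve_model_dir` helper is illustrative.

```python
import os
from pathlib import Path


def resolve_model_dir(env=None):
    """Return the directory named by MODEL_DIR as a Path.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    value = env.get("MODEL_DIR")
    if not value:
        raise RuntimeError("MODEL_DIR is not set; load a model module first")
    path = Path(value)
    if not path.is_dir():
        raise FileNotFoundError(f"MODEL_DIR is not a directory: {path}")
    return path
```

Calling `sorted(p.name for p in resolve_model_dir().iterdir())` afterwards lists the files shipped with the loaded model.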

Clay

Clay Foundation Model is an open-source foundational model of Earth. It uses an expanded visual transformer upgraded to understand geospatial and temporal relations on Earth data. The model is trained as a self-supervised Masked Autoencoder (MAE).

The Clay model can be used in three main ways:

Generate semantic embeddings for any location and time.
Fine-tune the model for downstream tasks such as classification, regression, and generative tasks.
Use the model as a backbone for other models.

Check their Model Card here

To load the model, run:

module load gfms
module load Clay

In Python, the model can then be accessed via $MODEL_DIR, as shown below:

>>> import os 
>>> model_path = os.getenv("MODEL_DIR") 
>>> model_path 
'/apps/gfms/Clay' 
>>> from claymodel.datamodule import ClayDataModule
>>> from claymodel.module import ClayMAEModule
>>> model = ClayMAEModule.load_from_checkpoint(model_path + "/clay-v1.5.ckpt")
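String concatenation works, but building the checkpoint path with `pathlib` and checking that the file exists can give a clearer error before the model loads. This is a sketch that assumes the checkpoint file name shown above. The `checkpoint_path` helper is illustrative and not part of `claymodel`.

```python
from pathlib import Path


def checkpoint_path(model_dir, filename="clay-v1.5.ckpt"):
    """Return the checkpoint path inside model_dir, or raise if it is absent."""
    path = Path(model_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path
```

The result can be passed to the loader as a string, for example `ClayMAEModule.load_from_checkpoint(str(checkpoint_path(model_path)))`.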

The model can also be used in a Jupyter notebook. Load the modules below, then select the kernel named "gfms_clay":

module load jupyter
module load gfms
module load Clay

Then start Jupyter Notebook in an interactive job by running:

jupyter notebook

Note that the jupyter module must be loaded before Clay so that the gfms_clay kernel can be found. The model can also be accessed via $MODEL_DIR.
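If the kernel does not appear, it may help to confirm which kernels Jupyter can see. The command `jupyter kernelspec list --json` prints the registered kernels as JSON. The sketch below parses that output and reports whether a given kernel name is present. Both helper names are illustrative.

```python
import json
import subprocess


def kernelspec_listing():
    """Return the raw JSON text printed by `jupyter kernelspec list --json`."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def kernel_registered(listing_json, kernel_name):
    """Check whether kernel_name appears in a kernelspec JSON listing."""
    listing = json.loads(listing_json)
    # The command reports kernels under the top-level "kernelspecs" key.
    return kernel_name in listing.get("kernelspecs", {})
```

After loading the modules in the order given, `kernel_registered(kernelspec_listing(), "gfms_clay")` should return `True` if the kernel was picked up.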

Aurora

Aurora is a machine learning model that can predict atmospheric variables, such as temperature. It is a foundation model, which means that it was first trained on a large amount of general data and can then be adapted to specialised atmospheric forecasting tasks with relatively little data.

They provide four such specialised versions: one for medium-resolution weather prediction, one for high-resolution weather prediction, one for air pollution prediction, and one for ocean wave prediction.

Check their Model Card here

To load the model, run:

module load gfms
module load Aurora

In Python, the model can then be used as in the example below, taken from the Aurora documentation:

>>> from aurora import Aurora
>>> model = Aurora()  # default Aurora model
>>> from aurora import AuroraSmallPretrained
>>> model = AuroraSmallPretrained()  # small pretrained variant; replaces the model above
>>> model.load_checkpoint("microsoft/aurora", "aurora-0.25-small-pretrained.ckpt")

The model can also be used in a Jupyter notebook. Load the modules below, then select the kernel named "gfms_aurora":

module load jupyter
module load gfms
module load Aurora

Then start Jupyter Notebook in an interactive job by running:

jupyter notebook

Note that the jupyter module must be loaded before Aurora so that the gfms_aurora kernel can be found. The model can also be accessed via $MODEL_DIR.

TerraMind-1.0

TerraMind is the first multimodal any-to-any generative foundation model for Earth Observation jointly developed by IBM, ESA, and Forschungszentrum Jülich.

TerraMind uses a dual-scale transformer-based encoder-decoder architecture, simultaneously processing pixel-level and token-level data. The model was pre-trained on 500B tokens from 9M spatiotemporally aligned multimodal samples from the TerraMesh dataset.

Modality-specific patch embeddings allow direct processing of raw inputs, while modality-specific FSQ-VAEs are used for image tokenization. For sequence-like modalities such as coordinates, an adapted WordPiece tokenizer is employed. During pre-training, TerraMind leverages masked token reconstruction, learning complex cross-modal correlations to generate high-quality latent representations.

Check their Model Card here

Available models can be searched as below:


liu4201@login02.anvil:[~] $ module spider TerraMind-1.0

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  TerraMind-1.0:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Description:
      TerraMind is the first multimodal any-to-any generative foundation model for Earth Observation jointly developed by IBM, ESA, and Forschungszentrum Jülich.

     Versions:
        TerraMind-1.0/base-2025-08-22
        TerraMind-1.0/large-2025-08-22

Any available model above can be loaded and used as below:

module load gfms
module load TerraMind-1.0/large-2025-08-22
# or, for the base model: module load TerraMind-1.0/base-2025-08-22

In Python, the model can then be accessed via $MODEL_DIR, as shown below:

>>> import os 
>>> model_path = os.getenv("MODEL_DIR") 
>>> model_path 
'/apps/gfms/TerraMind-1.0/TerraMind-1.0-large'