Gabriel Tseng

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

David Rolnick

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, … (see more)elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Our Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

Task-Informed Meta-Learning for Remote Sensing

Gabriel Tseng

Hannah Kerner

David Rolnick

2025-06-11

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)

doi.org

Deploying Geospatial Foundation Models in the Real World: Lessons from WorldCereal

Christina Butsko

Kristof Van Tricht

Gabriel Tseng

Giorgia Milli

David Rolnick

Ruben Cartuyvels

Inbal Becker Reshef

Zoltan Szantoi

Hannah Kerner

The increasing availability of geospatial foundation models has the potential to transform remote sensing applications such as land cover cl… (see more)assification, environmental monitoring, and change detection. Despite promising benchmark results, the deployment of these models in operational settings is challenging and rare. Standardized evaluation tasks often fail to capture real-world complexities relevant for end-user adoption such as data heterogeneity, resource constraints, and application-specific requirements. This paper presents a structured approach to integrate geospatial foundation models into operational mapping systems. Our protocol has three key steps: defining application requirements, adapting the model to domain-specific data and conducting rigorous empirical testing. Using the Presto model in a case study for crop mapping, we demonstrate that fine-tuning a pre-trained model significantly improves performance over conventional supervised methods. Our results highlight the model’s strong spatial and temporal generalization capabilities. Our protocol provides a replicable blueprint for practitioners and lays the groundwork for future research to operationalize foundation models in diverse remote sensing applications. Application of the protocol to the WorldCereal global crop-mapping system showcases the framework’s scalability.

2025-06-10

ICML.cc/2025/Workshop/TerraBytes (published)

doi.org

openreview.net

Galileo: Learning Global&Local Features of Many Remote Sensing Modalities

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

David Rolnick

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, … (see more)elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Our Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-05-01

ICML.cc/2025/Conference (poster)

proceedings.mlr.press

openreview.net

Galileo: Learning Global and Local Features in Pretrained Remote Sensing Models

Gabriel Tseng

A. Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R. Green

Evan Shelhamer

Hannah Kerner

David Rolnick

From crop mapping to flood detection, machine learning in remote sensing has a wide range of societally beneficial applications. The commona… (see more)lities between remote sensing data in these applications present an opportunity for pretrained machine learning models tailored to remote sensing to reduce the labeled data and effort required to solve individual tasks. However, such models must be: (i) flexible enough to ingest input data of varying sensor modalities and shapes (i.e., of varying spatial and temporal dimensions), and (ii) able to model Earth surface phenomena of varying scales and types. To solve this gap, we present Galileo, a family of pretrained remote sensing models designed to flexibly process multimodal remote sensing data. We also introduce a novel and highly effective self-supervised learning approach to learn both large- and small-scale features, a challenge not addressed by previous models. Our Galileo models obtain state-of-the-art results across diverse remote sensing tasks.

2025-02-13

ArXiv (preprint)

arxiv.org

Galileo: Learning Global and Local Features in Pretrained Remote Sensing Models

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

David Rolnick

From crop mapping to flood detection, machine learning in remote sensing has a wide range of societally beneficial applications. The commona… (see more)lities between remote sensing data in these applications present an opportunity for pretrained machine learning models tailored to remote sensing to reduce the labeled data and effort required to solve individual tasks. However, such models must be: (i) flexible enough to ingest input data of varying sensor modalities and shapes (i.e., of varying spatial and temporal dimensions), and (ii) able to model Earth surface phenomena of varying scales and types. To solve this gap, we present Galileo, a family of pretrained remote sensing models designed to flexibly process multimodal remote sensing data. We also introduce a novel and highly effective self-supervised learning approach to learn both large- and small-scale features, a challenge not addressed by previous models. Our Galileo models obtain state-of-the-art results across diverse remote sensing tasks.

2025-02-13

ArXiv (preprint)

doi.org

arxiv.org

Galileo: Learning Global&Local Features of Many Remote Sensing Modalities

Gabriel Tseng

A. Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R. Green

Evan Shelhamer

Hannah Kerner

David Rolnick

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, … (see more)elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and transient) to glaciers (thousands of pixels and persistent). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Our Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-02-13

ArXiv (preprint)

arxiv.org

Task-Informed Meta-Learning for Remote Sensing

Gabriel Tseng

Hannah Kerner

David Rolnick

2025-01-01

CVPR Workshops (published)

dblp.uni-trier.de

Lightweight, Pre-trained Transformers for Remote Sensing Timeseries

Gabriel Tseng

Ruben Cartuyvels

Ivan Zvonkov

Mirali Purohit

David Rolnick

Hannah Kerner

Machine learning methods for satellite data have a range of societally relevant applications, but labels used to train models can be difficu… (see more)lt or impossible to acquire. Self-supervision is a natural solution in settings with limited labeled data, but current self-supervised models for satellite data fail to take advantage of the characteristics of that data, including the temporal dimension (which is critical for many applications, such as monitoring crop growth) and availability of data from many complementary sensors (which can significantly improve a model's predictive performance). We present Presto (the Pretrained Remote Sensing Transformer), a model pre-trained on remote sensing pixel-timeseries data. By designing Presto specifically for remote sensing data, we can create a significantly smaller but performant model. Presto excels at a wide variety of globally distributed remote sensing tasks and performs competitively with much larger models while requiring far less compute. Presto can be used for transfer learning or as a feature extractor for simple models, enabling efficient deployment at scale.

2023-04-27

ArXiv (preprint)

doi.org

openreview.net

Semi-Supervised Object Detection for Agriculture

Gabriel Tseng

Krisztina Sinkovics

Tom Watsham

David Rolnick

Thomas C. Walters

2023-01-26

AAAI.org/2023/Workshop/AIAFS (poster)

openreview.net

TIML: Task-Informed Meta-Learning for Agriculture

Gabriel Tseng

Hannah Kerner

David Rolnick

Labeled datasets for agriculture are extremely spatially imbalanced. When developing algorithms for data-sparse regions, a natural approach … (see more)is to use transfer learning from data-rich regions. While standard transfer learning approaches typically leverage only direct inputs and outputs, geospatial imagery and agricultural data are rich in metadata that can inform transfer learning algorithms, such as the spatial coordinates of data-points or the class of task being learned. We build on previous work exploring the use of meta-learning for agricultural contexts in data-sparse regions and introduce task-informed meta-learning (TIML), an augmentation to model-agnostic meta-learning which takes advantage of task-specific metadata. We apply TIML to crop type classification and yield estimation, and find that TIML significantly improves performance compared to a range of benchmarks in both contexts, across a diversity of model architectures. While we focus on tasks from agriculture, TIML could offer benefits to any meta-learning setup with task-specific metadata, such as classification of geo-tagged images and species distribution modelling.

2022-02-04

ArXiv (preprint)

arxiv.org

Mila AI Policy Conference

Leading in a New Era

TRAIL: Responsible AI for Professionals and Leaders

Gabriel Tseng

Blog Posts

Publications

Mila AI Policy Conference

Leading in a New Era

TRAIL: Responsible AI for Professionals and Leaders

Popular keywords:

Gabriel Tseng

Blog Posts

Publications