David Rolnick

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science and a core academic member of Mila – Quebec Artificial Intelligence Institute, and holds a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. He completed his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT) and has been an NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.

Current Students

Benjamin Akera Binen

Collaborating Alumni - McGill University

Collaborating researcher - Cambridge University

Co-supervisor :

Postdoctorate - McGill University

Michael Bunsen

Collaborating researcher - McGill University

Shahana Chatterjee

Collaborating researcher - N/A

Co-supervisor :

Yuyan Chen

PhD - McGill University

Eya Cherif

Collaborating researcher - Leipzig University

Othmane Echchabi

Master's Research - McGill University

Collaborating researcher

Mohamed Elabbas

Collaborating researcher

Jannik Endres

Collaborating researcher

Jacopo Ghirri

Independent visiting researcher - Politecnico di Milano

Independent visiting researcher

Collaborating researcher - Johannes Kepler University

Christina Isaicu

Collaborating researcher - University of Amsterdam

Gaurav Iyer

Master's Research - McGill University

Julia Kaltenborn

PhD - McGill University

Devin Kwok

PhD - McGill University

Independent visiting researcher - Université de Montréal

Pierre-Louis Lemaire

Collaborating researcher - Polytechnique Montréal

Principal supervisor :

Alex Hernández-García

David Mickisch

Collaborating researcher

Postdoctorate - McGill University

Co-supervisor :

Lena Podina

Collaborating researcher - University of Waterloo

Co-supervisor :

Marlena Reil

Master's Research - McGill University

Rasha Saha

Master's Research - McGill University

Luca Marie Schmidt

Collaborating researcher - University of Tübingen

Independent visiting researcher - Karlsruhe Institute of Technology

Wietze Suijker

Independent visiting researcher

Ilija Trajković

Collaborating researcher - Karlsruhe Institute of Technology

Gabriel Tseng

PhD - McGill University

Donna Vakalis

Collaborating Alumni - Université de Montréal

Principal supervisor :

Collaborating researcher

Catherine Villeneuve

PhD - McGill University

Mohamed Yebari

Collaborating researcher - École Polytechnique Fédérale de Lausanne (EPFL)

Co-supervisor :

Loubna Benabbou

Shan Zhao

Collaborating researcher - Technical University of Munich

Democratizing Access to Satellite Data with AI

Blog Posts

diagram illustrating how the AI foundation model for Earth observation, Galileo, works

October 21, 2025

Gabriel Tseng

David Rolnick


Publications

Estimating Individual Tree Height and Species from UAV Imagery

J. Endres

Étienne Laliberté

Arthur Ouaknine

Accurate estimation of forest biomass, a major carbon sink, relies heavily on tree-level traits such as height and species. Unoccupied Aerial Vehicles (UAVs) capturing high-resolution imagery from a single RGB camera offer a cost-effective and scalable approach for mapping and measuring individual trees. We introduce BIRCH-Trees, the first benchmark for individual tree height and species estimation from tree-centered UAV images, spanning three datasets: temperate forests, tropical forests, and boreal plantations. We also present DINOvTree, a unified approach using a Vision Foundation Model (VFM) backbone with task-specific heads for simultaneous height and species prediction. Through extensive evaluations on BIRCH-Trees, we compare DINOvTree against commonly used vision methods, including VFMs, as well as biological allometric equations. We find that DINOvTree achieves top overall results with accurate height predictions and competitive classification accuracy while using only 54% to 58% of the parameters of the second-best approach.

2026-03-23

arXiv (preprint)

BATIS: Bayesian Approaches for Targeted Improvement of Species Distribution Models

Catherine Villeneuve

Benjamin Akera

Mélisande Teng

Species distribution models (SDMs), which aim to predict species occurrence based on environmental variables, are widely used to monitor and respond to biodiversity change. Recent deep learning advances for SDMs have been shown to perform well on complex and heterogeneous datasets, but their effectiveness remains limited by spatial biases in the data. In this paper, we revisit deep SDMs from a Bayesian perspective and introduce BATIS, a novel and practical framework wherein prior predictions are updated iteratively using limited observational data. Models must appropriately capture both aleatoric and epistemic uncertainty to effectively combine fine-grained local insights with broader ecological patterns. We benchmark an extensive set of uncertainty quantification approaches on a novel dataset including citizen science observations from the eBird platform. Our empirical study shows how Bayesian deep learning approaches can greatly improve the reliability of SDMs in data-scarce locations, which can contribute to ecological understanding and conservation efforts.

2026-03-13

AAAI Conference on Artificial Intelligence (published)

Learning a Spatial Partitioning and its Causal Relations from Temporal Data

Yaniv Gurwicz

Peer Nowack

Jakob Runge

Scientific research often seeks to understand the causal structure underlying high-level variables in a system. For example, climate scientists study how phenomena, such as El Niño, affect other climate processes at remote locations across the globe. However, scientists typically collect low-level measurements, such as geographically distributed temperature readings. From these, one needs to learn both a mapping to causally-relevant latent variables, such as a high-level representation of the El Niño phenomenon and other processes, as well as the causal model over them. The challenge is that this task, called causal representation learning, is highly underdetermined from observational data alone, requiring other constraints during learning to resolve the indeterminacies. In this work, we consider the task of partitioning observed variables into disentangled factors, such as extracting regions from geographically gridded measurement data in climate research or capturing brain regions from neural activity data. We demonstrate the identifiability of the resulting model and propose a differentiable method, Causal Discovery with Single-parent Decoding (CDSD), that simultaneously learns, from temporal data, the underlying latents and a causal graph over them. We assess the validity of our theoretical results using simulated data and showcase the practical validity of our method in an application to real-world data from the climate science field.

2026-03-09

Conference on Causal Learning and Reasoning (poster)

openreview.net

Understanding Representation Gaps across Scales in Tropical Tree Species Classification from Drone Imagery

Sulagna Saha

Arthur Ouaknine

Étienne Laliberté

Carol Altimas

Evan M. Gora

Adriane Esquivel Muelbert

Ian R. McGregor

Cesar Gutierrez

Vanessa E. Rubio

Accurate classification of tropical tree species from unoccupied aerial vehicle (UAV) imagery remains challenging due to high species diversity and strong visual similarity among species at typical image resolutions (centimeters per pixel). In contrast, models trained on close-up citizen science photographs captured with smartphones achieve strong plant species classification performance. Recent advances in UAV data acquisition now enable the collection of close-up images that are spatially registered with top-view aerial imagery and approach the level of visual detail found in smartphone photographs, with the trade-off that such high-resolution photos cannot be acquired for many trees. In this work, we evaluate the performance of existing methods using paired top-view and close-up UAV imagery collected in a species-rich tropical forest. Through fine-tuning experiments, we quantify the performance gap between vision foundation models and in-domain generalist plant recognition models across both image types (high-resolution close-up versus coarser-resolution top-view imagery). We show that classification performance is consistently higher on close-up images than on top-view aerial imagery, and that this performance gap widens for rare species. Finally, we propose that self-supervised representation alignment across these two spatial scales offers a promising approach for integrating fine-grained visual information into canopy-level species classification models based on top-view UAV imagery. Leveraging high-resolution close-up UAV imagery to enhance canopy-level species classification could substantially improve large-scale monitoring of tropical forest biodiversity.

2026-02-28

ML4RS @ International Conference on Learning Representations (published)

openreview.net

Localized, High-resolution Geographic Representations with Slepian Functions

Arjun Rao

Ruth Crasto

Tessa Ooms

Konstantin Klemmer

Marc Rußwurm

Geographic data is fundamentally local. Disease outbreaks cluster in population centers, ecological patterns emerge along coastlines, and economic activity concentrates within country borders. Machine learning models that encode geographic location, however, distribute representational capacity uniformly across the globe, struggling at the fine-grained resolutions that localized applications require. We propose a geographic location encoder built from spherical Slepian functions that concentrate representational capacity inside a region-of-interest and scale to high resolutions without extensive computational demands. For settings requiring global context, we present a hybrid Slepian-Spherical Harmonic encoder that efficiently bridges the tradeoff between local-global performance, while retaining desirable properties such as pole-safety and spherical-surface-distance preservation. Across five tasks spanning classification, regression, and image-augmented prediction, Slepian encodings outperform baselines and retain performance advantages across a wide range of neural network architectures.

2026-01-29

arXiv (preprint)

Benchmarking the geographic generalization of deep learning models for precipitation downscaling

Paula Harder

Luca Schmidt

Francis Pelletier

Nicole Ludwig

Matthew Chantry

Christian Lessig

Alex Hernández-García

Earth System Models (ESM) are our main tool for projecting the impacts of climate change. However, running these models at sufficient resolution for local-scale risk-assessments is not computationally feasible. Deep learning-based super-resolution models offer a promising solution to downscale ESM outputs to higher resolutions by learning from data. Yet, due to regional variations in climatic processes, these models typically require retraining for each geographical area–demanding high-resolution observational data, which is unevenly available across the globe. This highlights the need to assess how well these models generalize across geographic regions. To address this, we introduce RainShift, a dataset and benchmark for evaluating downscaling under geographic distribution shifts. We evaluate state-of-the-art downscaling approaches including GANs and diffusion models in generalizing across data gaps between the Global North and Global South. Our findings reveal substantial performance drops in out-of-distribution regions, depending on model and geographic area. While expanding the training domain generally improves generalization, it is insufficient to overcome shifts between geographically distinct regions. We show that addressing these shifts through, for example, domain adaptation can improve spatial generalization. Our work advances the global applicability of downscaling methods and represents a step toward reducing inequities in access to high-resolution climate information.

2026-01-26

Scientific Reports (published)

In-Context Reinforcement Learning through Bayesian Fusion of Context and Value Prior

Anaïs Berkes

In-context reinforcement learning (ICRL) promises fast adaptation to unseen environments without parameter updates, but current methods either cannot improve beyond the training distribution or require near-optimal data, limiting practical adoption. We introduce SPICE, a Bayesian ICRL method that learns a prior over Q-values via deep ensemble and updates this prior at test-time using in-context information through Bayesian updates. To recover from poor priors resulting from training on sub-optimal data, our online inference follows an Upper-Confidence Bound rule that favours exploration and adaptation. We prove that SPICE achieves regret-optimal behaviour in both stochastic bandits and finite-horizon MDPs, even when pretrained only on suboptimal trajectories. We validate these findings empirically across bandit and control benchmarks. SPICE achieves near-optimal decisions on unseen tasks, substantially reduces regret compared to prior ICRL and meta-RL approaches while rapidly adapting to unseen tasks and remaining robust under distribution shift.

2025-12-31

arXiv (preprint)

Adsorption energies are necessary but not sufficient to identify good catalysts

Alexander Davis

Alexandre AGM Duval

Oleksandr Voznyy

Alex Hernández-García

Félix Therrien

2025-12-04

arXiv (preprint)

Deploying Geospatial Foundation Models in the Real World: Lessons from WorldCereal

Christina Butsko

Gabriel Tseng

Kristof Van Tricht

Giorgia Milli

Ruben Cartuyvels

Inbal Becker Reshef

Zoltan Szantoi

Hannah Kerner

The increasing availability of geospatial foundation models has the potential to transform remote sensing applications such as land cover classification, environmental monitoring, and change detection. Despite promising benchmark results, the deployment of these models in operational settings is challenging and rare. Standardized evaluation tasks often fail to capture real-world complexities relevant for end-user adoption such as data heterogeneity, resource constraints, and application-specific requirements. This paper presents a structured approach to integrate geospatial foundation models into operational mapping systems. Our protocol has three key steps: defining application requirements, adapting the model to domain-specific data and conducting rigorous empirical testing. Using the Presto model in a case study for crop mapping, we demonstrate that fine-tuning a pre-trained model significantly improves performance over conventional supervised methods. Our results highlight the model’s strong spatial and temporal generalization capabilities. Our protocol provides a replicable blueprint for practitioners and lays the groundwork for future research to operationalize foundation models in diverse remote sensing applications. Application of the protocol to the WorldCereal global crop-mapping system showcases the framework’s scalability.

2025-12-01

Proceedings of the TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation (published)

proceedings.mlr.press

On Global Applicability and Location Transferability of Generative Deep Learning Models for Precipitation Downscaling

Paula Harder

Christian Lessig

Matthew Chantry

Francis Pelletier

2025-11-30

arXiv (preprint)

A HOT Dataset: 150,000 Buildings for HVAC Operations Transfer Research

Anaïs Berkes

Donna Vakalis

About 12% of global energy consumption is attributable to heating, ventilation, and air conditioning (HVAC) systems in buildings [11]. Machine learning-based intelligent HVAC control offers significant energy efficiency potential, but progress is constrained by limited data for training and evaluating performance across different kinds of buildings. Existing datasets primarily target energy prediction rather than control applications, forcing studies to rely on limited building sets or single-variable perturbations that fail to capture real-world complexity. We present HOT (HVAC Operations Transfer), the first large-scale open-source dataset purpose-built for research into transfer learning in building control. HOT contains 159,744 unique building-weather combinations with systematic variations across envelope properties, occupancy patterns, and climate conditions spanning all 19 ASHRAE climate zones across 76 global locations. We formalise a comprehensive similarity-based framework with quantitative metrics for assessing transfer feasibility between source and target buildings across multiple context dimensions. Our key contributions: (1) a large-scale, open dataset and tooling enabling systematic, multi-variable transfer studies across 19 climate zones; (2) a quantitative similarity framework spanning geometry, thermal, climate, and function; and (3) zero-shot climate transfer experiments showing why realistic context variation matters for HVAC control.

2025-11-10

ACM Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (published)

A HOT Dataset: 150,000 Buildings for HVAC Operations Transfer Research

Anaïs Berkes

Donna Vakalis

2025-11-10

Proceedings of the 12th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (published)