David Rolnick

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science, a core academic member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. After completing his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT), he was an NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.

Current Students

Aditya Aditya

Collaborating researcher

Collaborating Alumni - McGill University

Collaborating researcher - Cambridge University

Co-supervisor :

Postdoctorate - McGill University

Michael Bunsen

Collaborating researcher - McGill University

Juan Sebastián Cañas

Collaborating researcher

Collaborating researcher - N/A

Co-supervisor :

Yuyan Chen

Master's Research - McGill University

Eya Cherif

Collaborating researcher - Leipzig University

Othmane Echchabi

Master's Research - McGill University

Collaborating researcher

Mohamed Elabbas

Collaborating researcher

Jannik Endres

Collaborating researcher

Jacopo Ghirri

Independent visiting researcher - Politecnico di Milano

Paula Harder

Independent visiting researcher

Collaborating researcher - Université de Montréal

Christina Humer

Collaborating researcher - Johannes Kepler University

Christina Isaicu

Collaborating researcher - University of Amsterdam

Gaurav Iyer

Master's Research - McGill University

Julia Kaltenborn

PhD - McGill University

Devin Kwok

PhD - McGill University

Independent visiting researcher - Université de Montréal

Pierre-Louis Lemaire

Collaborating researcher - Polytechnique Montréal

Principal supervisor :

Alex Hernandez-Garcia

Joshi Manoj

Collaborating researcher - University of East Anglia

David Mickisch

Collaborating researcher

Juan Nathaniel

Collaborating researcher - Columbia University

Postdoctorate - McGill University

Co-supervisor :

Lena Podina

Collaborating researcher - University of Waterloo

Co-supervisor :

Venkatesh Ramesh

Collaborating Alumni - Université de Montréal

Marlena Reil

Master's Research - McGill University

Carla Roesch

Collaborating researcher - Columbia University

Rasha Saha

Master's Research - McGill University

Luca Marie Schmidt

Collaborating researcher - University of Tübingen

Collaborating researcher - Karlsruhe Institute of Technology

Gabriel Tseng

PhD - McGill University

Donna Vakalis

Postdoctorate - Université de Montréal

Principal supervisor :

Collaborating researcher

Catherine Villeneuve

PhD - McGill University

Tiffany Vlaar

Collaborating Alumni - McGill University

Democratizing Access to Satellite Data with AI

Blog Posts

diagram illustrating how the AI foundation model for Earth observation, Galileo, works

October 21, 2025

Gabriel Tseng

David Rolnick


Publications

Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Pratinav Seth

Michelle Lin

Brefo Dwamena Yaw

Jade Boutot

Mary Kang

Millions of abandoned oil and gas wells are scattered across the world, leaching methane into the atmosphere and toxic compounds into the groundwater. Many of these locations are unknown, preventing the wells from being plugged and their polluting effects averted. Remote sensing is a relatively unexplored tool for pinpointing abandoned wells at scale. We introduce the first large-scale benchmark dataset for this problem, leveraging high-resolution multi-spectral satellite imagery from Planet Labs. Our curated dataset comprises over 213,000 wells (abandoned, suspended, and active) from Alberta, a region with especially high well density, sourced from the Alberta Energy Regulator and verified by domain experts. We evaluate baseline algorithms for well detection and segmentation, showing the promise of computer vision approaches but also significant room for improvement.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions

Gül Sena Altıntaş

Devin Kwok

Colin Raffel

Neural network training is inherently sensitive to initialization and the randomness induced by stochastic gradient descent. However, it is unclear to what extent such effects lead to meaningfully different networks, either in terms of the models’ weights or the underlying functions that were learned. In this work, we show that during the initial "chaotic" phase of training, even extremely small perturbations reliably cause otherwise identical training trajectories to diverge, an effect that diminishes rapidly over training time. We quantify this divergence through (i)

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press


Catalyst GFlowNet for electrocatalyst design: A hydrogen evolution reaction case study

Lena Podina

Christina Humer

Alexandre AGM Duval

Victor Schmidt

Ali Ramlaoui

Shahana Chatterjee

Alex Hernandez-Garcia

Félix Therrien

Efficient and inexpensive energy storage is essential for accelerating the adoption of renewable energy and ensuring a stable supply, despite fluctuations in sources such as wind and solar. Electrocatalysts play a key role in hydrogen energy storage (HES), allowing the energy to be stored as hydrogen. However, the development of affordable and high-performance catalysts for this process remains a significant challenge. We introduce Catalyst GFlowNet, a generative model that leverages machine learning-based predictors of formation and adsorption energy to design crystal surfaces that act as efficient catalysts. We demonstrate the performance of the model through a proof-of-concept application to the hydrogen evolution reaction, a key reaction in HES, for which we successfully identified platinum as the most efficient known catalyst. In future work, we aim to extend this approach to the oxygen evolution reaction, where current optimal catalysts are expensive metal oxides, and open the search space to discover new materials. This generative modeling framework offers a promising pathway for accelerating the search for novel and efficient catalysts.

2025-10-02

ArXiv (preprint)


Graph Dreamer: Temporal Graph World Models for Sample-Efficient and Generalisable Reinforcement Learning

Anaïs Berkes

Donna Vakalis

2025-09-22

NeurIPS.cc/2025/Workshop/WiML (published)

openreview.net

Identifying birdsong syllables without labelled data

Mélisande Teng

Julien Boussard

Hugo Larochelle

Identifying sequences of syllables within birdsongs is key to tackling a wide array of challenges, including bird individual identification and better understanding of animal communication and sensory-motor learning. Recently, machine learning approaches have demonstrated great potential to alleviate the need for experts to label long audio recordings by hand. However, they still typically rely on the availability of labelled data for model training, restricting applicability to a few species and datasets. In this work, we build the first fully unsupervised algorithm to decompose birdsong recordings into sequences of syllables. We first detect syllable events, then cluster them to extract templates (syllable representations) before performing matching pursuit to decompose the recording as a sequence of syllables. We evaluate our automatic annotations against human labels on a dataset of Bengalese finch songs and find that our unsupervised method achieves high performance. We also demonstrate that our approach can distinguish individual birds within a species through their unique vocal signatures, for both Bengalese finches and another species, the great tit.

2025-09-22

ArXiv (preprint)

Identifying birdsong syllables without labelled data

Mélisande Teng

Julien Boussard

Hugo Larochelle

2025-09-01

arXiv (published)

doi.org

CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations

Hager Radi

Mélisande Teng

Robin Zbinden

Laura Pollock

Hugo Larochelle

Devis Tuia

Species distribution models (SDMs) are widely used to predict species' geographic distributions, serving as critical tools for ecological research and conservation planning. Typically, SDMs relate species occurrences to environmental variables representing abiotic factors, such as temperature, precipitation, and soil properties. However, species distributions are also strongly influenced by biotic interactions with other species, which are often overlooked. While some methods partially address this limitation by incorporating biotic interactions, they often assume symmetrical pairwise relationships between species and require consistent co-occurrence data. In practice, species observations are sparse, and the availability of information about the presence or absence of other species varies significantly across locations. To address these challenges, we propose CISO, a deep learning-based method for species distribution modeling Conditioned on Incomplete Species Observations. CISO enables predictions to be conditioned on a flexible number of species observations alongside environmental variables, accommodating the variability and incompleteness of available biotic data. We demonstrate our approach using three datasets representing different species groups: sPlotOpen for plants, SatBird for birds, and a new dataset, SatButterfly, for butterflies. Our results show that including partial biotic information improves predictive performance on spatially separate test sets. When conditioned on a subset of species within the same dataset, CISO outperforms alternative methods in predicting the distribution of the remaining species. Furthermore, we show that combining observations from multiple datasets can improve performance. CISO is a promising ecological tool, capable of incorporating incomplete biotic information and identifying potential interactions between species from disparate taxa.

2025-08-08

ArXiv (preprint)