David Rolnick

Biography

David Rolnick is an assistant professor and Canada CIFAR AI Chair in the School of Computer Science at McGill University and a core academic member of Mila – Quebec Artificial Intelligence Institute. His work focuses on applications of machine learning to the fight against climate change. He is co-founder and chair of Climate Change AI and scientific co-director of Sustainability in the Digital Age. He received a PhD in applied mathematics from the Massachusetts Institute of Technology (MIT). He was a National Science Foundation (NSF) Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow, and a Fulbright Scholar. He was named to the MIT Technology Review's list of "35 Innovators Under 35" in 2021.

Current Students

Aditya Aditya

Research collaborator

Alumni collaborator - McGill

Research collaborator - Cambridge University

Co-supervisor:

Postdoctorate - McGill

Michael Bunsen

Research collaborator - McGill

Juan Sebastián Cañas

Research collaborator

Research collaborator - N/A

Co-supervisor:

Yoshua Bengio

Yuyan Chen

Research Master's - McGill

Eya Cherif

Research collaborator - Leipzig University

Othmane Echchabi

Research Master's - McGill

Research collaborator

Mohamed Elabbas

Research collaborator

Jannik Endres

Research collaborator

Jacopo Ghirri

Independent visiting researcher - Politecnico di Milano

Paula Harder

Independent visiting researcher

Research collaborator - UdeM

Christina Humer

Research collaborator - Johannes Kepler University

Christina Isaicu

Research collaborator - University of Amsterdam

Gaurav Iyer

Research Master's - McGill

PhD - McGill

Devin Kwok

PhD - McGill

Research collaborator

Independent visiting researcher - Université de Montréal

Pierre-Louis Lemaire

Research collaborator - Polytechnique Montréal

Principal supervisor:

Alex Hernandez-Garcia

Joshi Manoj

Research collaborator - University of East Anglia

David Mickisch

Research collaborator

Juan Nathaniel

Research collaborator - Columbia University

Postdoctorate - McGill

Co-supervisor:

Lena Podina

Research collaborator - University of Waterloo

Co-supervisor:

Yoshua Bengio

Venkatesh Ramesh

Alumni collaborator - UdeM

Marlena Reil

Research Master's - McGill

Carla Roesch

Research collaborator - Columbia University

Rasha Saha

Research Master's - McGill

Luca Marie Schmidt

Research collaborator - University of Tübingen

Research collaborator - Karlsruhe Institute of Technology

PhD - McGill

Postdoctorate - UdeM

Principal supervisor:

Research collaborator

PhD - McGill

Alumni collaborator - McGill

Democratizing access to satellite data with AI

Blog posts

diagram illustrating how the AI foundation model for Earth observation, Galileo, works

October 21, 2025

by

Gabriel Tseng

David Rolnick

Publications

Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Pratinav Seth

Michelle Lin

Brefo Dwamena Yaw

Jade Boutot

Mary Kang

Millions of abandoned oil and gas wells are scattered across the world, leaching methane into the atmosphere and toxic compounds into the groundwater. Many of these locations are unknown, preventing the wells from being plugged and their polluting effects averted. Remote sensing is a relatively unexplored tool for pinpointing abandoned wells at scale. We introduce the first large-scale benchmark dataset for this problem, leveraging high-resolution multi-spectral satellite imagery from Planet Labs. Our curated dataset comprises over 213,000 wells (abandoned, suspended, and active) from Alberta, a region with especially high well density, sourced from the Alberta Energy Regulator and verified by domain experts. We evaluate baseline algorithms for well detection and segmentation, showing the promise of computer vision approaches but also significant room for improvement.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions

Gül Sena Altıntaş

Devin Kwok

Colin Raffel

Neural network training is inherently sensitive to initialization and the randomness induced by stochastic gradient descent. However, it is unclear to what extent such effects lead to meaningfully different networks, either in terms of the models' weights or the underlying functions that were learned. In this work, we show that during the initial "chaotic" phase of training, even extremely small perturbations reliably cause otherwise identical training trajectories to diverge, an effect that diminishes rapidly over training time. We quantify this divergence through (i)

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

Catalyst GFlowNet for electrocatalyst design: A hydrogen evolution reaction case study

Alexandre AGM Duval

Alex Hernandez-Garcia

Félix Therrien

Efficient and inexpensive energy storage is essential for accelerating the adoption of renewable energy and ensuring a stable supply, despite fluctuations in sources such as wind and solar. Electrocatalysts play a key role in hydrogen energy storage (HES), allowing the energy to be stored as hydrogen. However, the development of affordable and high-performance catalysts for this process remains a significant challenge. We introduce Catalyst GFlowNet, a generative model that leverages machine learning-based predictors of formation and adsorption energy to design crystal surfaces that act as efficient catalysts. We demonstrate the performance of the model through a proof-of-concept application to the hydrogen evolution reaction, a key reaction in HES, for which we successfully identified platinum as the most efficient known catalyst. In future work, we aim to extend this approach to the oxygen evolution reaction, where current optimal catalysts are expensive metal oxides, and open the search space to discover new materials. This generative modeling framework offers a promising pathway for accelerating the search for novel and efficient catalysts.

2025-10-02

arXiv (preprint)

Graph Dreamer: Temporal Graph World Models for Sample-Efficient and Generalisable Reinforcement Learning

Anaïs Berkes

Donna Vakalis

Yoshua Bengio

2025-09-22

NeurIPS.cc/2025/Workshop/WiML (published)

openreview.net

Identifying birdsong syllables without labelled data

Mélisande Teng

Julien Boussard

Hugo Larochelle

Identifying sequences of syllables within birdsongs is key to tackling a wide array of challenges, including bird individual identification and better understanding of animal communication and sensory-motor learning. Recently, machine learning approaches have demonstrated great potential to alleviate the need for experts to label long audio recordings by hand. However, they still typically rely on the availability of labelled data for model training, restricting applicability to a few species and datasets. In this work, we build the first fully unsupervised algorithm to decompose birdsong recordings into sequences of syllables. We first detect syllable events, then cluster them to extract templates (syllable representations) before performing matching pursuit to decompose the recording as a sequence of syllables. We evaluate our automatic annotations against human labels on a dataset of Bengalese finch songs and find that our unsupervised method achieves high performance. We also demonstrate that our approach can distinguish individual birds within a species through their unique vocal signatures, for both Bengalese finches and another species, the great tit.

2025-09-22

arXiv (preprint)

Identifying birdsong syllables without labelled data

Mélisande Teng

Julien Boussard

Hugo Larochelle

2025-09-01

arXiv (published)

doi.org

CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations

Hager Radi

Mélisande Teng

Robin Zbinden

Laura Pollock

Hugo Larochelle

Devis Tuia

Species distribution models (SDMs) are widely used to predict species' geographic distributions, serving as critical tools for ecological research and conservation planning. Typically, SDMs relate species occurrences to environmental variables representing abiotic factors, such as temperature, precipitation, and soil properties. However, species distributions are also strongly influenced by biotic interactions with other species, which are often overlooked. While some methods partially address this limitation by incorporating biotic interactions, they often assume symmetrical pairwise relationships between species and require consistent co-occurrence data. In practice, species observations are sparse, and the availability of information about the presence or absence of other species varies significantly across locations. To address these challenges, we propose CISO, a deep learning-based method for species distribution modeling Conditioned on Incomplete Species Observations. CISO enables predictions to be conditioned on a flexible number of species observations alongside environmental variables, accommodating the variability and incompleteness of available biotic data. We demonstrate our approach using three datasets representing different species groups: sPlotOpen for plants, SatBird for birds, and a new dataset, SatButterfly, for butterflies. Our results show that including partial biotic information improves predictive performance on spatially separate test sets. When conditioned on a subset of species within the same dataset, CISO outperforms alternative methods in predicting the distribution of the remaining species. Furthermore, we show that combining observations from multiple datasets can improve performance. CISO is a promising ecological tool, capable of incorporating incomplete biotic information and identifying potential interactions between species from disparate taxa.

2025-08-08

arXiv (preprint)