David Rolnick

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science, a core academic member of Mila – Quebec Artificial Intelligence Institute, and holder of a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. He completed his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT) and has been an NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow, and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.

Current Students

Benjamin Akera Binen

Collaborating Alumni - McGill University

Collaborating Alumni - Université de Montréal

Collaborating researcher - Cambridge University

Co-supervisor :

Postdoctorate - McGill University

Michael Bunsen

Collaborating researcher - McGill University

Juan Sebastián Cañas

Collaborating researcher

Collaborating researcher - N/A

Co-supervisor :

Yuyan Chen

Master's Research - McGill University

Eya Cherif

Collaborating researcher - Leipzig University

Amna El-Mustafa

Collaborating researcher

Mohamed Elabbas

Collaborating researcher

Paula Harder

Independent visiting researcher

Collaborating researcher - Université de Montréal

Christina Humer

Collaborating researcher - Johannes Kepler University

Christina Isaicu

Collaborating researcher - University of Amsterdam

Gaurav Iyer

Master's Research - McGill University

Julia Kaltenborn

PhD - McGill University

Devin Kwok

PhD - McGill University

Collaborating researcher

Collaborating researcher

Felix Andreas Nahrstedt

Research Intern - Université de Montréal

Juan Nathaniel

Collaborating researcher - Columbia University

Postdoctorate - McGill University

Co-supervisor :

Lena Podina

PhD - University of Waterloo

Co-supervisor :

Collaborating Alumni - Université de Montréal

Marlena Reil

Master's Research - McGill University

Carla Roesch

Collaborating researcher - Columbia University

luca.schmidt@uni-tuebingen.de

Luca Marie Schmidt

Collaborating researcher - University of Tübingen

Collaborating researcher

seth.pratinav@gmail.com

Collaborating researcher - Karlsruhe Institute of Technology

Gabriel Tseng

PhD - McGill University

Donna Vakalis

Postdoctorate - Université de Montréal

Principal supervisor :

Collaborating researcher

anna.viklund@mila.quebec

Catherine Villeneuve

PhD - McGill University

Tiffany Vlaar

Collaborating Alumni - McGill University

Publications

FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models

Nikolaos Ioannis Bountos

Arthur Ouaknine

Ioannis Papoutsis

Forests are an essential part of Earth's ecosystems and natural systems, as well as providing services on which humanity depends, yet they are rapidly changing as a result of land use decisions and climate change. Understanding and mitigating negative effects requires parsing data on forests at global scale from a broad array of sensory modalities, and recently many such problems have been approached using machine learning algorithms for remote sensing. To date, forest-monitoring problems have largely been addressed in isolation. Inspired by the rise of foundation models for computer vision and remote sensing, we here present the first unified Forest Monitoring Benchmark (FoMo-Bench). FoMo-Bench consists of 15 diverse datasets encompassing satellite, aerial, and inventory data, covering a variety of geographical regions, and including multispectral, red-green-blue, synthetic aperture radar (SAR) and LiDAR data with various temporal, spatial and spectral resolutions. FoMo-Bench includes multiple types of forest-monitoring tasks, spanning classification, segmentation, and object detection. To further enhance the diversity of tasks and geographies represented in FoMo-Bench, we introduce a novel global dataset, TalloS, combining satellite imagery with ground-based annotations for tree species classification, encompassing 1,000+ categories across multiple hierarchical taxonomic levels (species, genus, family). Finally, we propose FoMo-Net, a baseline foundation model with the capacity to process any combination of commonly used spectral bands in remote sensing, across diverse ground sampling distances and geographical locations worldwide. This work aims to inspire research collaborations between machine learning and forest biology researchers in exploring scalable multi-modal and multi-task models for forest monitoring. All code and data will be made publicly available.

2023-12-15

ArXiv (preprint)

Towards Causal Representations of Climate Model Data

Julien Boussard

Chandni Nagda

Julia Kaltenborn

Charlotte Emilie Elektra Lange

Philippe Brouillard

Yaniv Gurwicz

Peer Nowack

Climate models, such as Earth system models (ESMs), are crucial for simulating future climate change based on projected Shared Socioeconomic Pathways (SSP) greenhouse gas emissions scenarios. While ESMs are sophisticated and invaluable, machine learning-based emulators trained on existing simulation data can project additional climate scenarios much faster and are computationally efficient. However, they often lack generalizability and interpretability. This work delves into the potential of causal representation learning, specifically the Causal Discovery with Single-parent Decoding (CDSD) method, which could render climate model emulation efficient and interpretable. We evaluate CDSD on multiple climate datasets, focusing on emissions, temperature, and precipitation. Our findings shed light on the challenges, limitations, and promise of using CDSD as a stepping stone towards more interpretable and robust climate model emulation.

2023-12-05

ArXiv (preprint)

Towards Climate Variable Prediction with Conditioned Spatio-Temporal Normalizing Flows

Christina Winkler

2023-11-12

ArXiv (preprint)

SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

Hager Radi

Hugo Larochelle

Biodiversity is declining at an unprecedented rate, impacting ecosystem services necessary to ensure food, water, and human health and well-being. Understanding the distribution of species and their habitats is crucial for conservation policy planning. However, traditional methods in ecology for species distribution models (SDMs) generally focus either on narrow sets of species or narrow geographical areas and there remain significant knowledge gaps about the distribution of species. A major reason for this is the limited availability of data traditionally used, due to the prohibitive amount of effort and expertise required for traditional field monitoring. The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems. We introduce a novel task for mapping bird species to their habitats by predicting species encounter rates from satellite images, and present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird, considering summer (breeding) and winter seasons. We also provide a dataset in Kenya representing low-data regimes. We additionally provide environmental data and species range maps for each location. We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks. SatBird opens up possibilities for scalably modelling properties of ecosystems worldwide.

2023-11-02

ArXiv (preprint)

OpenForest: A data catalogue for machine learning in forest monitoring

Arthur Ouaknine

Teja Kattenborn

Etienne Laliberté

2023-11-01

ArXiv (preprint)

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre AGM Duval

Victor Schmidt

Santiago Miret

Alex Hernandez-Garcia

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-27

NeurIPS 2023 Workshop AI4Mat (poster)

ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning

Julia Kaltenborn

Charlotte Emilie Elektra Lange

Venkatesh Ramesh

Philippe Brouillard

Yaniv Gurwicz

Chandni Nagda

Jakob Runge

Peer Nowack

Climate models have been key for assessing the impact of climate change and simulating future climate scenarios. The machine learning (ML) community has taken an increased interest in supporting climate scientists’ efforts on various tasks such as climate model emulation, downscaling, and prediction tasks. Many of those tasks have been addressed on datasets created with single climate models. However, both the climate science and ML communities have suggested that to address those tasks at scale, we need large, consistent, and ML-ready climate model datasets. Here, we introduce ClimateSet, a dataset containing the inputs and outputs of 36 climate models from the Input4MIPs and CMIP6 archives. In addition, we provide a modular dataset pipeline for retrieving and preprocessing additional climate models and scenarios. We showcase the potential of our dataset by using it as a benchmark for ML-based climate model emulation. We gain new insights about the performance and generalization capabilities of the different ML models by analyzing their performance across different climate models. Furthermore, the dataset can be used to train an ML emulator on several climate models instead of just one. Such a “super-emulator” can quickly project new climate change scenarios, complementing existing scenarios already provided to policymakers. We believe ClimateSet will create the basis needed for the ML community to tackle climate-related tasks at scale.

SatBird: a Dataset for Bird Species Distribution Modeling using Remote Sensing and Citizen Science Data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

Hager Radi

Multi-variable Hard Physical Constraints for Climate Model Downscaling

Jose González-Abad

Álex Hernández-García

Paula Harder

José Manuel Gutiérrez

2023-08-02

ArXiv (preprint)

FAENet: Frame Averaging Equivariant GNN for Materials Modeling

Alexandre AGM Duval

Victor Schmidt

Alex Hernandez-Garcia

Santiago Miret

Fragkiskos D. Malliaros

Applications of machine learning techniques for materials modeling typically involve functions known to be equivariant or invariant to specific symmetries. While graph neural networks (GNNs) have proven successful in such tasks, they enforce symmetries via the model architecture, which often reduces their expressivity, scalability and comprehensibility. In this paper, we introduce (1) a flexible framework relying on stochastic frame-averaging (SFA) to make any model E(3)-equivariant or invariant through data transformations, and (2) FAENet: a simple, fast and expressive GNN, optimized for SFA, that processes geometric information without any symmetry-preserving design constraints. We prove the validity of our method theoretically and empirically demonstrate its superior accuracy and computational scalability in materials modeling on the OC20 dataset (S2EF, IS2RE) as well as common molecular modeling tasks (QM9, QM7-X). A package implementation is available at https://faenet.readthedocs.io.

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

Hidden Symmetries of ReLU Networks

J. Grigsby

Elisenda Grigsby

Kathryn Lindsey

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)