Portrait of David Rolnick

David Rolnick

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
AI and Sustainability
AI for Science
Applied Machine Learning
Biodiversity
Building Energy Management Systems
Climate
Climate Change
Climate Change AI
Climate Modeling
Climate Science
Climate Variable Downscaling
Computer Vision
Conservation Technology
Energy Systems
Forest Monitoring
Machine Learning and Climate Change
Machine Learning for Physical Sciences
Machine Learning in Climate Modeling
Machine Learning Theory
Out-of-Distribution (OOD) Detection
Remote Sensing
Satellite Remote Sensing
Time Series Forecasting
Vegetation

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science, a core academic member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. After completing his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT), he was a NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.

Current Students

Collaborating researcher
Collaborating Alumni - McGill University
Collaborating researcher - Cambridge University
Co-supervisor :
Postdoctorate - McGill University
Collaborating researcher - McGill University
Collaborating researcher - N/A
Co-supervisor :
PhD - McGill University
Collaborating researcher - Leipzig University
Master's Research - McGill University
Collaborating researcher
Collaborating researcher
Collaborating researcher
Independent visiting researcher - Politecnico di Milano
Independent visiting researcher
Collaborating researcher - Johannes Kepler University
Collaborating researcher - University of Amsterdam
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
Independent visiting researcher - Université de Montréal
Collaborating researcher - Polytechnique Montréal Montréal
Principal supervisor :
Collaborating researcher - University of East Anglia
Collaborating researcher
Collaborating researcher - Columbia university
Postdoctorate - McGill University
Co-supervisor :
Collaborating researcher - University of Waterloo
Co-supervisor :
Collaborating Alumni - Université de Montréal
Master's Research - McGill University
Collaborating researcher - Columbia university
Master's Research - McGill University
Collaborating researcher - University of Tübingen
Independent visiting researcher - Karlsruhe Institute of Technology
Independent visiting researcher
Collaborating researcher - Karlsruhe Institute of Technology
PhD - McGill University
Collaborating Alumni - Université de Montréal
Principal supervisor :
Collaborating researcher
PhD - McGill University
Collaborating researcher - Technical University of Munich

Publications

A landmark environmental law looks ahead
Robert L. Fischman
J. B. Ruhl
Brenna R. Forester
Tanya M. Lama
Marty Kardos
Grethel Aguilar Rojas
Nicholas A. Robinson
Patrick D. Shirey
Gary A. Lamberti
Amy W. Ando
Stephen Palumbi
Michael Wara
Mark W. Schwartz
Matthew A. Williamson
Tanya Berger-Wolf
Sara Beery
Justin Kitzes
David Thau
Devis Tuia … (see 8 more)
Daniel Rubenstein
Caleb R. Hickman
Julie Thorstenson
Gregory E. Kaebnick
James P. Collins
Athmeya Jayaram
Thomas Deleuil
Ying Zhao
In late December 1973, the United States enacted what some would come to call “the pitbull of environmental laws.” In the 50 years since… (see more), the formidable regulatory teeth of the Endangered Species Act (ESA) have been credited with considerable successes, obliging agencies to draw upon the best available science to protect species and habitats. Yet human pressures continue to push the planet toward extinctions on a massive scale. With that prospect looming, and with scientific understanding ever changing, Science invited experts to discuss how the ESA has evolved and what its future might hold. —Brad Wible
FoMo: Multi-Modal, Multi-Scale and Multi-Task Remote Sensing Foundation Models for Forest Monitoring
Forests are vital to ecosystems, supporting biodiversity and essential services, but are rapidly changing due to land use and climate change… (see more). Understanding and mitigating negative effects requires parsing data on forests at global scale from a broad array of sensory modalities, and using them in diverse forest monitoring applications. Such diversity in data and applications can be effectively addressed through the development of a large, pre-trained foundation model that serves as a versatile base for various downstream tasks. However, remote sensing modalities, which are an excellent fit for several forest management tasks, are particularly challenging considering the variation in environmental conditions, object scales, image acquisition modes, spatio-temporal resolutions, etc. With that in mind, we present the first unified Forest Monitoring Benchmark (FoMo-Bench), carefully constructed to evaluate foundation models with such flexibility. FoMo-Bench consists of 15 diverse datasets encompassing satellite, aerial, and inventory data, covering a variety of geographical regions, and including multispectral, red-green-blue, synthetic aperture radar and LiDAR data with various temporal, spatial and spectral resolutions. FoMo-Bench includes multiple types of forest-monitoring tasks, spanning classification, segmentation, and object detection. To enhance task and geographic diversity in FoMo-Bench, we introduce TalloS, a global dataset combining satellite imagery with ground-based annotations for tree species classification across 1,000+ categories and hierarchical taxonomic levels. Finally, we propose FoMo-Net, a pre-training framework to develop foundation models with the capacity to process any combination of commonly used modalities and spectral bands in remote sensing.
Towards Causal Representations of Climate Model Data
Charlotte Emilie Elektra Lange
Yaniv Gurwicz
Peer Nowack
Climate models, such as Earth system models (ESMs), are crucial for simulating future climate change based on projected Shared Socioeconomic… (see more) Pathways (SSP) greenhouse gas emissions scenarios. While ESMs are sophisticated and invaluable, machine learning-based emulators trained on existing simulation data can project additional climate scenarios much faster and are computationally efficient. However, they often lack generalizability and interpretability. This work delves into the potential of causal representation learning, specifically the \emph{Causal Discovery with Single-parent Decoding} (CDSD) method, which could render climate model emulation efficient \textit{and} interpretable. We evaluate CDSD on multiple climate datasets, focusing on emissions, temperature, and precipitation. Our findings shed light on the challenges, limitations, and promise of using CDSD as a stepping stone towards more interpretable and robust climate model emulation.
Towards Climate Variable Prediction with Conditioned Spatio-Temporal Normalizing Flows
SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data
Mélisande Teng
Amna Elmustafa
Benjamin Akera
Hager Radi Abdelwahed
OpenForest: a data catalog for machine learning in forest monitoring
Forests play a crucial role in Earth's system processes and provide a suite of social and economic ecosystem services, but are significantly… (see more) impacted by human activities, leading to a pronounced disruption of the equilibrium within ecosystems. Advancing forest monitoring worldwide offers advantages in mitigating human impacts and enhancing our comprehension of forest composition, alongside the effects of climate change. While statistical modeling has traditionally found applications in forest biology, recent strides in machine learning and computer vision have reached important milestones using remote sensing data, such as tree species identification, tree crown segmentation and forest biomass assessments. For this, the significance of open access data remains essential in enhancing such data-driven algorithms and methodologies. Here, we provide a comprehensive and extensive overview of 86 open access forest datasets across spatial scales, encompassing inventories, ground-based, aerial-based, satellite-based recordings, and country or world maps. These datasets are grouped in OpenForest, a dynamic catalogue open to contributions that strives to reference all available open access forest datasets. Moreover, in the context of these datasets, we aim to inspire research in machine learning applied to forest biology by establishing connections between contemporary topics, perspectives and challenges inherent in both domains. We hope to encourage collaborations among scientists, fostering the sharing and exploration of diverse datasets through the application of machine learning methods for large-scale forest monitoring. OpenForest is available at https://github.com/RolnickLab/OpenForest .
On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions
The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorpor… (see more)ate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g.~when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.
ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning
Charlotte Emilie Elektra Lange
Yaniv Gurwicz
Jakob Runge
Peer Nowack
Climate models have been key for assessing the impact of climate change and simulating future climate scenarios. The machine learning (ML) c… (see more)ommunity has taken an increased interest in supporting climate scientists' efforts on various tasks such as climate model emulation, downscaling, and prediction tasks. Many of those tasks have been addressed on datasets created with single climate models. However, both the climate science and ML communities have suggested that to address those tasks at scale, we need large, consistent, and ML-ready climate model datasets. Here, we introduce ClimateSet, a dataset containing the inputs and outputs of 36 climate models from the Input4MIPs and CMIP6 archives. In addition, we provide a modular dataset pipeline for retrieving and preprocessing additional climate models and scenarios. We showcase the potential of our dataset by using it as a benchmark for ML-based climate model emulation. We gain new insights about the performance and generalization capabilities of the different ML models by analyzing their performance across different climate models. Furthermore, the dataset can be used to train an ML emulator on several climate models instead of just one. Such a "super emulator" can quickly project new climate change scenarios, complementing existing scenarios already provided to policymakers. We believe ClimateSet will create the basis needed for the ML community to tackle climate-related tasks at scale.
Multi-variable Hard Physical Constraints for Climate Model Downscaling
Jose Gonz'alez-Abad
'Alex Hern'andez-Garc'ia
Jos'e Manuel Guti'errez
FAENet: Frame Averaging Equivariant GNN for Materials Modeling
Alex Hernandez Garcia
Santiago Miret
Fragkiskos D. Malliaros
Applications of machine learning techniques for materials modeling typically involve functions known to be equivariant or invariant to speci… (see more)fic symmetries. While graph neural networks (GNNs) have proven successful in such tasks, they enforce symmetries via the model architecture, which often reduces their expressivity, scalability and comprehensibility. In this paper, we introduce (1) a flexible framework relying on stochastic frame-averaging (SFA) to make any model E(3)-equivariant or invariant through data transformations. (2) FAENet: a simple, fast and expressive GNN, optimized for SFA, that processes geometric information without any symmetrypreserving design constraints. We prove the validity of our method theoretically and empirically demonstrate its superior accuracy and computational scalability in materials modeling on the OC20 dataset (S2EF, IS2RE) as well as common molecular modeling tasks (QM9, QM7-X). A package implementation is available at https://faenet.readthedocs.io.
Hidden Symmetries of ReLU Networks
J. Elisenda Grigsby
J. Elisenda Grigsby
Kathryn Lindsey
The parameter space for any fixed architecture of feedforward ReLU neural networks serves as a proxy during training for the associated clas… (see more)s of functions - but how faithful is this representation? It is known that many different parameter settings can determine the same function. Moreover, the degree of this redundancy is inhomogeneous: for some networks, the only symmetries are permutation of neurons in a layer and positive scaling of parameters at a neuron, while other networks admit additional hidden symmetries. In this work, we prove that, for any network architecture where no layer is narrower than the input, there exist parameter settings with no hidden symmetries. We also describe a number of mechanisms through which hidden symmetries can arise, and empirically approximate the functional dimension of different network architectures at initialization. These experiments indicate that the probability that a network has no hidden symmetries decreases towards 0 as depth increases, while increasing towards 1 as width and input dimension increase.
Maximal Initial Learning Rates in Deep ReLU Networks
Training a neural network requires choosing a suitable learning rate, which involves a trade-off between speed and effectiveness of converge… (see more)nce. While there has been considerable theoretical and empirical analysis of how large the learning rate can be, most prior work focuses only on late-stage training. In this work, we introduce the maximal initial learning rate