David Rolnick

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science and a core academic member of Mila – Quebec Artificial Intelligence Institute, and holds a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. After completing his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT), he was an NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.

Current Students

Benjamin Akera Binen

Collaborating Alumni - McGill University

Collaborating Alumni - Université de Montréal

Collaborating researcher - Cambridge University

Co-supervisor:

Postdoctorate - McGill University

Michael Bunsen

Collaborating researcher - McGill University

Juan Sebastián Cañas

Collaborating researcher

Collaborating researcher - N/A

Co-supervisor:

Yoshua Bengio

Yuyan Chen

Master's Research - McGill University

Eya Cherif

Research Intern - Leipzig University

Amna El-Mustafa

Collaborating researcher

Mohamed Elabbas

Collaborating researcher

Paula Harder

Independent visiting researcher

Collaborating researcher - Université de Montréal

Christina Humer

Collaborating researcher - Johannes Kepler University

Christina Isaicu

Collaborating researcher - University of Amsterdam

Gaurav Iyer

Master's Research - McGill University

Julia Kaltenborn

PhD - McGill University

Devin Kwok

PhD - McGill University

Collaborating researcher

Collaborating researcher

Felix Andreas Nahrstedt

Research Intern - Université de Montréal

Juan Nathaniel

Collaborating researcher - Columbia University

Postdoctorate - McGill University

Co-supervisor:

Lena Podina

PhD - University of Waterloo

Co-supervisor:

Collaborating Alumni - Université de Montréal

Marlena Reil

Master's Research - McGill University

Carla Roesch

Collaborating researcher - Columbia University

luca.schmidt@uni-tuebingen.de

Luca Marie Schmidt

Collaborating researcher - University of Tübingen

Collaborating researcher

seth.pratinav@gmail.com

Collaborating researcher - Karlsruhe Institute of Technology

Gabriel Tseng

PhD - McGill University

Donna Vakalis

Postdoctorate - Université de Montréal

Principal supervisor:

Collaborating researcher

anna.viklund@mila.quebec

Catherine Villeneuve

PhD - McGill University

Tiffany Vlaar

Collaborating Alumni - McGill University

Publications

Normalization Layers Are All That Sharpness-Aware Minimization Needs

Maximilian Mueller

Tiffany Joyce Vlaar

Matthias Hein

Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of SAM can outperform perturbing all of the parameters. This finding generalizes to different SAM variants and both ResNet (Batch Normalization) and Vision Transformer (Layer Normalization) architectures. We consider alternative sparse perturbation approaches and find that these do not achieve similar performance enhancement at such extreme sparsity levels, showing that this behaviour is unique to the normalization layers. Although our findings reaffirm the effectiveness of SAM in improving generalization performance, they cast doubt on whether this is solely caused by reduced sharpness.

PhAST: Physics-Aware, Scalable, and Task-specific GNNs for Accelerated Catalyst Design

Alexandre AGM Duval

Victor Schmidt

Santiago Miret

Yoshua Bengio

Alex Hernandez-Garcia

Mitigating the climate crisis requires a rapid transition towards lower-carbon energy. Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis. To reduce the energy spent on such activities, we must quickly discover more efficient catalysts to drive electrochemical reactions. Machine learning (ML) holds the potential to efficiently model materials properties from large amounts of data, accelerating electrocatalyst design. The Open Catalyst Project OC20 dataset was constructed to that end. However, ML models trained on OC20 are still neither scalable nor accurate enough for practical applications. In this paper, we propose task-specific innovations applicable to most architectures, enhancing both computational efficiency and accuracy. This includes improvements in (1) the graph creation step, (2) atom representations, (3) the energy prediction head, and (4) the force prediction head. We describe these contributions, referred to as PhAST, and evaluate them thoroughly on multiple architectures. Overall, PhAST improves energy MAE by 4 to 42

2022-11-22

ArXiv (preprint)

Digitalization and the Anthropocene

Felix Creutzig

Daron Acemoglu

Xuemei Bai

Paul N. Edwards

Marie Josefine Hintz

Lynn H. Kaack

Siir Kilkis

Stefanie Kunkel

Amy Luers

Nikola Milojevic-Dupont

Dave Rejeski

Jürgen Renn

Christoph Rosol

Daniela Russ

Thomas Turnbull

Elena Verdolini

Felix Wagner

Charlie Wilson

Aicha Zekar

Marius Zumwald

Great claims have been made about the benefits of dematerialization in a digital service economy. However, digitalization has historically increased environmental impacts at local and planetary scales, affecting labor markets, resource use, governance, and power relationships. Here we study the past, present, and future of digitalization through the lens of three interdependent elements of the Anthropocene: (a) planetary boundaries and stability, (b) equity within and between countries, and (c) human agency and governance, mediated via (i) increasing resource efficiency, (ii) accelerating consumption and scale effects, (iii) expanding political and economic control, and (iv) deteriorating social cohesion. While direct environmental impacts matter, the indirect and systemic effects of digitalization are more profoundly reshaping the relationship between humans, technosphere and planet. We develop three scenarios: planetary instability, green but inhumane, and deliberate for the good. We conclude with identifying leverage points that shift human–digital–Earth interactions toward sustainability. Expected final online publication date for the Annual Review of Environment and Resources, Volume 47 is October 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

2022-09-02

Annual Review of Environment and Resources (published)

A portrait of the different configurations between digitally-enabled innovations and climate governance

Pierre J. C. Chuard

Jennifer Garard

Karsten A. Schulz

Nilushi Kumarasinghe

Damon Matthews

2022-08-01

Earth System Governance (published)

Neural Networks as Paths through the Space of Representations

Richard D Lange

Devin Kwok

Jordan Kyle Matelsky

Xinyue Wang

Konrad Paul Kording

2022-06-22

ArXiv (preprint)

Clustering units in neural networks: upstream vs downstream information

Richard D Lange

Konrad Paul Kording

It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying modularity remains an open problem. We cast the problem of detecting functional modules into the problem of detecting clusters of similar-functioning units. This begs the question of what makes two units functionally similar. For this, we consider two broad families of methods: those that define similarity based on how units respond to structured variations in inputs ("upstream"), and those based on how variations in hidden unit activations affect outputs ("downstream"). We conduct an empirical study quantifying modularity of hidden layer representations of simple feedforward, fully connected networks, across a range of hyperparameters. For each model, we quantify pairwise associations between hidden units in each layer using a variety of both upstream and downstream measures, then cluster them by maximizing their "modularity score" using established tools from network science. We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects. Second, although we observe that there is usually good agreement about clusters within both upstream methods and downstream methods, there is little agreement about the cluster assignments across these two families of methods. This has important implications for representation-learning, as it suggests that finding modular representations that reflect structure in inputs (e.g. disentanglement) may be a distinct goal from learning modular representations that reflect structure in outputs (e.g. compositionality).

2022-06-13

TMLR (accepted)

On Neural Architecture Inductive Biases for Relational Tasks

Giancarlo Kerg

Sarthak Mittal

Current deep learning approaches have shown good in-distribution generalization performance, but struggle with out-of-distribution generalization. This is especially true in the case of tasks involving abstract relations like recognizing rules in sequences, as we find in many intelligence tests. Recent work has explored how forcing relational representations to remain distinct from sensory representations, as it seems to be the case in the brain, can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by 'partitioned' representations of relations and sensory details, and how this inductive bias can help recompose learned relational structure in newly encountered settings. We introduce a simple architecture based on similarity scores which we name Compositional Relational Network (CoRelNet). Using this model, we investigate a series of inductive biases that ensure abstract relations are learned and represented distinctly from sensory data, and explore their effects on out-of-distribution generalization for a series of relational psychophysics tasks. We find that simple architectural choices can outperform existing models in out-of-distribution generalization. Together, these results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing out-of-distribution relational computations.

2022-06-09

ArXiv (preprint)

Aligning artificial intelligence with climate change mitigation

Lynn H. Kaack

Priya L. Donti

Emma Strubell

George Yoshito Kamiya

Felix Creutzig

2022-06-01

Nature Climate Change (published)

Inductive Biases for Relational Tasks

Giancarlo Kerg

Sarthak Mittal

Current deep learning approaches have shown good in-distribution performance but struggle in out-of-distribution settings. This is especially true in the case of tasks involving abstract relations like recognizing rules in sequences, as required in many intelligence tests. In contrast, our brains are remarkably flexible at such tasks, an attribute that is likely linked to anatomical constraints on computations. Inspired by this, recent work has explored how enforcing that relational representations remain distinct from sensory representations can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by "partitioned" representations of relations and sensory details. We investigate inductive biases that ensure abstract relations are learned and represented distinctly from sensory data across several neural network architectures and show that they outperform existing architectures on out-of-distribution generalization for various relational tasks. These results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing relational computations.

2022-03-25

ICLR.cc/2022/Workshop/OSC (poster)

Tackling Climate Change with Machine Learning

Priya L. Donti

Lynn H. Kaack

Kelly Kochanski

Alexandre Lacoste

Kris Sankaran

Andrew Slavin Ross

Nikola Milojevic-Dupont

Natasha Jaques

Anna Waldman-Brown

Alexandra Luccioni

Tegan Maharaj

Evan David Sherwin

S. Karthik Mukkavilli

Konrad Paul Kording

Carla P. Gomes

Andrew Y. Ng

Demis Hassabis

John C. Platt

Felix Creutzig

Jennifer T Chayes

Yoshua Bengio

Climate change is one of the greatest challenges facing humanity, and we, as machine learning (ML) experts, may wonder how we can help. Here we describe how ML can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by ML, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the ML community to join the global effort against climate change.

2022-02-07

ACM Computing Surveys (published)

TIML: Task-Informed Meta-Learning for Agriculture

Gabriel Tseng

Hannah Kerner

Labeled datasets for agriculture are extremely spatially imbalanced. When developing algorithms for data-sparse regions, a natural approach is to use transfer learning from data-rich regions. While standard transfer learning approaches typically leverage only direct inputs and outputs, geospatial imagery and agricultural data are rich in metadata that can inform transfer learning algorithms, such as the spatial coordinates of data-points or the class of task being learned. We build on previous work exploring the use of meta-learning for agricultural contexts in data-sparse regions and introduce task-informed meta-learning (TIML), an augmentation to model-agnostic meta-learning which takes advantage of task-specific metadata. We apply TIML to crop type classification and yield estimation, and find that TIML significantly improves performance compared to a range of benchmarks in both contexts, across a diversity of model architectures. While we focus on tasks from agriculture, TIML could offer benefits to any meta-learning setup with task-specific metadata, such as classification of geo-tagged images and species distribution modelling.

2022-02-04

ArXiv (preprint)

Understanding the Evolution of Linear Regions in Deep Reinforcement Learning

Setareh Cohan

Nam Hee Gordon Kim

Michiel van de Panne

Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.