
David Rolnick

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Machine Learning Theory

Biography

David Rolnick is an assistant professor at McGill University’s School of Computer Science, a core academic member of Mila – Quebec Artificial Intelligence Institute, and a Canada CIFAR AI Chair. Rolnick’s work focuses on applications of machine learning to help address climate change. He is the co-founder and chair of Climate Change AI, and scientific co-director of Sustainability in the Digital Age. After completing his PhD in applied mathematics at the Massachusetts Institute of Technology (MIT), he was an NSF Mathematical Sciences Postdoctoral Research Fellow, an NSF Graduate Research Fellow and a Fulbright Scholar. He was named to MIT Technology Review’s “35 Innovators Under 35” in 2021.


Publications

Maximal Initial Learning Rates in Deep ReLU Networks
Gaurav Iyer
Boris Hanin
Training a neural network requires choosing a suitable learning rate, which involves a trade-off between speed and effectiveness of convergence. While there has been considerable theoretical and empirical analysis of how large the learning rate can be, most prior work focuses only on late-stage training. In this work, we introduce the maximal initial learning rate…
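As a toy illustration of the concept (this is not code from the paper), one can locate the largest workable learning rate empirically by bisection. On the quadratic f(x) = x², gradient descent converges exactly when the learning rate is below 1, and the search recovers that boundary:

```python
def trains_successfully(lr, steps=100):
    """Run gradient descent on f(x) = x^2 from x0 = 1.0 and report
    whether the iterate has shrunk below its starting magnitude."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x          # gradient of x^2 is 2x
    return abs(x) < 1.0          # diverges once lr >= 1 for this quadratic

def maximal_initial_lr(lo=0.0, hi=10.0, iters=50):
    """Bisect for the largest learning rate at which training still converges."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if trains_successfully(mid):
            lo = mid
        else:
            hi = mid
    return lo

print(round(maximal_initial_lr(), 3))  # → 1.0
```

For a deep network the boundary must be probed empirically, as in the paper, but the same search structure applies.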
Semi-Supervised Object Detection for Agriculture
Gabriel Tseng
Krisztina Sinkovics
Tom Watsham
Thomas C. Walters
Bugs in the Data: How ImageNet Misrepresents Biodiversity
Alexandra Luccioni
ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild animals make up 27% of ImageNet-1k but, unlike classes representing people and objects, these data have not been closely scrutinized. In the current paper, we analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set, with the participation of expert ecologists. We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled, with some classes having >90% of images incorrect. We also find that both the wildlife-related labels and images included in ImageNet-1k present significant geographical and cultural biases, as well as ambiguities such as artificial animals, multiple species in the same image, or the presence of humans. Our findings highlight serious issues with the extensive use of this dataset for evaluating ML systems, the use of such algorithms in wildlife-related tasks, and more broadly the ways in which ML datasets are commonly created and curated.
Deep Networks as Paths on the Manifold of Neural Representations
Richard D Lange
Devin Kwok
Jordan Kyle Matelsky
Xinyue Wang
Konrad Paul Kording
General Purpose AI Systems in the AI Act: Trying to Fit a Square Peg Into a Round Hole
Claire Boine
Normalization Layers Are All That Sharpness-Aware Minimization Needs
Maximilian Mueller
Tiffany Joyce Vlaar
Matthias Hein
Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of SAM can outperform perturbing all of the parameters. This finding generalizes to different SAM variants and both ResNet (Batch Normalization) and Vision Transformer (Layer Normalization) architectures. We consider alternative sparse perturbation approaches and find that these do not achieve similar performance enhancement at such extreme sparsity levels, showing that this behaviour is unique to the normalization layers. Although our findings reaffirm the effectiveness of SAM in improving generalization performance, they cast doubt on whether this is solely caused by reduced sharpness.
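A minimal sketch of the mechanism, assuming a toy two-parameter model rather than a real network (the setup and names are illustrative, not from the paper): the adversarial ascent perturbs only the normalization-style scale, yet both parameters descend using gradients taken at the perturbed point.

```python
def loss_and_grads(w, g, target=1.0):
    """Toy model: prediction g * w, squared-error loss. Here g plays the role
    of an affine normalization scale and w of an ordinary weight."""
    r = g * w - target
    return r * r, 2 * r * g, 2 * r * w  # loss, dL/dw, dL/dg

def sam_norm_only_step(w, g, lr=0.05, rho=0.05):
    """One SAM step whose adversarial (ascent) perturbation touches ONLY the
    normalization-style parameter g."""
    _, _, dg = loss_and_grads(w, g)
    eps = rho * dg / (abs(dg) + 1e-12)              # normalized ascent on g alone
    _, dw_adv, dg_adv = loss_and_grads(w, g + eps)  # gradients at perturbed point
    return w - lr * dw_adv, g - lr * dg_adv         # both parameters descend

w, g = 0.2, 0.5
for _ in range(200):
    w, g = sam_norm_only_step(w, g)
print(loss_and_grads(w, g)[0] < 1e-2)  # training still converges
```

In the paper the same pattern is applied to real architectures, where the normalization parameters are a tiny fraction of the total.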
PhAST: Physics-Aware, Scalable, and Task-specific GNNs for Accelerated Catalyst Design
Alexandre AGM Duval
Victor Schmidt
Santiago Miret
Alex Hernandez-Garcia
Mitigating the climate crisis requires a rapid transition towards lower-carbon energy. Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis. To reduce the energy spent on such activities, we must quickly discover more efficient catalysts to drive electrochemical reactions. Machine learning (ML) holds the potential to efficiently model materials properties from large amounts of data, accelerating electrocatalyst design. The Open Catalyst Project OC20 dataset was constructed to that end. However, ML models trained on OC20 are still neither scalable nor accurate enough for practical applications. In this paper, we propose task-specific innovations applicable to most architectures, enhancing both computational efficiency and accuracy. This includes improvements in (1) the graph creation step, (2) atom representations, (3) the energy prediction head, and (4) the force prediction head. We describe these contributions, referred to as PhAST, and evaluate them thoroughly on multiple architectures. Overall, PhAST improves energy MAE by 4 to 42%.
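To make the "graph creation step" concrete, here is a generic radius-cutoff neighbour graph of the kind such GNN pipelines typically build over atoms (a hedged illustration, not PhAST's actual implementation):

```python
from itertools import combinations
from math import dist

def radius_graph(positions, cutoff):
    """Connect every pair of atoms closer than `cutoff` (a generic
    neighbour-graph construction, not PhAST's actual code)."""
    return [(i, j)
            for (i, p), (j, q) in combinations(enumerate(positions), 2)
            if dist(p, q) < cutoff]

# Three atoms on a line, 1 unit apart; cutoff 1.5 links only nearest neighbours.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_graph(atoms, cutoff=1.5))  # → [(0, 1), (1, 2)]
```

Choices such as the cutoff radius and which atoms to include are exactly the kind of graph-creation decisions the paper tunes for efficiency.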
Digitalization and the Anthropocene
Felix Creutzig
Daron Acemoglu
Xuemei Bai
Paul N. Edwards
Marie Josefine Hintz
Lynn H. Kaack
Siir Kilkis
Stefanie Kunkel
Amy Luers
Nikola Milojevic-Dupont
Dave Rejeski
Jürgen Renn
Christoph Rosol
Daniela Russ
Thomas Turnbull
Elena Verdolini
Felix Wagner
Charlie Wilson
Aicha Zekar
Marius Zumwald
Great claims have been made about the benefits of dematerialization in a digital service economy. However, digitalization has historically increased environmental impacts at local and planetary scales, affecting labor markets, resource use, governance, and power relationships. Here we study the past, present, and future of digitalization through the lens of three interdependent elements of the Anthropocene: (a) planetary boundaries and stability, (b) equity within and between countries, and (c) human agency and governance, mediated via (i) increasing resource efficiency, (ii) accelerating consumption and scale effects, (iii) expanding political and economic control, and (iv) deteriorating social cohesion. While direct environmental impacts matter, the indirect and systemic effects of digitalization are more profoundly reshaping the relationship between humans, the technosphere, and the planet. We develop three scenarios: planetary instability, green but inhumane, and deliberate for the good. We conclude by identifying leverage points that shift human–digital–Earth interactions toward sustainability. Expected final online publication date for the Annual Review of Environment and Resources, Volume 47 is October 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
A portrait of the different configurations between digitally-enabled innovations and climate governance
Pierre J. C. Chuard
Jennifer Garard
Karsten A. Schulz
Nilushi Kumarasinghe
Damon Matthews
Neural Networks as Paths through the Space of Representations
Richard D Lange
Devin Kwok
Jordan Kyle Matelsky
Xinyue Wang
Konrad Paul Kording
Clustering units in neural networks: upstream vs downstream information
Richard D Lange
Konrad Paul Kording
It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying modularity remains an open problem. We cast the problem of detecting functional modules into the problem of detecting clusters of similar-functioning units. This begs the question of what makes two units functionally similar. For this, we consider two broad families of methods: those that define similarity based on how units respond to structured variations in inputs ("upstream"), and those based on how variations in hidden unit activations affect outputs ("downstream"). We conduct an empirical study quantifying modularity of hidden layer representations of simple feedforward, fully connected networks, across a range of hyperparameters. For each model, we quantify pairwise associations between hidden units in each layer using a variety of both upstream and downstream measures, then cluster them by maximizing their "modularity score" using established tools from network science. We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects. Second, although we observe that there is usually good agreement about clusters within both upstream methods and downstream methods, there is little agreement about the cluster assignments across these two families of methods. This has important implications for representation-learning, as it suggests that finding modular representations that reflect structure in inputs (e.g. disentanglement) may be a distinct goal from learning modular representations that reflect structure in outputs (e.g. compositionality).
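A toy illustration of the "upstream" route (illustrative only; the paper uses richer similarity measures and proper modularity maximization, for which thresholded connected components is a simplified stand-in):

```python
import numpy as np

def upstream_similarity(acts):
    """acts: (n_inputs, n_units) activations. Pairwise |correlation| across
    inputs -- an 'upstream' notion of functional similarity between units."""
    return np.abs(np.corrcoef(acts.T))

def cluster_units(sim, threshold=0.9):
    """Simplified stand-in for modularity maximization: link unit pairs whose
    similarity exceeds a threshold, then take connected components (union-find)."""
    n = sim.shape[0]
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] > threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(n)]

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y = rng.normal(size=(200, 1))
# Four hidden units: two driven by x, two by y -> two functional modules.
acts = np.hstack([x, -x, y, 2 * y]) + 0.01 * rng.normal(size=(200, 4))
print(cluster_units(upstream_similarity(acts)))  # → [0, 0, 2, 2]
```

A "downstream" measure would instead compare how perturbing each unit changes the network's outputs, and, as the abstract notes, the two families need not agree.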
On Neural Architecture Inductive Biases for Relational Tasks
Current deep learning approaches have shown good in-distribution generalization performance, but struggle with out-of-distribution generalization. This is especially true in the case of tasks involving abstract relations like recognizing rules in sequences, as we find in many intelligence tests. Recent work has explored how forcing relational representations to remain distinct from sensory representations, as it seems to be the case in the brain, can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by 'partitioned' representations of relations and sensory details, and how this inductive bias can help recompose learned relational structure in newly encountered settings. We introduce a simple architecture based on similarity scores which we name Compositional Relational Network (CoRelNet). Using this model, we investigate a series of inductive biases that ensure abstract relations are learned and represented distinctly from sensory data, and explore their effects on out-of-distribution generalization for a series of relational psychophysics tasks. We find that simple architectural choices can outperform existing models in out-of-distribution generalization. Together, these results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing out-of-distribution relational computations.
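The core "similarity scores" idea can be sketched in a few lines (a hypothetical simplification using cosine similarity; the model's actual formulation may differ):

```python
import numpy as np

def relation_matrix(embeddings):
    """Hypothetical sketch of the CoRelNet idea: represent a set of objects
    purely by their pairwise similarity scores (cosine here), so that the
    downstream reasoner sees relations but not the sensory embeddings."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T  # (n, n) relations, the only input to a downstream scorer

# 'Same/different' toy: the relation matrix, not the raw features, exposes the rule.
objs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
r = relation_matrix(objs)
print(r[0, 1], r[0, 2])  # → 1.0 0.0
```

Because the relation matrix is invariant to what the objects concretely are, a reasoner trained on it can transfer a learned rule to unseen sensory content, which is the partitioning benefit the abstract describes.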