Portrait of Blake Richards

Blake Richards

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science and Department of Neurology and Neurosurgery
Research Topics
Computational Neuroscience
Generative Models
Reinforcement Learning
Representation Learning

Biography

Blake Richards is an associate professor at the School of Computer Science and in the Department of Neurology and Neurosurgery at McGill University, and a core academic member of Mila – Quebec Artificial Intelligence Institute.

Richards’ research lies at the intersection of neuroscience and AI. His laboratory investigates universal principles of intelligence that apply to both natural and artificial agents.

He has received several awards for his work, including the NSERC Arthur B. McDonald Fellowship in 2022, the Canadian Association for Neuroscience Young Investigator Award in 2019, and a Canada CIFAR AI Chair in 2018. Richards was a Banting Postdoctoral Fellow at SickKids Hospital from 2011 to 2013.

He obtained his PhD in neuroscience from the University of Oxford in 2010, and his BSc in cognitive science and AI from the University of Toronto in 2004.

Current Students

Independent visiting researcher - Seoul National University
Research Intern - McGill University
Postdoctorate - McGill University
Postdoctorate - Université de Montréal
Principal supervisor :
PhD - McGill University
Co-supervisor :
Postdoctorate - Université de Montréal
Principal supervisor :
PhD - McGill University
Principal supervisor :
PhD - McGill University
Postdoctorate - McGill University
Research Intern - McGill University
PhD - McGill University
Independent visiting researcher - Seoul National University
PhD - McGill University
Undergraduate - McGill University
Collaborating Alumni
Independent visiting researcher - University of Oregon
PhD - McGill University
Independent visiting researcher - ETH Zurich
Collaborating researcher - Georgia Tech
Postdoctorate - McGill University
Postdoctorate - McGill University
Undergraduate - McGill University
PhD - McGill University
Master's Research - McGill University
PhD - Université de Montréal
Principal supervisor :
Undergraduate - McGill University
Master's Research - McGill University
Collaborating Alumni
Independent visiting researcher
Postdoctorate - McGill University
Co-supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Principal supervisor :
Research Intern - University of Oslo
Master's Research - McGill University
Co-supervisor :
Master's Research - McGill University
PhD - McGill University
Master's Research - McGill University
Co-supervisor :
Independent visiting researcher - York University
PhD - McGill University

Publications

Multi-agent cooperation through learning-aware policy gradients
Alexander Meulemans
Seijin Kobayashi
Johannes von Oswald
Nino Scherrer
Eric Elmoznino
Blaise Agüera y Arcas
João Sacramento
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning. How can we achieve cooperation… (see more) among self-interested, independent learning agents? Promising recent work has shown that in certain tasks cooperation can be established between learning-aware agents who model the learning dynamics of each other. Here, we present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning, which takes into account that other agents are themselves learning through trial and error based on multiple noisy trials. We then leverage efficient sequence models to condition behavior on long observation histories that contain traces of the learning dynamics of other agents. Training long-context policies with our algorithm leads to cooperative behavior and high returns on standard social dilemmas, including a challenging environment where temporally-extended action coordination is required. Finally, we derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
Top-down feedback matters: Functional impact of brainlike connectivity motifs on audiovisual integration
Mashbayar Tugsbayar
Mingze Li
The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms
Colin Bredenberg
Fabrice Normandin
Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability
Divyansha Lachi
Ann Huang
Augustine N. Mavor-Parker
Arna Ghosh
Anthony Zador
The development of neural connectivity is a crucial biological process that gives rise to diverse brain circuits and behaviors. Neural devel… (see more)opment is a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. We show that this model can be viewed as a generalization of a powerful class of artificial neural networks—Bayesian neural networks—where each network parameter is a sample from a distribution. Our results reveal that stochasticity confers a greater benefit in large networks and variable environments, which may explain its role in organisms with larger brains. Surprisingly, we find that the average fitness over a population of agents is higher than a single agent defined by the average connection probability. Our model reveals how developmental stochasticity, by inducing a form of non-heritable phenotypic variability, can increase the probability that at least some individuals will survive in rapidly changing, unpredictable environments. Our results suggest how stochasticity may be an important feature rather than a bug in neural development.
Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
Karolis Jucys
George Adamopoulos
Mehrab Hamidi
Stephanie Milani
Mohammad Reza Samsami
Artem Zholus
Sonia Joseph
Özgür Şimşek
Understanding the mechanisms behind decisions taken by large foundation models in sequential tasks is critical to ensuring that such systems… (see more) operate transparently and safely. However, interpretability methods have not yet been applied extensively to large-scale agents based on reinforcement learning. In this work, we perform exploratory analysis on the Video PreTraining (VPT) Minecraft playing agent, one of the largest open-source vision-based agents. We try to illuminate its reasoning mechanisms by applying various interpretability techniques. First, we analyze the attention mechanism while the agent solves its training task --- crafting a diamond pickaxe. The agent seems to pay attention to the 4 last frames and several key-frames further back. This provides clues as to how it maintains coherence in the task that takes 3-10 minutes, despite the agent's short memory span of only six seconds. Second, we perform various interventions, which help us uncover a worrying case of goal misgeneralization: VPT mistakenly identifies a villager wearing brown clothes as a tree trunk and punches it to death, when positioned stationary under green tree leaves. We demonstrate similar misbehavior in a related agent (STEVE-1), which motivates the use of VPT as a model organism for large-scale vision-based agent interpretability.
Stimulus information guides the emergence of behavior-related signals in primary somatosensory cortex during learning.
Mariangela Panniello
Colleen J Gillon
Roberto Maffulli
Marco Celotto
Stefano Panzeri
Michael M Kohl
Sequential predictive learning is a unifying theory for hippocampal representation and replay
Daniel Levenstein
Aleksei Efremov
Roy Henha Eyono
Adrien Peyrache
The mammalian hippocampus contains a cognitive map that represents an animal’s position in the environment 1 and generates offline “repl… (see more)ay” 2,3 for the purposes of recall 4, planning 5,6, and forming long term memories 7. Recently, it’s been found that artificial neural networks trained to predict sensory inputs develop spatially tuned cells 8, aligning with predictive theories of hippocampal function 9–11. However, whether predictive learning can also account for the ability to produce offline replay is unknown. Here, we find that spatially tuned cells, which robustly emerge from all forms of predictive learning, do not guarantee the presence of a cognitive map with the ability to generate replay. Offline simulations only emerged in networks that used recurrent connections and head-direction information to predict multi-step observation sequences, which promoted the formation of a continuous attractor reflecting the geometry of the environment. These offline trajectories were able to show wake-like statistics, autonomously replay recently experienced locations, and could be directed by a virtual head direction signal. Further, we found that networks trained to make cyclical predictions of future observation sequences were able to rapidly learn a cognitive map and produced sweeping representations of future positions reminiscent of hippocampal theta sweeps 12. These results demonstrate how hippocampal-like representation and replay can emerge in neural networks engaged in predictive learning, and suggest that hippocampal theta sequences reflect a circuit that implements a data-efficient algorithm for sequential predictive learning. Together, this framework provides a unifying theory for hippocampal functions and hippocampal-inspired approaches to artificial intelligence.
Fast burst fraction transients convey information independent of the firing rate
Richard Naud
Xingyun Wang
Zachary Friedenberger
Alexandre Payeur
Jiyun N. Shin
Jean-Claude Béïque
Moritz Drüke
Matthew E. Larkum
Guy Doron
Theories of attention and learning have hypothesized a central role for high-frequency bursting in cognitive functions, but experimental rep… (see more)orts of burst-mediated representations in vivo have been limited. Here we used a novel demultiplexing approach by considering a conjunctive burst code. We studied this code in vivo while animals learned to report direct electrical stimulation of the somatosensory cortex and found two acquired yet independent representations. One code, the event rate, showed a sparse and succint stiumulus representation and a small modulation upon detection errors. The other code, the burst fraction, correlated more globally with stimulation and more promptly responded to detection errors. Bursting modulation was potent and its time course evolved, even in cells that were considered unresponsive based on the firing rate. During the later stages of training, this modulation in bursting happened earlier, gradually aligning temporally with the representation in event rate. The alignment of bursting and event rate modulation sharpened the firing rate response, and was strongly associated behavioral accuracy. Thus a fine-grained separation of spike timing patterns reveals two signals that accompany stimulus representations: an error signal that can be essential to guide learning and a sharpening signal that could implement attention mechanisms.
Sufficient conditions for offline reactivation in recurrent neural networks
Nanda H Krishna
Colin Bredenberg
Daniel Levenstein
During periods of quiescence, such as sleep, neural activity in many brain circuits resembles that observed during periods of task engagemen… (see more)t. However, the precise conditions under which task-optimized networks can autonomously reactivate the same network states responsible for online behavior is poorly understood. In this study, we develop a mathematical framework that outlines sufficient conditions for the emergence of neural reactivation in circuits that encode features of smoothly varying stimuli. We demonstrate mathematically that noisy recurrent networks optimized to track environmental state variables using change-based sensory information naturally develop denoising dynamics, which, in the absence of input, cause the network to revisit state configurations observed during periods of online activity. We validate our findings using numerical experiments on two canonical neuroscience tasks: spatial position estimation based on self-motion cues, and head direction estimation based on angular velocity cues. Overall, our work provides theoretical support for modeling offline reactivation as an emergent consequence of task optimization in noisy neural circuits.
Synaptic Weight Distributions Depend on the Geometry of Plasticity
Roman Pogodin
Jonathan Cornford
Arna Ghosh
A growing literature in computational neuroscience leverages gradient descent and learning algorithms that approximate it to study synaptic … (see more)plasticity in the brain. However, the vast majority of this work ignores a critical underlying assumption: the choice of distance for synaptic changes - i.e. the geometry of synaptic plasticity. Gradient descent assumes that the distance is Euclidean, but many other distances are possible, and there is no reason that biology necessarily uses Euclidean geometry. Here, using the theoretical tools provided by mirror descent, we show that the distribution of synaptic weights will depend on the geometry of synaptic plasticity. We use these results to show that experimentally-observed log-normal weight distributions found in several brain areas are not consistent with standard gradient descent (i.e. a Euclidean geometry), but rather with non-Euclidean distances. Finally, we show that it should be possible to experimentally test for different synaptic geometries by comparing synaptic weight distributions before and after learning. Overall, our work shows that the current paradigm in theoretical work on synaptic plasticity that assumes Euclidean synaptic geometry may be misguided and that it should be possible to experimentally determine the true geometry of synaptic plasticity in the brain.
Sufficient conditions for offline reactivation in recurrent neural networks
Nanda H Krishna
Colin Bredenberg
Daniel Levenstein
During periods of quiescence, such as sleep, neural activity in many brain circuits resembles that observed during periods of task engagemen… (see more)t. However, the precise conditions under which task-optimized networks can autonomously reactivate the same network states responsible for online behavior are poorly understood. In this study, we develop a mathematical framework that outlines sufficient conditions for the emergence of neural reactivation in circuits that encode features of smoothly varying stimuli. We demonstrate mathematically that noisy recurrent networks optimized to track environmental state variables using change-based sensory information naturally develop denoising dynamics, which, in the absence of input, cause the network to revisit state configurations observed during periods of online activity. We validate our findings using numerical experiments on two canonical neuroscience tasks: spatial position estimation based on self-motion cues, and head direction estimation based on angular velocity cues. Overall, our work provides theoretical support for modeling offline reactivation as an emergent consequence of task optimization in noisy neural circuits.
Addressing Sample Inefficiency in Multi-View Representation Learning
Kumar Krishna Agrawal
Arna Ghosh