Portrait of Blake Richards

Blake Richards

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science and Department of Neurology and Neurosurgery
Google
Research Topics
Computational Neuroscience
Generative Models
Reinforcement Learning
Representation Learning

Biography

Blake Richards is Research Scientist Manager with the Paradigms of Intelligence team at Google, and an Associate Professor in the School of Computer Science and Department of Neurology and Neurosurgery at McGill University. He is also a Core Faculty Member at Mila.

Richards’ research lies at the intersection of neuroscience and AI. His laboratory investigates universal principles of intelligence that apply to both natural and artificial agents.

He has received several awards for his work, including the NSERC Arthur B. McDonald Fellowship in 2022, the Canadian Association for Neuroscience Young Investigator Award in 2019, and a Canada CIFAR AI Chair in 2018. Richards was a Banting Postdoctoral Fellow at SickKids Hospital from 2011 to 2013.

He obtained his PhD in neuroscience from the University of Oxford in 2010, and his BSc in cognitive science and AI from the University of Toronto in 2004.

Current Students

Research Intern - Université de Montréal
Postdoctorate - McGill University
Postdoctorate - Université de Montréal
Principal supervisor :
PhD - McGill University
PhD - McGill University
Principal supervisor :
PhD - McGill University
Collaborating Alumni - McGill University
Undergraduate - McGill University
PhD - McGill University
Postdoctorate - McGill University
Co-supervisor :
Independent visiting researcher - Université de Montréal
Collaborating Alumni - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Collaborating Alumni - McGill University
PhD - McGill University
Co-supervisor :
PhD - McGill University
PhD - McGill University
PhD - Université de Montréal
Principal supervisor :
Master's Research - McGill University
Principal supervisor :
Independent visiting researcher - Université de Montréal
PhD - McGill University
Co-supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Principal supervisor :
Master's Research - McGill University
Independent visiting researcher - NA
Collaborating Alumni - McGill University
PhD - McGill University
PhD - McGill University
Co-supervisor :
Independent visiting researcher - York University
PhD - Concordia University
Principal supervisor :

Publications

Know Thyself by Knowing Others: Learning Neuron Identity from Population Context
Vinam Arora
Ian J. Knight
Mehdi Azabou
Cole Hurwitz
Joshua H. Siegle
Eva L. Dyer
Identifying the functional identity of individual neurons is essential for interpreting circuit dynamics, yet it remains a major challenge i… (see more)n large-scale _in vivo_ recordings where anatomical and molecular labels are often unavailable. Here we introduce NuCLR, a self-supervised framework that learns context-aware representations of neuron identity by modeling each neuron's role within the broader population. NuCLR employs a spatio-temporal transformer that captures both within-neuron dynamics and across-neuron interactions. It is trained with a sample-wise contrastive objective that encourages temporally-stable and discriminative embeddings. Across multiple open-access datasets, NuCLR outperforms prior methods in both cell type and brain region classification. Critically, it exhibits strong zero-shot generalization to entirely new populations, without any retraining or access to stimulus labels. Furthermore, we demonstrate that our framework scales effectively with data size. Overall, our results demonstrate that modeling population context is crucial for understanding neuron identity and that rich signal for cell-typing and neuron localization is present in neural activity alone.Code available at: https://github.com/nerdslab/nuclr.
RetINaBox: A hands-on learning tool for experimental neuroscience
Brune Bettler
Flavia Arias Armas
Vanessa Bordonaro
Megan Liu
Mingyu Wan
Aude Villemain
Stuart Trenholm
An exciting aspect of neuroscience is developing and testing hypotheses via experimentation. However, due to logistical and financial hurdle… (see more)s, the experiment and discovery component of neuroscience is generally lacking in classroom and outreach settings. To address this issue, here we introduce RetINaBox: a low-cost open-source electronic visual system simulator that provides users with a hands-on tool to discover how the visual system builds feature detectors. RetINaBox features an LED array for generating visual stimuli and a photodiode array that acts as a mosaic of model photoreceptors. Custom software on a Raspberry Pi computer reads out responses from model photoreceptors and allows users to control the polarity and delay of the signal transfer from model photoreceptors to model retinal ganglion cells. Interactive lesson plans are provided, guiding users to discover different types of visual feature detectors—including ON/OFF, center-surround, orientation selective, and direction selective receptive fields—as well as their underlying circuit computations.
Learning to combine top-down context and feed-forward representations under ambiguity with apical and basal dendrites
Guillaume Etter
Busra Tugce Gurbuz
One of the hallmark features of neocortical anatomy is the presence of extensive top-down projections into primary sensory areas, with many … (see more)impinging on the distal apical dendrites of pyramidal neurons. While it is known that they exert a modulatory effect, altering the gain of responses, their functional role remains an active area of research. It is hypothesized that these top-down projections carry contextual information that can help animals to resolve ambiguities in sensory data. One proposed mechanism of contextual integration is a non-linear integration of distinct input streams at apical and basal dendrites of pyramidal neurons. Computationally, however, it is yet to be demonstrated how such an architecture could leverage distinct compartments for flexible contextual integration and sensory processing when both sensory and context signals can be unreliable. Here, we implement an augmented deep neural network with distinct apical and basal compartments that integrates a) contextual information from top-down projections to apical compartments, and b) sensory representations driven by bottom-up projections to basal compartments, via a biophysically inspired rule. In addition, we develop a new multi-scenario contextual integration task using a generative image modeling approach. In addition to generalizing previous contextual integration tasks, it better captures the diversity of scenarios where neither contextual nor sensory information are fully reliable. To solve this task, this model successfully learns to select among integration strategies. We find that our model outperforms those without the "apical prior" when contextual information contradicts sensory input. Altogether, this suggests that the apical prior and biophysically inspired integration rule could be key components necessary for handling the ambiguities that animals encounter in the diverse contexts of the real world.
Multi-Session, Multi-Task Neural Decoding from Distinct Cell-Types and Brain Regions
Mehdi Azabou
Krystal Xuejing Pan
Vinam Arora
Ian Knight
Eva L. Dyer
Recent work has shown that scale is important for improved brain decoding, with more data leading to greater decoding accuracy. However, lar… (see more)ge-scale decoding across many different datasets is challenging because neural circuits are heterogeneous---each brain region contains a unique mix of cellular sub-types, and the responses to different stimuli are diverse across regions and sub-types. It is unknown whether it is possible to pre-train and transfer brain decoding models between distinct tasks, cellular sub-types, and brain regions. To address these questions, we developed a multi-task transformer architecture and trained it on the entirety of the Allen Institute's Brain Observatory dataset. This dataset contains responses from over 100,000 neurons in 6 areas of the brains of mice, observed with two-photon calcium imaging, recorded while the mice observed different types of visual stimuli. Our results demonstrate that transfer is indeed possible -combining data from different sources is beneficial for a number of downstream decoding tasks. As well, we can transfer the model between regions and sub-types, demonstrating that there is in fact common information in diverse circuits that can be extracted by an appropriately designed model. Interestingly, we found that the model's latent representations showed clear distinctions between different brain regions and cellular sub-types, even though it was never given any information about these distinctions. Altogether, our work demonstrates that training a large-scale neural decoding model on diverse data is possible, and this provides a means of studying the differences and similarities between heterogeneous neural circuits.
Top-down feedback matters: Functional impact of brainlike connectivity motifs on audiovisual integration
Artificial neural networks (ANNs) are an important tool for studying neural computation, but many features of the brain are not captured by … (see more)standard ANN architectures. One notable missing feature in most ANN models is top-down feedback, i.e. projections from higher-order layers to lower-order layers in the network. Top-down feedback is ubiquitous in the brain, and it has a unique modulatory impact on activity in neocortical pyramidal neurons. However, we still do not understand its computational role. Here we develop a deep neural network model that captures the core functional properties of top-down feedback in the neocortex, allowing us to construct hierarchical recurrent ANN models that more closely reflect the architecture of the brain. We use this to explore the impact of different hierarchical recurrent architectures on an audiovisual integration task. We find that certain hierarchies, namely those that mimic the architecture of the human brain, impart ANN models with a light visual bias similar to that seen in humans. This bias does not impair performance on the audiovisual tasks. The results further suggest that different configurations of top-down feedback make otherwise identically connected models functionally distinct from each other, and from traditional feedforward and laterally recurrent models. Altogether our findings demonstrate that modulatory top-down feedback is a computationally relevant feature of biological brains, and that incorporating it into ANNs affects their behavior and constrains the solutions it’s likely to discover.
The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms
Abstract Classical psychedelics induce complex visual hallucinations in humans, generating percepts that are co-herent at a … (see more)low level, but which have surreal, dream-like qualities at a high level. While there are many hypotheses as to how classical psychedelics could induce these effects, there are no concrete mechanistic models that capture the variety of observed effects in humans, while remaining consistent with the known pharmacological effects of classical psychedelics on neural circuits. In this work, we propose the “oneirogen hypothesis”, which posits that the perceptual effects of classical psychedelics are a result of their pharmacological actions inducing neural activity states that truly are more similar to dream-like states. We simulate classical psychedelics’ effects via manipulating neural network models trained on perceptual tasks with the Wake-Sleep algorithm. This established machine learning algorithm leverages two activity phases, a perceptual phase (wake) where sensory inputs are encoded, and a generative phase (dream) where the network internally generates activity consistent with stimulus-evoked responses. We simulate the action of psychedelics by partially shifting the model to the ‘Sleep’ state, which entails a greater influence of top-down connections, in line with the impact of psychedelics on apical dendrites. The effects resulting from this manipulation capture a number of experimentally observed phenomena including the emergence of hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. We further provide a number of testable predictions which could be used to validate or invalidate our oneirogen hypothesis.
Towards a “universal translator” for neural dynamics at single-cell, single-spike resolution
Yizi Zhang
Yanchen Wang
Donato M. Jiménez-Benetó
Zixuan Wang
Mehdi Azabou
Renee Tung
Olivier Winter
The International Brain Laboratory
Eva Dyer
Liam Paninski
Cole Hurwitz
Neuroscience research has made immense progress over the last decade, but our understanding of the brain remains fragmented and piecemeal: t… (see more)he dream of probing an arbitrary brain region and automatically reading out the information encoded in its neural activity remains out of reach. In this work, we build towards a first foundation model for neural spiking data that can solve a diverse set of tasks across multiple brain areas. We introduce a novel self-supervised modeling approach for population activity in which the model alternates between masking out and reconstructing neural activity across different time steps, neurons, and brain regions. To evaluate our approach, we design unsupervised and supervised prediction tasks using the International Brain Laboratory repeated site dataset, which is comprised of Neuropixels recordings targeting the same brain locations across 48 animals and experimental sessions. The prediction tasks include single-neuron and region-level activity prediction, forward prediction, and behavior decoding. We demonstrate that our multi-task-masking (MtM) approach significantly improves the performance of current state-of-the-art population models and enables multi-task learning. We also show that by training on multiple animals, we can improve the generalization ability of the model to unseen animals, paving the way for a foundation model of the brain at single-cell, single-spike resolution.
Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability
Augustine N. Mavor-Parker
Anthony Zador
The development of neural connectivity is a crucial biological process that gives rise to diverse brain circuits and behaviors. Neural devel… (see more)opment is a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. We show that this model can be viewed as a generalization of a powerful class of artificial neural networks—Bayesian neural networks—where each network parameter is a sample from a distribution. Our results reveal that stochasticity confers a greater benefit in large networks and variable environments, which may explain its role in organisms with larger brains. Surprisingly, we find that the average fitness over a population of agents is higher than a single agent defined by the average connection probability. Our model reveals how developmental stochasticity, by inducing a form of non-heritable phenotypic variability, can increase the probability that at least some individuals will survive in rapidly changing, unpredictable environments. Our results suggest how stochasticity may be an important feature rather than a bug in neural development.
Sequential predictive learning is a unifying theory for hippocampal representation and replay
The mammalian hippocampus contains a cognitive map that represents an animal’s position in the environment 1 … (see more) and generates offline “replay” 2,3 for the purposes of recall 4 , planning 5,6 , and forming long term memories 7 . Recently, it’s been found that artificial neural networks trained to predict sensory inputs develop spatially tuned cells 8 , aligning with predictive theories of hippocampal function 9–11 . However, whether predictive learning can also account for the ability to produce offline replay is unknown. Here, we find that spatially-tuned cells, which robustly emerge from all forms of predictive learning, do not guarantee the presence of a cognitive map with the ability to generate replay. Offline simulations only emerged in networks that used recurrent connections and head-direction information to predict multi-step observation sequences, which promoted the formation of a continuous attractor reflecting the geometry of the environment. These offline trajectories were able to show wake-like statistics, autonomously replay recently experienced locations, and could be directed by a virtual head direction signal. Further, we found that networks trained to make cyclical predictions of future observation sequences were able to rapidly learn a cognitive map and produced sweeping representations of future positions reminiscent of hippocampal theta sweeps 12 . These results demonstrate how hippocampal-like representation and replay can emerge in neural networks engaged in predictive learning, and suggest that hippocampal theta sequences reflect a circuit that implements a data-efficient algorithm for sequential predictive learning. Together, this framework provides a unifying theory for hippocampal functions and hippocampal-inspired approaches to artificial intelligence.
Fast burst fraction transients convey information independent of the firing rate
Richard Naud
Xingyun Wang
Zachary Friedenberger
Jiyun N Shin
Jean-Claude Beique
Moritz Drüke
Matthew Larkum
Guy Doron
Theories of attention and learning have hypothesized a central role for high-frequency bursting in cognitive functions, but experimental rep… (see more)orts of burst-mediated representations \emph{in vivo} have been limited. Here we used a novel demultiplexing approach by considering a conjunctive burst code. We studied this code \emph{in vivo} while animals learned to report direct electrical stimulation of the somatosensory cortex and found two acquired yet independent representations. One code, the event rate, showed a sparse and succint stiumulus representation and a small modulation upon detection errors. The other code, the burst fraction, correlated more globally with stimulation and more promptly responded to detection errors. Potent and fast modulations of the burst fraction were seen even in cells that were considered unresponsive based on the firing rate. During the later stages of training, this modulation in bursting happened earlier, gradually aligning temporally with the representation in event rate. The alignment of bursting and event rate modulation sharpened the firing rate response, and was strongly associated with behavioral accuracy. Thus a fine-grained separation of spike timing patterns reveals two signals that accompany stimulus representations: an error signal that can be essential to guide learning and a sharpening signal that could implement attention mechanisms.
The feature landscape of visual cortex
Rudi Tong
Ronan da Silva
James Wilsenach
Stuart Trenholm
Understanding computations in the visual system requires a characterization of the distinct feature preferences of neurons in different visu… (see more)al cortical areas. However, we know little about how feature preferences of neurons within a given area relate to that area’s role within the global organization of visual cortex. To address this, we recorded from thousands of neurons across six visual cortical areas in mouse and leveraged generative AI methods combined with closed-loop neuronal recordings to identify each neuron’s visual feature preference. First, we discovered that the mouse’s visual system is globally organized to encode features in a manner invariant to the types of image transformations induced by self-motion. Second, we found differences in the visual feature preferences of each area and that these differences generalized across animals. Finally, we observed that a given area’s collection of preferred stimuli (‘own-stimuli’) drive neurons from the same area more effectively through their dynamic range compared to preferred stimuli from other areas (‘other-stimuli’). As a result, feature preferences of neurons within an area are organized to maximally encode differences among own-stimuli while remaining insensitive to differences among other-stimuli. These results reveal how visual areas work together to efficiently encode information about the external world.
Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL
In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward. T… (see more)hese critical steps are challenging to identify with traditional reinforcement learning (RL) methods that rely on the Bellman equation for credit assignment. Here, we present a new RL algorithm that uses offline contrastive learning to hone in on these critical steps. This algorithm, which we call Contrastive Retrospection (ConSpec), can be added to any existing RL algorithm. ConSpec learns a set of prototypes for the critical steps in a task by a novel contrastive loss and delivers an intrinsic reward when the current state matches one of the prototypes. The prototypes in ConSpec provide two key benefits for credit assignment: (i) They enable rapid identification of all the critical steps. (ii) They do so in a readily interpretable manner, enabling out-of-distribution generalization when sensory features are altered. Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon (and ignoring other states) than it is to prospectively predict reward at every taken step. ConSpec greatly improves learning in a diverse set of RL tasks. The code is available at the link: https://github.com/sunchipsster1/ConSpec