Portrait of Blake Richards

Blake Richards

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science and Department of Neurology and Neurosurgery
Google
Research Topics
Computational Neuroscience
Generative Models
Reinforcement Learning
Representation Learning

Biography

Blake Richards is an associate professor at the School of Computer Science and in the Department of Neurology and Neurosurgery at McGill University, and a core academic member of Mila – Quebec Artificial Intelligence Institute.

Richards’ research lies at the intersection of neuroscience and AI. His laboratory investigates universal principles of intelligence that apply to both natural and artificial agents.

He has received several awards for his work, including the NSERC Arthur B. McDonald Fellowship in 2022, the Canadian Association for Neuroscience Young Investigator Award in 2019, and a Canada CIFAR AI Chair in 2018. Richards was a Banting Postdoctoral Fellow at SickKids Hospital from 2011 to 2013.

He obtained his PhD in neuroscience from the University of Oxford in 2010, and his BSc in cognitive science and AI from the University of Toronto in 2004.

Current Students

Research Intern - Université de Montréal
Independent visiting researcher - Seoul National University
Postdoctorate - McGill University
Postdoctorate - Université de Montréal
Principal supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Principal supervisor :
PhD - McGill University
Postdoctorate - McGill University
PhD - McGill University
PhD - McGill University
Independent visiting researcher - Seoul National University
PhD - McGill University
Research Intern - McGill University
Collaborating Alumni
PhD - McGill University
Independent visiting researcher - ETH Zurich
Collaborating researcher - Georgia Tech
Postdoctorate - McGill University
Undergraduate - McGill University
PhD - McGill University
Master's Research - McGill University
PhD - Université de Montréal
Principal supervisor :
Undergraduate - McGill University
Master's Research - McGill University
Independent visiting researcher
Postdoctorate - McGill University
Co-supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Principal supervisor :
Master's Research - McGill University
Co-supervisor :
Master's Research - McGill University
PhD - McGill University
Master's Research - McGill University
Co-supervisor :
Independent visiting researcher - Seoul National University
Independent visiting researcher - York University
PhD - McGill University
PhD - Concordia University
Principal supervisor :

Publications

Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits
Alexandre Payeur
Jordan Guerguiev
Friedemann Zenke
Richard Naud
Optogenetic activation of parvalbumin and somatostatin interneurons selectively restores theta-nested gamma oscillations and oscillation-induced spike timing-dependent long-term potentiation impaired by amyloid β oligomers
Kyerl Park
Jaedong Lee
Hyun Jae Jang
Michael M Kohl
Jeehyun Kwag
Spike-based causal inference for weight alignment
Jordan Guerguiev
Konrad Paul Kording
In artificial neural networks trained with gradient descent, the weights used for processing stimuli are also used during backward passes to… (see more) calculate gradients. For the real brain to approximate gradients, gradient information would have to be propagated separately, such that one set of synaptic weights is used for processing and another set is used for backward passes. This produces the so-called "weight transport problem" for biological models of learning, where the backward weights used to calculate gradients need to mirror the forward weights used to process stimuli. This weight transport problem has been considered so hard that popular proposals for biological learning assume that the backward weights are simply random, as in the feedback alignment algorithm. However, such random weights do not appear to work well for large networks. Here we show how the discontinuity introduced in a spiking system can lead to a solution to this problem. The resulting algorithm is a special case of an estimator used for causal inference in econometrics, regression discontinuity design. We show empirically that this algorithm rapidly makes the backward weights approximate the forward weights. As the backward weights become correct, this improves learning performance over feedback alignment on tasks such as Fashion-MNIST, SVHN, CIFAR-10 and VOC. Our results demonstrate that a simple learning rule in a spiking network can allow neurons to produce the right backward connections and thus solve the weight transport problem.
Forgetting at biologically realistic levels of neurogenesis in a large-scale hippocampal model
Lina M. Tran
Sheena A. Josselyn
Paul W. Frankland
A deep learning framework for neuroscience
Timothy P. Lillicrap
Philippe Beaudoin
Rafal Bogacz
Amelia Christensen
Claudia Clopath
Rui Ponte Costa
Archy de Berker
Surya Ganguli
Colleen J Gillon
Danijar Hafner
Adam Kepecs
Nikolaus Kriegeskorte
Peter Latham
Grace W. Lindsay
Kenneth D. Miller
Richard Naud
Christopher C. Pack
Panayiota Poirazi … (see 12 more)
Pieter Roelfsema
João Sacramento
Andrew Saxe
Benjamin Scellier
Anna C. Schapiro
Walter Senn
Greg Wayne
Daniel Yamins
Friedemann Zenke
Joel Zylberberg
Denis Therien
Konrad Paul Kording
Dissociating memory accessibility and precision in forgetting
S. Berens
A. Horner
Dendritic solutions to the credit assignment problem
Timothy P. Lillicrap
Irrelevance by inhibition: Learning, computation, and implications for schizophrenia
Nathan Insel
Jordan Guerguiev
Symptoms of schizophrenia may arise from a failure of cortical circuits to filter-out irrelevant inputs. Schizophrenia has also been linked … (see more)to disruptions in cortical inhibitory interneurons, consistent with the possibility that in the normally functioning brain, these cells are in some part responsible for determining which sensory inputs are relevant versus irrelevant. Here, we develop a neural network model that demonstrates how the cortex may learn to ignore irrelevant inputs through plasticity processes affecting inhibition. The model is based on the proposal that the amount of excitatory output from a cortical circuit encodes the expected magnitude of reward or punishment (“relevance”), which can be trained using a temporal difference learning mechanism acting on feedforward inputs to inhibitory interneurons. In the model, irrelevant and blocked stimuli drive lower levels of excitatory activity compared with novel and relevant stimuli, and this difference in activity levels is lost following disruptions to inhibitory units. When excitatory units are connected to a competitive-learning output layer with a threshold, the relevance code can be shown to “gate” both learning and behavioral responses to irrelevant stimuli. Accordingly, the combined network is capable of recapitulating published experimental data linking inhibition in frontal cortex with fear learning and expression. Finally, the model demonstrates how relevance learning can take place in parallel with other types of learning, through plasticity rules involving inhibitory and excitatory components, respectively. Altogether, this work offers a theory of how the cortex learns to selectively inhibit inputs, providing insight into how relevance-assignment problems may emerge in schizophrenia.
Relevance learning via inhibitory plasticity and its implications for schizophrenia
Nathan Insel
Jordan Guerguiev
Symptoms of schizophrenia may arise from a failure of cortical circuits to filter-out irrelevant inputs. Schizophrenia has also been linked … (see more)to disruptions to cortical inhibitory interneurons, consistent with the possibility that in the normally functioning brain, these cells are in some part responsible for determining which inputs are relevant and which irrelevant. Here, we develop an abstract but biologically plausible neural network model that demonstrates how the cortex may learn to ignore irrelevant inputs through plasticity processes affecting inhibition. The model is based on the proposal that the amount of excitatory output from a cortical circuit encodes expected magnitude of reward or punishment (”relevance”), which can be trained using a temporal difference learning mechanism acting on feed-forward inputs to inhibitory interneurons. The model exhibits learned irrelevance and blocking, which become impaired following disruptions to inhibitory units. When excitatory units are connected to a competitive-learning output layer, the relevance code is capable of modulating learning and activity. Accordingly, the combined network is capable of recapitulating published experimental data linking inhibition in frontal cortex with fear learning and expression. Finally, the model demonstrates how relevance learning can take place in parallel with other types of learning, through plasticity rules involving inhibitory and excitatory components respectively. Altogether, this work offers a theory of how the cortex learns to selectively inhibit inputs, providing insight into how relevance-assignment problems may emerge in schizophrenia.
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
Sergey Bartunov
Adam Santoro
Luke Marris
Geoffrey Hinton
Timothy P. Lillicrap
The backpropagation of error algorithm (BP) is impossible to implement in a real brain. The recent success of deep networks in machine learn… (see more)ing and AI, however, has inspired proposals for understanding how the brain might learn across multiple layers, and hence how it might approximate BP. As of yet, none of these proposals have been rigorously evaluated on tasks where BP-guided deep learning has proved critical, or in architectures more structured than simple fully-connected networks. Here we present results on scaling up biologically motivated models of deep learning on datasets which need deep networks with appropriate architectures to achieve good performance. We present results on the MNIST, CIFAR-10, and ImageNet datasets and explore variants of target-propagation (TP) and feedback alignment (FA) algorithms, and explore performance in both fully- and locally-connected architectures. We also introduce weight-transport-free variants of difference target propagation (DTP) modified to remove backpropagation from the penultimate layer. Many of these algorithms perform well for MNIST, but for CIFAR and ImageNet we find that TP and FA variants perform significantly worse than BP, especially for networks composed of locally connected units, opening questions about whether new architectures and algorithms are required to scale these approaches. Our results and implementation details help establish baselines for biologically motivated deep learning schemes going forward.