
Blake Richards

Core Academic Member
Canada CIFAR AI Chair
McGill University, School of Computer Science and Department of Neurology and Neurosurgery
Google
Research Topics
Representation Learning
Reinforcement Learning
Generative Models
Computational Neuroscience

Biography

Blake Richards is a research director on the Paradigms of Intelligence team at Google and an associate professor in the School of Computer Science and the Department of Neurology and Neurosurgery at McGill University. He is also a core academic member of Mila - Quebec Artificial Intelligence Institute.

His research sits at the intersection of neuroscience and artificial intelligence. His laboratory investigates universal principles of intelligence that apply to both natural and artificial agents.

He has received several distinctions for his work, including an Arthur B. McDonald Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) in 2022, the Young Investigator Award from the Canadian Association for Neuroscience in 2019, and a Canada CIFAR AI Chair in 2018. Richards also held a Banting Postdoctoral Fellowship at SickKids Hospital from 2011 to 2013. He obtained a PhD in neuroscience from the University of Oxford in 2010 and a bachelor's degree in cognitive science and AI from the University of Toronto in 2004.

Current Students

Trainees at the postdoctoral, PhD, research master's, and undergraduate levels, along with independent research visitors and alumni collaborators, at McGill, Université de Montréal, Concordia, and York University (both principal supervision and co-supervision).

Publications

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
Seijin Kobayashi
Yanick Schimpf
Maximilian Schlegel
Angelika Steger
Maciej Wolczyk
Johannes Von Oswald
Kaitlin Maile
Rif A. Saurous
James Manyika
Blaise Aguera y Arcas
Alexander Meulemans
João Sacramento
Large-scale autoregressive models pretrained on next-token prediction and finetuned with reinforcement learning (RL) have achieved unprecedented success on many problem domains. During RL, these models explore by generating new outputs, one token at a time. However, sampling actions token-by-token can result in highly inefficient learning, particularly when rewards are sparse. Here, we show that it is possible to overcome this problem by acting and exploring within the internal representations of an autoregressive model. Specifically, to discover temporally-abstract actions, we introduce a higher-order, non-causal sequence model whose outputs control the residual stream activations of a base autoregressive model. On grid world and MuJoCo-based tasks with hierarchical structure, we find that the higher-order model learns to compress long activation sequence chunks onto internal controllers. Critically, each controller executes a sequence of behaviorally meaningful actions that unfold over long timescales and are accompanied with a learned termination condition, such that composing multiple controllers over time leads to efficient exploration on novel tasks. We show that direct internal controller reinforcement, a process we term "internal RL", enables learning from sparse rewards in cases where standard RL finetuning fails. Our results demonstrate the benefits of latent action generation and reinforcement in autoregressive models, suggesting internal RL as a promising avenue for realizing hierarchical RL within foundation models.
Embedded Universal Predictive Intelligence: a coherent framework for multi-agent learning
Alexander Meulemans
Rajai Nasser
Maciej Wolczyk
Marissa A. Weis
Seijin Kobayashi
Angelika Steger
Marcus Hutter
James Manyika
Rif A. Saurous
João Sacramento
Blaise Aguera y Arcas
Know Thyself by Knowing Others: Learning Neuron Identity from Population Context
Vinam Arora
Ian J. Knight
Mehdi Azabou
Cole Hurwitz
Joshua H. Siegle
Eva L. Dyer
Identifying the functional identity of individual neurons is essential for interpreting circuit dynamics, yet it remains a major challenge in large-scale _in vivo_ recordings where anatomical and molecular labels are often unavailable. Here we introduce NuCLR, a self-supervised framework that learns context-aware representations of neuron identity by modeling each neuron's role within the broader population. NuCLR employs a spatio-temporal transformer that captures both within-neuron dynamics and across-neuron interactions. It is trained with a sample-wise contrastive objective that encourages temporally-stable and discriminative embeddings. Across multiple open-access datasets, NuCLR outperforms prior methods in both cell type and brain region classification. Critically, it exhibits strong zero-shot generalization to entirely new populations, without any retraining or access to stimulus labels. Furthermore, we demonstrate that our framework scales effectively with data size. Overall, our results demonstrate that modeling population context is crucial for understanding neuron identity and that rich signal for cell-typing and neuron localization is present in neural activity alone. Code available at: https://github.com/nerdslab/nuclr.
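The sample-wise contrastive objective described above can be illustrated with a generic InfoNCE-style loss, where two embeddings of the same neuron (e.g., from different time windows) form a positive pair and other neurons in the batch serve as negatives. This is a minimal sketch under that assumption, not NuCLR's exact objective; all variable names are illustrative.

```python
import numpy as np

def contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style sample-wise contrastive loss: (z_a[i], z_b[i]) are
    positive pairs, all other rows of z_b are negatives for z_a[i].
    A generic sketch, not the paper's exact training objective."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)   # unit-normalize
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature            # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # positives on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(16, 8))
# Loss is low when the two views of each neuron agree...
aligned = contrastive_loss(z, z + 0.01 * rng.normal(size=(16, 8)))
# ...and high when positives are mismatched.
shuffled = contrastive_loss(z, rng.permutation(z, axis=0))
print(aligned < shuffled)  # True
```

Minimizing such a loss pushes embeddings of the same neuron together across time while keeping different neurons separable, which is what yields temporally-stable, discriminative identity representations.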
Why all roads don't lead to Rome: Representation geometry varies across the human visual cortical hierarchy
Biological and artificial intelligence systems navigate the fundamental efficiency-robustness tradeoff for optimal encoding, i.e., they must efficiently encode numerous attributes of the input space while also being robust to noise. This challenge is particularly evident in hierarchical processing systems like the human brain. With a view towards understanding how systems navigate the efficiency-robustness tradeoff, we turned to a population geometry framework for analyzing representations in the human visual cortex alongside artificial neural networks (ANNs). In the ventral visual stream, we found general-purpose, scale-free representations characterized by a power law-decaying eigenspectrum in most areas. However, certain higher-order visual areas did not have scale-free representations, indicating that scale-free geometry is not a universal property of the brain. In parallel, ANNs trained with a self-supervised learning objective also exhibited scale-free geometry, but not after fine-tuning on a specific task. Based on these empirical results and our analytical insights, we posit that a system's representation geometry is not a universal property and instead depends upon the computational objective.
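The power law-decaying eigenspectrum referenced above can be estimated from a matrix of population responses by fitting a line to the covariance eigenvalues on log-log axes. This is a minimal sketch of that style of analysis (not the paper's pipeline); the synthetic data and function name are illustrative.

```python
import numpy as np

def eigenspectrum_decay(X):
    """Estimate the power-law decay exponent alpha of the covariance
    eigenspectrum of a representation matrix X (n_samples x n_features).
    Illustrative sketch of a population-geometry measurement."""
    X = X - X.mean(axis=0)                    # center each feature
    cov = X.T @ X / (X.shape[0] - 1)          # feature covariance
    eig = np.linalg.eigvalsh(cov)[::-1]       # eigenvalues, descending
    eig = eig[eig > 1e-12]                    # drop numerically-zero modes
    ranks = np.arange(1, len(eig) + 1)
    # Fit log(lambda_i) ~ -alpha * log(i) + c by least squares.
    slope, _ = np.polyfit(np.log(ranks), np.log(eig), 1)
    return -slope

# Synthetic representations whose spectrum decays roughly as 1/i (alpha ~= 1).
rng = np.random.default_rng(0)
n, d = 2000, 100
scales = 1.0 / np.sqrt(np.arange(1, d + 1))   # variance of feature i is 1/i
X = rng.normal(size=(n, d)) * scales
alpha = eigenspectrum_decay(X)
```

An exponent near 1 is the "scale-free" regime the abstract describes; a much steeper decay indicates compressed, low-dimensional geometry.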
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
Anthony GX-Chen
Rob Fergus
Kenneth Marino
Language model (LM) agents are increasingly used as autonomous decision-makers who need to actively gather information to guide their decisions. A crucial cognitive skill for such agents is the efficient exploration and understanding of the causal structure of the world -- key to robust, scientifically grounded reasoning. Yet, it remains unclear whether LMs possess this capability or exhibit systematic biases leading to erroneous conclusions. In this work, we examine LMs' ability to explore and infer causal relationships, using the well-established "Blicket Test" paradigm from developmental psychology. We find that LMs reliably infer the common, intuitive disjunctive causal relationships but systematically struggle with the unusual, yet equally (or sometimes even more) evidenced conjunctive ones. This "disjunctive bias" persists across model families, sizes, and prompting strategies, and performance further declines as task complexity increases. Interestingly, an analogous bias appears in human adults, suggesting that LMs may have inherited deep-seated reasoning heuristics from their training data. To this end, we quantify similarities between LMs and humans, finding that LMs exhibit adult-like inference profiles (but not child-like). Finally, we propose a test-time sampling method which explicitly samples and eliminates hypotheses about causal relationships from the LM. This scalable approach significantly reduces the disjunctive bias and moves LMs closer to the goal of scientific, causally rigorous reasoning.
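The hypothesis-elimination idea behind the test-time method can be sketched for the Blicket setting itself: enumerate candidate "blicket" sets and keep only those consistent with the observed machine activations under a disjunctive or conjunctive rule. This toy enumeration is illustrative only and is not the paper's LM-based sampling procedure.

```python
from itertools import combinations

def consistent_hypotheses(objects, trials, rule):
    """Keep candidate blicket sets consistent with observed trials.
    rule='disjunctive': the machine lights up if ANY blicket is placed;
    rule='conjunctive': only if ALL blickets are placed.
    A toy sketch of hypothesis elimination."""
    hyps = [set(c) for r in range(1, len(objects) + 1)
            for c in combinations(objects, r)]
    def predict(h, placed):
        placed = set(placed)
        return bool(h & placed) if rule == "disjunctive" else h <= placed
    return [h for h in hyps
            if all(predict(h, placed) == lit for placed, lit in trials)]

# Trials: (objects placed on the machine, did it light up?)
trials = [({"A"}, False), ({"B"}, False), ({"A", "B"}, True)]
conj = consistent_hypotheses("ABC", trials, rule="conjunctive")
disj = consistent_hypotheses("ABC", trials, rule="disjunctive")
# Only the conjunctive hypothesis {A, B} survives; no disjunctive one does.
```

Evidence like this, where only the conjunctive reading fits, is exactly what the abstract says models with a disjunctive bias handle poorly.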
Learning to combine top-down context and feed-forward representations under ambiguity with apical and basal dendrites
Guillaume Etter
Busra Tugce Gurbuz
The challenge of hidden gifts in multi-agent reinforcement learning
Cooperation between people is not always obvious. Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These “hidden gifts” represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit to your own actions correctly when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. As well, if all the agents unlock their door the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, thus the act of dropping the key for others is a “hidden gift”. We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show how credit assignment in multi-agent settings can be particularly challenging in the presence of “hidden gifts”, and demonstrate that learning awareness can benefit these settings.
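The key-and-doors structure of the task can be distilled into a tiny two-agent environment: one shared key, one door per agent, and a key drop that is invisible to the receiver. This is a stripped-down sketch of the setup (no grid world, no learning loop); all class and action names are illustrative.

```python
class HiddenGiftEnv:
    """Minimal two-agent 'hidden gift' task: one shared key, one door per
    agent. Dropping the key is invisible to the other agent -- the hidden
    gift. A toy sketch of the paper's task, not its grid-world version."""
    def __init__(self):
        self.key_holder = 0                  # agent 0 starts with the key
        self.doors_open = [False, False]

    def step(self, actions):
        """actions[i] is one of 'unlock', 'drop', 'noop'."""
        rewards = [0.0, 0.0]
        for i, a in enumerate(actions):
            if a == "unlock" and self.key_holder == i and not self.doors_open[i]:
                self.doors_open[i] = True
                rewards[i] += 1.0            # individual reward
            elif a == "drop" and self.key_holder == i:
                self.key_holder = 1 - i      # invisible to the receiver
        if all(self.doors_open):
            rewards = [r + 10.0 for r in rewards]   # collective reward
        return rewards

env = HiddenGiftEnv()
r1 = env.step(["unlock", "noop"])   # agent 0 opens its door: [1.0, 0.0]
r2 = env.step(["drop", "noop"])     # the hidden gift: no reward signal at all
r3 = env.step(["noop", "unlock"])   # both doors open: collective +10 each
```

The credit-assignment problem is visible in `r2`: the drop that enables the collective reward yields zero immediate reward and is unobservable to the beneficiary.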
Tracing the representation geometry of language models from pretraining to post-training
The geometry of representations in a neural network can significantly impact downstream generalization. It is unknown how representation geometry changes in large language models (LLMs) over pretraining and post-training. Here, we characterize the evolving geometry of LLM representations using spectral methods (effective rank and eigenspectrum decay). With the OLMo and Pythia model families we uncover a consistent non-monotonic sequence of three distinct geometric phases in pretraining. An initial "warmup" phase sees rapid representational compression. This is followed by an "entropy-seeking" phase, characterized by expansion of the representation manifold's effective dimensionality, which correlates with an increase in memorization. Subsequently, a "compression seeking" phase imposes anisotropic consolidation, selectively preserving variance along dominant eigendirections while contracting others, correlating with improved downstream task performance. We link the emergence of these phases to the fundamental interplay of cross-entropy optimization, information bottleneck, and skewed data distribution. Additionally, we find that in post-training the representation geometry is further transformed: Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) correlate with another "entropy-seeking" dynamic to integrate specific instructional or preferential data, reducing out-of-distribution robustness. Conversely, Reinforcement Learning with Verifiable Rewards (RLVR) often exhibits a "compression seeking" dynamic, consolidating reward-aligned behaviors and reducing the entropy in its output distribution. This work establishes the utility of spectral measures of representation geometry for understanding the multiphase learning dynamics within LLMs.
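One of the spectral measures named above, effective rank, has a standard entropy-based definition over the singular-value spectrum. The sketch below shows that definition on synthetic activation matrices; it is an illustration of the measure, not the paper's analysis code.

```python
import numpy as np

def effective_rank(X):
    """Effective rank of a representation matrix X (samples x hidden_dim):
    the exponential of the entropy of the normalized singular-value
    spectrum. A sketch of the spectral measure, not the paper's pipeline."""
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    p = s / s.sum()                 # normalize singular values to a distribution
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(1)
iso = rng.normal(size=(512, 64))                              # near-isotropic
low = rng.normal(size=(512, 4)) @ rng.normal(size=(4, 64))    # rank-4, compressed
er_iso, er_low = effective_rank(iso), effective_rank(low)
```

The "entropy-seeking" and "compression-seeking" phases in the abstract correspond to this quantity expanding (toward the isotropic case) and contracting (toward the low-rank case), respectively.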
Steering CLIP's vision transformer with sparse autoencoders
Ethan Goldfarb
Lorenz Hufe
Yossi Gandelsman
Robert Graham
Wojciech Samek
While vision models are highly capable, their internal mechanisms remain poorly understood -- a challenge which sparse autoencoders (SAEs) have helped address in language, but which remains underexplored in vision. We address this gap by training SAEs on CLIP's vision transformer and uncover key differences between vision and language processing, including distinct sparsity patterns for SAEs trained across layers and token types. We then provide the first systematic analysis of the steerability of CLIP's vision transformer by introducing metrics to quantify how precisely SAE features can be steered to affect the model's output. We find that 10-15% of neurons and features are steerable, with SAEs providing thousands more steerable features than the base model. Through targeted suppression of SAE features, we then demonstrate improved performance on three vision disentanglement tasks (CelebA, Waterbirds, and typographic attacks), finding optimal disentanglement in middle model layers, and achieving state-of-the-art performance on defense against typographic attacks. We release our CLIP SAE models and code to support future research in vision transformer interpretability.
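The "targeted suppression of SAE features" described above amounts to encoding an activation, zeroing selected feature channels, and decoding back into the model's activation space. The sketch below shows that mechanic on a toy ReLU autoencoder with hypothetical dimensions; it is not the released CLIP SAE.

```python
import numpy as np

class SparseAutoencoder:
    """Tiny ReLU sparse autoencoder over model activations.
    Hypothetical sizes; illustrates feature suppression, not CLIP itself."""
    def __init__(self, d_model=8, d_features=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(scale=0.1, size=(d_model, d_features))
        self.b_enc = np.zeros(d_features)
        self.W_dec = rng.normal(scale=0.1, size=(d_features, d_model))
        self.b_dec = np.zeros(d_model)

    def encode(self, x):
        return np.maximum(x @ self.W_enc + self.b_enc, 0.0)   # sparse features

    def decode(self, f):
        return f @ self.W_dec + self.b_dec

    def steer(self, x, suppress=()):
        """Reconstruct x with the listed SAE feature indices zeroed out."""
        f = self.encode(x)
        f[..., list(suppress)] = 0.0
        return self.decode(f)

sae = SparseAutoencoder()
x = np.ones(8)
active = np.flatnonzero(sae.encode(x))      # indices of firing features
steered = sae.steer(x, suppress=active[:2]) # suppress two active features
```

In the steering setting, `steered` would be written back into the transformer's residual stream in place of the original activation, which is how suppressing a feature (e.g., one that responds to rendered text) changes the model's downstream behavior.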