Guillaume Lajoie

Biography

Guillaume Lajoie is an Associate Professor in the Department of Mathematics and Statistics at Université de Montréal and a Core Academic Member of Mila – Quebec Artificial Intelligence Institute. He holds a Canada-CIFAR AI Research Chair and a Canada Research Chair (CRC) in Neural Computation and Interfacing.

His research is positioned at the intersection of AI and neuroscience, where he develops tools to better understand mechanisms of intelligence common to both biological and artificial systems. His research group's contributions range from advances in multi-scale learning paradigms for large artificial systems to applications in neurotechnology. Dr. Lajoie is actively involved in responsible AI development efforts, seeking to identify guidelines and best practices for the use of AI in research and beyond.

Current Students

Federico Arangath Joseph

Collaborating researcher - ETH Zurich

Stefan Bauer

Independent visiting researcher

Principal supervisor :

Yoshua Bengio

Sangnie Bhardwaj

PhD - Université de Montréal

Co-supervisor :

Hugo Larochelle

Colin Bredenberg

Postdoctorate - Université de Montréal

Co-supervisor :

Blake Richards

Leo Choiniere

PhD - Université de Montréal

Olivier Codol

Postdoctorate - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

Leo Gagnon

PhD - Université de Montréal

Skylar Gu

Research Intern - McGill University

Principal supervisor :

Dhanya Sridhar

Juan Guerra

Master's Research - Polytechnique Montréal

Principal supervisor :

Marco Bonizzato

Nanda Harishankar Krishna

PhD - Université de Montréal

Collaborating researcher - Western Washington University (faculty; assistant prof)

Principal supervisor :

PhD - Université de Montréal

Co-supervisor :

Master's Research - Université de Montréal

Co-supervisor :

Dhanya Sridhar

tejaskasetty@gmail.com

Ximeng Mao

PhD - Université de Montréal

Co-supervisor :

Joelle Pineau

Abdel Mfougouon Njupoun

PhD - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Co-supervisor :

Amine Natik

PhD - Université de Montréal

Co-supervisor :

Guy Wolf

Alexandre Payeur

Collaborating researcher - Université de Montréal

Mohammad Pezeshki

Collaborating researcher

Principal supervisor :

Collaborating Alumni - McGill University

Principal supervisor :

Julia Price

Master's Research - Université de Montréal

Param Raval

Collaborating Alumni - Université de Montréal

Avery Ryoo

Master's Research - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Co-supervisor :

Lune Bellec

Ayesha Vermani

Independent visiting researcher - Champalimaud Centre for the Unknown

Ryan Vogt

Postdoctorate - Université de Montréal

Vivian White

Research Intern - Western Washington University

Co-supervisor :

PhD - Université de Montréal

Machine Learning for the Segmentation of Different Nerve Fibre Activations from Brain-to-body Neural Signals

Blog Posts

Graphical representation of a vagus nerve

May 21, 2025

Param Raval

Olivier Tessier-Larivière

Pascal Fortier-Poisson

Blake Richards

Guillaume Lajoie

June 13, 2024

What Do Synaptic Weight Distributions Tell Us About Learning in the Brain?

Roman Pogodin

Jonathan Cornford

Arna Ghosh

Gauthier Gidel

Guillaume Lajoie

Blake Richards

Publications

Neural networks with optimized single-neuron adaptation uncover biologically plausible regularization

Victor Geadah

Stefan Horoi

Giancarlo Kerg

Guy Wolf

Neurons in the brain have rich and adaptive input-output properties. Features such as heterogeneous f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it is still unclear how brain circuits exploit single-neuron flexibility, and how network-level requirements may have shaped such cellular function. To answer this question, a multi-scaled approach is needed where the computations of single neurons and neural circuits must be considered as a complete system. In this work, we use artificial neural networks to systematically investigate single-neuron input-output adaptive mechanisms, optimized in an end-to-end fashion. Throughout the optimization process, each neuron has the liberty to modify its nonlinear activation function, parametrized to mimic f-I curves of biological neurons, and to learn adaptation strategies to modify activation functions in real-time during a task. We find that such networks show much-improved robustness to noise and changes in input statistics. Importantly, we find that this procedure recovers precise coding strategies found in biological neurons, such as gain scaling and fractional order differentiation/integration. Using tools from dynamical systems theory, we analyze the role of these emergent single-neuron properties and argue that neural diversity and adaptation play an active regularization role, enabling neural circuits to optimally propagate information across time.

2024-12-13

PLOS Computational Biology (published)

Brain-like learning with exponentiated gradients

Jonathan Cornford

Roman Pogodin

Arna Ghosh

Kaiwen Sheng

Brendan A. Bicknell

Olivier Codol

Beverley A. Clark

Blake Richards

2024-10-26

bioRxiv (preprint)

A Complexity-Based Theory of Compositionality

Eric Elmoznino

Thomas Jiralerspong

Yoshua Bengio

Compositionality is believed to be fundamental to intelligence. In humans, it underlies the structure of thought, language, and higher-level reasoning. In AI, compositional representations can enable a powerful form of out-of-distribution generalization, in which a model systematically adapts to novel combinations of known concepts. However, while we have strong intuitions about what compositionality is, there currently exists no formal definition for it that is measurable and mathematical. Here, we propose such a definition, which we call representational compositionality, that accounts for and extends our intuitions about compositionality. The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation. Intuitively, representational compositionality states that a compositional representation satisfies three properties. First, it must be expressive. Second, it must be possible to re-describe the representation as a function of discrete symbolic sequences with re-combinable parts, analogous to sentences in natural language. Third, the function that relates these symbolic sequences to the representation, analogous to semantics in natural language, must be simple. Through experiments on both synthetic and real world data, we validate our definition of compositionality and show how it unifies disparate intuitions from across the literature in both AI and cognitive science. We also show that representational compositionality, while theoretically intractable, can be readily estimated using standard deep learning tools. Our definition has the potential to inspire the design of novel, theoretically-driven models that better capture the mechanisms of compositional thought.

2024-10-18

ArXiv (preprint)

In-context learning and Occam's razor

Eric Elmoznino

Tom Marty

Tejas Kasetty

Leo Gagnon

Sarthak Mittal

Mahan Fathi

Dhanya Sridhar

A central goal of machine learning is generalization. While the No Free Lunch Theorem states that we cannot obtain theoretical guarantees for generalization without further assumptions, in practice we observe that simple models which explain the training data generalize best: a principle called Occam's razor. Despite the need for simple models, most current approaches in machine learning only minimize the training error, and at best indirectly promote simplicity through regularization or architecture design. Here, we draw a connection between Occam's razor and in-context learning: an emergent ability of certain sequence models like Transformers to learn at inference time from past observations in a sequence. In particular, we show that the next-token prediction loss used to train in-context learners is directly equivalent to a data compression technique called prequential coding, and that minimizing this loss amounts to jointly minimizing both the training error and the complexity of the model that was implicitly learned from context. Our theory and the empirical experiments we use to support it not only provide a normative account of in-context learning, but also elucidate the shortcomings of current in-context learning methods, suggesting ways in which they can be improved. We make our code available at https://github.com/3rdCore/PrequentialCode.

2024-10-17

ArXiv (preprint)

openreview.net

Learning Stochastic Rainbow Networks

Vivian White

Muawiz Sajjad Chaudhary

Guy Wolf

Kameron Decker Harris

Random feature models are a popular approach for studying network learning that can capture important behaviors while remaining simpler than traditional training. Guth et al. [2024] introduced “rainbow” networks which model the distribution of trained weights as correlated random features conditioned on previous layer activity. Sampling new weights from distributions fit to learned networks led to similar performance in entirely untrained networks, and the observed weight covariances were found to be low rank. This provided evidence that random feature models could be extended to some networks away from initialization, but White et al. [2024] failed to replicate their results in the deeper ResNet18 architecture. Here we ask whether the rainbow formulation can succeed in deeper networks by directly training a stochastic ensemble of random features, which we call stochastic rainbow networks. At every gradient descent iteration, new weights are sampled for all intermediate layers and features are aligned layer-wise. We find: (1) this approach scales to deeper models, which outperform shallow networks at large widths; (2) ensembling multiple samples from the stochastic model is better than retraining the classifier head; and (3) low-rank parameterization of the learnable weight covariances can approach the accuracy of full-rank networks. This offers more evidence for rainbow and other structured random feature networks as reduced models of deep learning.

2024-10-10

NeurIPS.cc/2024/Workshop/SciForDL (poster)

openreview.net

Brain-like neural dynamics for behavioral control develop through reinforcement learning

Olivier Codol

Nanda H Krishna

M.G. Perich

During development, neural circuits are shaped continuously as we learn to control our bodies. The ultimate goal of this process is to produce neural dynamics that enable the rich repertoire of behaviors we perform with our limbs. What begins as a series of “babbles” coalesces into skilled motor output as the brain rapidly learns to control the body. However, the nature of the teaching signal underlying this normative learning process remains elusive. Here, we test two well-established and biologically plausible theories, supervised learning (SL) and reinforcement learning (RL), that could explain how neural circuits develop the capacity for skilled movements. We trained recurrent neural networks to control a biomechanical model of a primate arm using either SL or RL and compared the resulting neural dynamics to populations of neurons recorded from the motor cortex of monkeys performing the same movements. Intriguingly, only RL-trained networks produced neural activity that matched their biological counterparts in terms of both the geometry and dynamics of population activity. We show that the similarity between RL-trained networks and biological brains depends critically on matching biomechanical properties of the limb. We then demonstrate that monkeys and RL-trained networks, but not SL-trained networks, show a strikingly similar capacity for robust short-term behavioral adaptation to a movement perturbation, indicating a fundamental and general commonality in the neural control policy. Together, our results support the hypothesis that neural dynamics for behavioral control emerge through a process akin to reinforcement learning. The resulting neural circuits offer numerous advantages for adaptable behavioral control over simpler and more efficient learning rules and expand our understanding of how developmental processes shape neural dynamics.

2024-10-06

bioRxiv (preprint)

The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms

Colin Bredenberg

Fabrice Normandin

Blake Richards

2024-09-30

bioRxiv (preprint)

Latent Representation Learning for Multimodal Brain Activity Translation

Arman Afrasiyabi

Dhananjay Bhaskar

Erica Lindsey Busch

Laurent Caplette

Rahul Singh

Nicholas B Turk-Browne

Smita Krishnaswamy

Neuroscience employs diverse neuroimaging techniques, each offering distinct insights into brain activity, from electrophysiological recordings such as EEG, which have high temporal resolution, to hemodynamic modalities such as fMRI, which have increased spatial precision. However, integrating these heterogeneous data sources remains a challenge, which limits a comprehensive understanding of brain function. We present the Spatiotemporal Alignment of Multimodal Brain Activity (SAMBA) framework, which bridges the spatial and temporal resolution gaps across modalities by learning a unified latent space free of modality-specific biases. SAMBA introduces a novel attention-based wavelet decomposition for spectral filtering of electrophysiological recordings, graph attention networks to model functional connectivity between functional brain units, and recurrent layers to capture temporal autocorrelations in brain signals. We show that the training of SAMBA, aside from achieving translation, also learns a rich representation of brain information processing. We showcase this by classifying external stimuli driving brain activity from the representation learned in hidden layers of SAMBA, paving the way for broad downstream applications in neuroscience research and clinical contexts.

2024-09-27

ArXiv (preprint)

Accelerating Training with Neuron Interaction and Nowcasting Networks

Boris Knyazev

Abhinav Moudgil

Eugene Belilovsky

Simon Lacoste-Julien

Neural network training can be accelerated when a learnable update rule is used in lieu of classic adaptive optimizers (e.g. Adam). However, learnable update rules can be costly and unstable to train and use. Recently, Jang et al. (2023) proposed a simpler approach to accelerate training based on weight nowcaster networks (WNNs). In their approach, Adam is used for most of the optimization steps and periodically, only every few steps, a WNN nowcasts (predicts the near-future values of) the parameters. We improve WNNs by proposing neuron interaction and nowcasting (NiNo) networks. In contrast to WNNs, NiNo leverages neuron connectivity and graph neural networks to more accurately nowcast parameters. We further show that in some networks, such as Transformers, modeling neuron connectivity accurately is challenging. We address this and other limitations, which allows NiNo to accelerate Adam training by up to 50% in vision and language tasks.

2024-09-06

ArXiv (preprint)