Guillaume Lajoie

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Visiting Researcher, Google
Research Topics
Computational Neuroscience
Deep Learning
Dynamical Systems
Optimization
Recurrent Neural Networks
Representation Learning

Biography

Guillaume Lajoie is an Associate Professor in the Department of Mathematics and Statistics at Université de Montréal and a Core Academic Member of Mila – Quebec Artificial Intelligence Institute. He holds a Canada CIFAR AI Chair and a Canada Research Chair (CRC) in Neural Computation and Interfacing. He is also a Health Research Scholar of the Fonds de recherche du Québec.

Guillaume Lajoie was previously a postdoctoral fellow at the Max Planck Institute for Dynamics and Self-Organization in Germany and at the University of Washington’s Institute for Neuroengineering. He obtained his PhD from the Department of Applied Mathematics at the University of Washington (Seattle).

His research is positioned at the intersection of AI and neuroscience, where he develops tools to better understand mechanisms of intelligence common to both biological and artificial systems. His research group's contributions range from advances in multi-scale learning paradigms for large artificial systems to applications in neurotechnology. Dr. Lajoie is actively involved in responsible AI development efforts, seeking to identify guidelines and best practices for the use of AI in research and beyond.

Recent work has focused on the development of architectural inductive biases for information propagation in recurrent networks, as well as the development of algorithms and models for bidirectional brain-machine interface optimization.

Current Students

Independent visiting researcher
Principal supervisor :
PhD - Université de Montréal
Co-supervisor :
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Master's Research - Polytechnique Montréal
Principal supervisor :
Master's Research - Polytechnique Montréal
Principal supervisor :
Collaborating researcher - Western Washington University (faculty; assistant prof)
Principal supervisor :
Master's Research - Université de Montréal
Co-supervisor :
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Postdoctorate - McGill University
Principal supervisor :
Collaborating Alumni - Université de Montréal
Master's Research - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - McGill University
Research Intern - Western Washington University
Co-supervisor :
PhD - Université de Montréal

Publications

Synaptic Weight Distributions Depend on the Geometry of Plasticity
Roman Pogodin
Jonathan Cornford
Arna Ghosh
A growing literature in computational neuroscience leverages gradient descent and learning algorithms that approximate it to study synaptic plasticity in the brain. However, the vast majority of this work ignores a critical underlying assumption: the choice of distance for synaptic changes - i.e. the geometry of synaptic plasticity. Gradient descent assumes that the distance is Euclidean, but many other distances are possible, and there is no reason that biology necessarily uses Euclidean geometry. Here, using the theoretical tools provided by mirror descent, we show that the distribution of synaptic weights will depend on the geometry of synaptic plasticity. We use these results to show that experimentally-observed log-normal weight distributions found in several brain areas are not consistent with standard gradient descent (i.e. a Euclidean geometry), but rather with non-Euclidean distances. Finally, we show that it should be possible to experimentally test for different synaptic geometries by comparing synaptic weight distributions before and after learning. Overall, our work shows that the current paradigm in theoretical work on synaptic plasticity that assumes Euclidean synaptic geometry may be misguided and that it should be possible to experimentally determine the true geometry of synaptic plasticity in the brain.
Personalized inference for neurostimulation with meta-learning: a case study of vagus nerve stimulation
Ximeng Mao
Yao-Chuan Chang
Stavros Zanos
A benchmark of individual auto-regressive models in a massive fMRI dataset
François Paugam
Basile Pinsard
Dense functional magnetic resonance imaging datasets open new avenues to create auto-regressive models of brain activity. Individual idiosyncrasies are obscured by group models, but can be captured by purely individual models given sufficient amounts of training data. In this study, we compared several deep and shallow individual models on the temporal auto-regression of BOLD time series recorded during a natural video watching task. The best performing models were then analyzed in terms of their data requirements and scaling, subject specificity and the space-time structure of their predicted dynamics. We found the Chebnets, a type of graph convolutional neural network, to be best suited for temporal BOLD auto-regression, closely followed by linear models. Chebnets demonstrated an increase in performance with increasing amounts of data, with no complete saturation at 9 h of training data. Good generalization to other kinds of video stimuli and to resting state data marked the Chebnets’ ability to capture intrinsic brain dynamics rather than only stimulus-specific autocorrelation patterns. Significant subject specificity was found at short prediction time lags. The Chebnets were found to capture lower frequencies at longer prediction time lags, and the spatial correlations in predicted dynamics were found to match traditional functional connectivity networks. Overall, these results demonstrate that large individual fMRI datasets can be used to efficiently train purely individual auto-regressive models of brain activity, and that massive amounts of individual data are required to do so. The excellent performance of the Chebnets likely reflects their ability to combine spatial and temporal interactions on large time scales at a low complexity cost. The non-linearities of the models did not appear as a key advantage. In fact, surprisingly, linear versions of the Chebnets appeared to outperform the original nonlinear ones.
Individual temporal auto-regressive models have the potential to improve the predictability of the BOLD signal. This study is based on a massive, publicly-available dataset, which can serve for future benchmarks of individual auto-regressive modeling.
Sufficient conditions for offline reactivation in recurrent neural networks
Nanda H Krishna
Colin Bredenberg
Daniel Levenstein
During periods of quiescence, such as sleep, neural activity in many brain circuits resembles that observed during periods of task engagement. However, the precise conditions under which task-optimized networks can autonomously reactivate the same network states responsible for online behavior are poorly understood. In this study, we develop a mathematical framework that outlines sufficient conditions for the emergence of neural reactivation in circuits that encode features of smoothly varying stimuli. We demonstrate mathematically that noisy recurrent networks optimized to track environmental state variables using change-based sensory information naturally develop denoising dynamics, which, in the absence of input, cause the network to revisit state configurations observed during periods of online activity. We validate our findings using numerical experiments on two canonical neuroscience tasks: spatial position estimation based on self-motion cues, and head direction estimation based on angular velocity cues. Overall, our work provides theoretical support for modeling offline reactivation as an emergent consequence of task optimization in noisy neural circuits.
Neural manifolds and learning regimes in neural-interface tasks
Alexandre Payeur
Amy L. Orsborn
Large language models: What could they do for neurology?
Discrete, compositional, and symbolic representations through attractor dynamics
Andrew Nam
Eric Elmoznino
Nikolay Malkin
Chen Sun
Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.
A Unified, Scalable Framework for Neural Population Decoding
Mehdi Azabou
Vinam Arora
Venkataramana Ganesh
Ximeng Mao
Santosh B Nachimuthu
Michael Jacob Mendelson
Eva L Dyer
Our ability to use deep learning approaches to decipher neural activity would likely benefit from greater scale, in terms of both the model size and the datasets. However, the integration of many neural recordings into one unified model is challenging, as each recording contains the activity of different neurons from different individual animals. In this paper, we introduce a training framework and architecture designed to model the population dynamics of neural activity across diverse, large-scale neural recordings. Our method first tokenizes individual spikes within the dataset to build an efficient representation of neural events that captures the fine temporal structure of neural activity. We then employ cross-attention and a PerceiverIO backbone to further construct a latent tokenization of neural population activities. Utilizing this architecture and training framework, we construct a large-scale multi-session model trained on large datasets from seven nonhuman primates, spanning over 158 different sessions of recording from over 27,373 neural units and over 100 hours of recordings. In a number of different tasks, we demonstrate that our pretrained model can be rapidly adapted to new, unseen sessions with unspecified neuron correspondence, enabling few-shot performance with minimal labels. This work presents a powerful new approach for building deep learning tools to analyze neural data and stakes out a clear path to training at scale for neural decoding models.
Online Bayesian Optimization of Nerve Stimulation
Lorenz Wernisch
Tristan Edwards
Antonin Berthon
Olivier Tessier-Lariviere
Elvijs Sarkans
Myrta Stoukidi
Pascal Fortier-Poisson
Max Pinkney
Michael Thornton
Catherine Hanley
Susannah Lee
Joel Jennings
Ben Appleton
Philip Garsed
Bret Patterson
Will Buttinger
Samuel Gonshaw
Matjaž Jakopec
Sudhakaran Shunmugam
Jorin Mamen
Aleksi Tukiainen
Oliver Armitage
Emil Hewage
Neural networks with optimized single-neuron adaptation uncover biologically plausible regularization
Victor Geadah
Stefan Horoi
Giancarlo Kerg
Neurons in the brain have rich and adaptive input-output properties. Features such as heterogeneous f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it is still unclear how brain circuits exploit single-neuron flexibility, and how network-level requirements may have shaped such cellular function. To answer this question, a multi-scaled approach is needed where the computations of single neurons and neural circuits must be considered as a complete system. In this work, we use artificial neural networks to systematically investigate single-neuron input-output adaptive mechanisms, optimized in an end-to-end fashion. Throughout the optimization process, each neuron has the liberty to modify its nonlinear activation function, parametrized to mimic f-I curves of biological neurons, and to learn adaptation strategies to modify activation functions in real-time during a task. We find that such networks show much-improved robustness to noise and changes in input statistics. Importantly, we find that this procedure recovers precise coding strategies found in biological neurons, such as gain scaling and fractional order differentiation/integration. Using tools from dynamical systems theory, we analyze the role of these emergent single-neuron properties and argue that neural diversity and adaptation play an active regularization role, enabling neural circuits to optimally propagate information across time.
Flexible Phase Dynamics for Bio-Plausible Contrastive Learning
Ezekiel Williams
Colin Bredenberg
Exploring Exchangeable Dataset Amortization for Bayesian Posterior Inference
Sarthak Mittal
Niels Leif Bracher
Priyank Jaini
Marcus A Brubaker
Bayesian inference provides a natural way of incorporating uncertainties and different underlying theories when making predictions or analyzing complex systems. However, it requires computationally expensive routines for approximation, which have to be re-run when new data is observed and are thus infeasible to efficiently scale and reuse. In this work, we look at the problem from the perspective of amortized inference to obtain posterior parameter distributions for known probabilistic models. We propose a neural network-based approach that can handle exchangeable observations and amortize over datasets to convert the problem of Bayesian posterior inference into a single forward pass of a network. Our empirical analyses explore various design choices for amortized inference by comparing: (a) our proposed variational objective with forward KL minimization, (b) permutation-invariant architectures like Transformers and DeepSets, and (c) parameterizations of posterior families like diagonal Gaussian and Normalizing Flows. Through our experiments, we successfully apply amortization techniques to estimate the posterior distributions for different domains solely through inference.