Guillaume Lajoie

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Visiting Researcher, Google
Research Topics
Computational Neuroscience
Deep Learning
Dynamical Systems
Optimization
Recurrent Neural Networks
Representation Learning

Biography

Guillaume Lajoie is an Associate Professor in the Department of Mathematics and Statistics at Université de Montréal and a Core Academic Member of Mila – Quebec Artificial Intelligence Institute. He holds a Canada CIFAR AI Chair and a Canada Research Chair (CRC) in Neural Computation and Interfacing. He also holds a Health Research Scholar award from the Fonds de recherche du Québec.

Guillaume Lajoie was previously a postdoctoral fellow at the Max Planck Institute for Dynamics and Self-Organization in Germany and at the University of Washington’s Institute for Neuroengineering. He obtained his PhD from the Department of Applied Mathematics at the University of Washington (Seattle).

His research is positioned at the intersection of AI and neuroscience, where he develops tools to better understand mechanisms of intelligence common to both biological and artificial systems. His research group's contributions range from advances in multi-scale learning paradigms for large artificial systems to applications in neurotechnology. Dr. Lajoie is actively involved in responsible AI development efforts, seeking to identify guidelines and best practices for the use of AI in research and beyond.

Recent work has focused on the development of architectural inductive biases for information propagation in recurrent networks, as well as the development of algorithms and models for bidirectional brain-machine interface optimization.

Current Students

Independent visiting researcher
Principal supervisor :
PhD - Université de Montréal
Co-supervisor :
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Master's Research - Polytechnique Montréal
Principal supervisor :
Master's Research - Polytechnique Montréal
Principal supervisor :
PhD - Université de Montréal
Collaborating researcher - Western Washington University (faculty; assistant prof)
Principal supervisor :
Professional Master's - Université de Montréal
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Postdoctorate - McGill University
Principal supervisor :
Master's Research - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - McGill University
Research Intern - Western Washington University
Co-supervisor :
PhD - Université de Montréal

Publications

Expressivity of Neural Networks with Random Weights and Learned Biases
Ezekiel Williams
Avery Hee-Woon Ryoo
Thomas Jiralerspong
Alexandre Payeur
Luca Mazzucato
Using neural biomarkers to personalize dosing of vagus nerve stimulation
Antonin Berthon
Lorenz Wernisch
Myrta Stoukidi
Michael Thornton
Olivier Tessier-Lariviere
Pascal Fortier-Poisson
Jorin Mamen
Max Pinkney
Susannah Lee
Elvijs Sarkans
Luca Annecchino
Ben Appleton
Philip Garsed
Bret Patterson
Samuel Gonshaw
Matjaž Jakopec
Sudhakaran Shunmugam
Tristan Edwards
Aleksi Tukiainen
Joel Jennings
Emil Hewage
Oliver Armitage
Expressivity of Neural Networks with Fixed Weights and Learned Biases
Ezekiel Williams
Avery Hee-Woon Ryoo
Thomas Jiralerspong
Alexandre Payeur
Luca Mazzucato
Does learning the right latent variables necessarily improve in-context learning?
Sarthak Mittal
Eric Elmoznino
Léo Gagnon
Sangnie Bhardwaj
Large autoregressive models like Transformers can solve tasks through in-context learning (ICL) without learning new weights, suggesting avenues for efficiently solving new tasks. For many tasks, e.g., linear regression, the data factorizes: examples are independent given a task latent that generates the data, e.g., linear coefficients. While an optimal predictor leverages this factorization by inferring task latents, it is unclear if Transformers implicitly do so or if they instead exploit heuristics and statistical shortcuts enabled by attention layers. Both scenarios have inspired active ongoing work. In this paper, we systematically investigate the effect of explicitly inferring task latents. We minimally modify the Transformer architecture with a bottleneck designed to prevent shortcuts in favor of more structured solutions, and then compare performance against standard Transformers across various ICL tasks. Contrary to intuition and some recent works, we find little discernible difference between the two; biasing towards task-relevant latent variables does not lead to better out-of-distribution performance, in general. Curiously, we find that while the bottleneck effectively learns to extract latent task variables from context, downstream processing struggles to utilize them for robust prediction. Our study highlights the intrinsic limitations of Transformers in achieving structured ICL solutions that generalize, and shows that while inferring the right latents aids interpretability, it is not sufficient to alleviate this problem.
Assistive sensory-motor perturbations influence learned neural representations
Pavithra Rajeswaran
Alexandre Payeur
Amy L. Orsborn
Task errors are used to learn and refine motor skills. We investigated how task assistance influences learned neural representations using Brain-Computer Interfaces (BCIs), which map neural activity into movement via a decoder. We analyzed motor cortex activity as monkeys practiced BCI with a decoder that adapted to improve or maintain performance over days. Population dimensionality remained constant or increased with learning, counter to trends with non-adaptive BCIs. Yet, over time, task information was contained in a smaller subset of neurons or population modes. Moreover, task information was ultimately stored in neural modes that occupied a small fraction of the population variance. An artificial neural network model suggests the adaptive decoders contribute to forming these compact neural representations. Our findings show that assistive decoders manipulate error information used for long-term learning computations, like credit assignment, which informs our understanding of motor learning and has implications for designing real-world BCIs.
Online Bayesian optimization of vagus nerve stimulation.
Lorenz Wernisch
Tristan Edwards
Antonin Berthon
Olivier Tessier-Lariviere
Elvijs Sarkans
Myrta Stoukidi
Pascal Fortier-Poisson
Max Pinkney
Michael Thornton
Catherine Hanley
Susannah Lee
Joel Jennings
Ben Appleton
Philip Garsed
Bret Patterson
Buttinger Will
Samuel Gonshaw
Matjaž Jakopec
Sudhakaran Shunmugam
Jorin Mamen
Aleksi Tukiainen
Oliver Armitage
Emil Hewage
Objective. In bioelectronic medicine, neuromodulation therapies induce neural signals to the brain or organs, modifying their function. Stimulation devices capable of triggering exogenous neural signals using electrical waveforms require a complex and multi-dimensional parameter space to control such waveforms. Determining the best combination of parameters (waveform optimization or dosing) for treating a particular patient's illness is therefore challenging. Comprehensive parameter searching for an optimal stimulation effect is often infeasible in a clinical setting due to the size of the parameter space. Restricting this space, however, may lead to suboptimal therapeutic results, reduced responder rates, and adverse effects. Approach. As an alternative to a full parameter search, we present a flexible machine learning, data acquisition, and processing framework for optimizing neural stimulation parameters, requiring as few steps as possible using Bayesian optimization. This optimization builds a model of the neural and physiological responses to stimulations, enabling it to optimize stimulation parameters and provide estimates of the accuracy of the response model. The vagus nerve innervates, among other thoracic and visceral organs, the heart, thus controlling heart rate, making it an ideal candidate for demonstrating the effectiveness of our approach. Main results. The efficacy of our optimization approach was first evaluated on simulated neural responses, then applied to vagus nerve stimulation intraoperatively in porcine subjects. Optimization converged quickly on parameters achieving target heart rates and optimizing neural B-fiber activations despite high intersubject variability. Significance. An optimized stimulation waveform was achieved in real time with far fewer stimulations than required by alternative optimization strategies, thus minimizing exposure to side effects. Uncertainty estimates helped avoid stimulations outside a safe range. Our approach shows that a complex set of neural stimulation parameters can be optimized in real time for a patient to achieve personalized precision dosing.
Learning and Aligning Structured Random Feature Networks
Vivian White
Muawiz Sajjad Chaudhary
Kameron Decker Harris
Artificial neural networks (ANNs) are considered "black boxes" due to the difficulty of interpreting their learned weights. While choosing the best features is not well understood, random feature networks (RFNs) and wavelet scattering ground some ANN learning mechanisms in function space with tractable mathematics. Meanwhile, the genetic code has evolved over millions of years, shaping the brain to develop variable neural circuits with reliable structure that resemble RFNs. We explore a similar approach, embedding neuro-inspired, wavelet-like weights into multilayer RFNs. These can outperform scattering and have kernels that describe their function space at large width. We build learnable and deeper versions of these models where we can optimize separate spatial and channel covariances of the convolutional weight distributions. We find that these networks can perform comparably to conventional ANNs while dramatically reducing the number of trainable parameters. Channel covariances are most influential, and both weight and activation alignment are needed for classification performance. Our work outlines how neuro-inspired configurations may lead to better performance in key cases and offers a potentially tractable reduced model for ANN learning.
Sources of richness and ineffability for phenomenally conscious states
Xu Ji
Eric Elmoznino
George Deane
Axel Constant
Jonathan Simon
Gaussian-process-based Bayesian optimization for neurostimulation interventions in rats
Léo Choinière
Rose Guay-Hottin
Rémi Picard
Numa Dancause
Connectome-based reservoir computing with the conn2res toolbox
Laura E. Suárez
Agoston Mihalik
Filip Milisav
Kenji Marshall
Mingze Li
Petra E. Vértes
Bratislav Mišić
Amortizing intractable inference in large language models
Edward J Hu
Moksh J. Jain
Eric Elmoznino
Younesse Kaddar
Nikolay Malkin
Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address this limitation by using amortized Bayesian inference to sample from these intractable posteriors. Such amortization is algorithmically achieved by fine-tuning LLMs via diversity-seeking reinforcement learning algorithms: generative flow networks (GFlowNets). We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training and reward-maximizing policy optimization. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem and demonstrate that our approach enables data-efficient adaptation of LLMs to tasks that require multi-step rationalization and tool use.