
Guillaume Lajoie

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Visiting Researcher, Google
Research Topics
Representation Learning
Deep Learning
Cognition
AI in Health
AI for Science
Computational Neuroscience
Optimization
Reasoning
Recurrent Neural Networks
Dynamical Systems

Biography

Guillaume Lajoie is an associate professor in the Department of Mathematics and Statistics (DMS) at Université de Montréal and a core academic member of Mila (Quebec Artificial Intelligence Institute). He holds a Canada CIFAR AI Chair (CCAI) as well as a Canada Research Chair (CRC) in Neural Computation and Interfacing.

His research sits at the intersection of AI and neuroscience, where he develops tools to better understand the mechanisms of intelligence shared by biological and artificial systems. His research group's contributions range from advances in multi-scale learning paradigms for large artificial systems to applications in neurotechnology. Dr. Lajoie is actively involved in responsible AI development efforts, seeking to identify guidelines and best practices for the use of AI in research and beyond.

Current Students

Research Collaborator - ETH Zurich
Independent Visiting Researcher
Principal supervisor:
PhD - UdeM
Postdoctorate - UdeM
PhD - UdeM
Postdoctorate - UdeM
PhD - UdeM
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - UdeM
Research Master's - Polytechnique
Research Collaborator - Western Washington University (faculty; assistant prof)
Research Collaborator - UdeM
PhD - UdeM
PhD - UdeM
Co-supervisor:
PhD - UdeM
PhD - UdeM
Research Collaborator - UdeM
Research Collaborator
Alumni Collaborator - UdeM
Research Master's - UdeM
PhD - UdeM
PhD - McGill
Postdoctorate - UdeM
Research Intern - Western Washington University

Publications

Goal-driven optimization of single-neuron properties in artificial networks reveals regularization role of neural diversity and adaptation in the brain
Victor Geadah
Stefan Horoi
Giancarlo Kerg
Neurons in the brain have rich and adaptive input-output properties. Features such as diverse f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it is still unclear how brain circuits exploit single neuron flexibility, and how network-level requirements may have shaped such cellular function. To answer this question, a multi-scaled approach is needed where the computations of single neurons and of neural circuits must be considered as a complete system. In this work, we use artificial neural networks to systematically investigate single neuron input-output adaptive mechanisms, optimized in an end-to-end fashion. Throughout the optimization process, each neuron has the liberty to modify its nonlinear activation function, parametrized to mimic f-I curves of biological neurons, and to learn adaptation strategies to modify activation functions in real-time during a task. We find that such networks show much-improved robustness to noise and changes in input statistics. Importantly, we find that this procedure recovers precise coding strategies found in biological neurons, such as gain scaling and fractional order differentiation/integration. Using tools from dynamical systems theory, we analyze the role of these emergent single neuron properties and argue that neural diversity and adaptation play an active regularization role that enables neural circuits to optimally propagate information across time.
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George
Aristide Baratin
Among attempts at giving a theoretical account of the success of deep neural networks, a recent line of work has identified a so-called 'lazy' training regime in which the network can be well approximated by its linearization around initialization. Here we investigate the comparative effect of the lazy (linear) and feature learning (non-linear) regimes on subgroups of examples based on their difficulty. Specifically, we show that easier examples are given more weight in feature learning mode, resulting in faster training compared to more difficult ones. In other words, the non-linear dynamics tends to sequentialize the learning of examples of increasing difficulty. We illustrate this phenomenon across different ways to quantify example difficulty, including c-score, label noise, and in the presence of easy-to-learn spurious correlations. Our results reveal a new understanding of how deep networks prioritize resources across example difficulty.
Is a Modular Architecture Enough?
Inspired by human cognition, machine learning systems are gradually revealing advantages of sparser and more modular architectures. Recent work demonstrates that not only do some modular architectures generalize well, but they also lead to better out-of-distribution generalization, scaling properties, learning speed, and interpretability. A key intuition behind the success of such systems is that the data generating system for most real-world settings is considered to consist of sparse modular connections, and endowing models with similar inductive biases will be helpful. However, the field has been lacking in a rigorous quantitative assessment of such systems because these real-world data distributions are complex and unknown. In this work, we provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems. In doing so, we propose evaluation metrics that highlight the benefits of modularity, the regimes in which these benefits are substantial, as well as the sub-optimality of current end-to-end learned modular systems as opposed to their claimed potential.
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sékou-Oumar Kaba
We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks. Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task, despite the presence of other predictive features that fail to be discovered. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks. Using tools from Dynamical Systems theory, we identify simple properties of learning dynamics during gradient descent that lead to this imbalance, and prove that such a situation can be expected given certain statistical structure in training data. Based on our proposed formalism, we develop guarantees for a novel regularization method aimed at decoupling feature learning dynamics, improving accuracy and robustness in cases hindered by gradient starvation. We illustrate our findings with simple and real-world out-of-distribution (OOD) generalization experiments.
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning
Nan Rosemary Ke
Aniket Rajiv Didolkar
Sarthak Mittal
Anirudh Goyal
Stefan Bauer
Danilo Jimenez Rezende
Michael Curtis Mozer
Inducing causal relationships from observations is a classic problem in machine learning. Most work in causality starts from the premise that the causal variables themselves are observed. However, for AI agents such as robots trying to make sense of their environment, the only observables are low-level variables like pixels in images. To generalize well, an agent must induce high-level variables, particularly those which are causal or are affected by causal variables. A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure. However, we note that existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs which are impossible to manipulate parametrically (e.g., number of nodes, sparsity, causal chain length, etc.). In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them. In order to systematically probe the ability of methods to identify these variables and structures, we design a suite of benchmarking RL environments. We evaluate various representation learning algorithms from the literature and find that explicitly incorporating structure and modularity in models can help causal induction in model-based reinforcement learning.
Learning function from structure in neuromorphic networks
Laura E. Suárez
Bratislav Mišić
Learning Brain Dynamics With Coupled Low-Dimensional Nonlinear Oscillators and Deep Recurrent Networks
Germán Abrevaya
Aleksandr Y. Aravkin
Peng Zheng
Jean-Christophe Gagnon-Audet
James Kozloski
Pablo Polosecki
David Cox
Silvina Ponce Dawson
Guillermo Cecchi
Many natural systems, especially biological ones, exhibit complex multivariate nonlinear dynamical behaviors that can be hard to capture by linear autoregressive models. On the other hand, generic nonlinear models such as deep recurrent neural networks often require large amounts of training data, not always available in domains such as brain imaging; also, they often lack interpretability. Domain knowledge about the types of dynamics typically observed in such systems, such as a certain type of dynamical systems models, could complement purely data-driven techniques by providing a good prior. In this work, we consider a class of ordinary differential equation (ODE) models known as van der Pol (VDP) oscillators and evaluate their ability to capture a low-dimensional representation of neural activity measured by different brain imaging modalities, such as calcium imaging (CaI) and fMRI, in different living organisms: larval zebrafish, rat, and human. We develop a novel and efficient approach to the nontrivial problem of parameter estimation for a network of coupled dynamical systems from multivariate data and demonstrate that the resulting VDP models are both accurate and interpretable, as VDP's coupling matrix reveals anatomically meaningful excitatory and inhibitory interactions across different brain subsystems. VDP outperforms linear autoregressive models (VAR) in terms of both the data fit accuracy and the quality of insight provided by the coupling matrices and often tends to generalize better to unseen data when predicting future brain activity, being comparable to and sometimes better than the recurrent neural networks (LSTMs). Finally, we demonstrate that our (generative) VDP model can also serve as a data-augmentation tool leading to marked improvements in predictive accuracy of recurrent neural networks. Thus, our work contributes to both basic and applied dimensions of neuroimaging: gaining scientific insights and improving brain-based predictive models, an area of potentially high practical importance in clinical diagnosis and neurotechnology.
PNS-GAN: Conditional Generation of Peripheral Nerve Signals in the Wavelet Domain via Adversarial Networks
Olivier Tessier-Lariviere
Luke Y. Prince
Pascal Fortier-Poisson
Lorenz Wernisch
Oliver Armitage
Emil Hewage
Simulated datasets of neural recordings are a crucial tool in neural engineering for testing the ability of decoding algorithms to recover known ground-truth. In this work, we introduce PNS-GAN, a generative adversarial network capable of producing realistic nerve recordings conditioned on physiological biomarkers. PNS-GAN operates in the wavelet domain to preserve both the timing and frequency of neural events with high resolution. PNS-GAN generates sequences of scaleograms from noise using a recurrent neural network and 2D transposed convolution layers. PNS-GAN discriminates over stacks of scaleograms with a network of 3D convolution layers. We find that our generated signal reproduces a number of characteristics of the real signal, including similarity in a canonical time-series feature-space, and contains physiologically related neural events including respiration modulation and similar distributions of afferent and efferent signalling.
Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth Mover's Distance
Alexander Tong
Guillaume Huguet
Dennis Shung
Amine Natik
Manik Kuchroo
In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further
Implicit Regularization in Deep Learning: A View from Function Space
Aristide Baratin
Thomas George
César Laurent
We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a possible regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al., along a small number of task-relevant directions. By extrapolating a new analysis of Rademacher complexity bounds in linear models, we propose and study a new heuristic complexity measure for neural networks which captures this phenomenon, in terms of sequences of tangent kernel classes along the learning trajectories.