Guillaume Lajoie

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université de Montréal, Department of Mathematics and Statistics
Consultant, n-a

Biography

Guillaume Lajoie is an assistant professor in the Department of Mathematics and Statistics (DMS) at Université de Montréal and a core academic member of Mila - Quebec Artificial Intelligence Institute. He also holds a Research Scholar award from the Fonds de recherche du Québec - Santé (FRQS). Previously, he was a postdoctoral researcher at the Max Planck Institute for Dynamics and Self-Organization and at the University of Washington Institute for Neuroengineering. He obtained his PhD from the Department of Applied Mathematics at the University of Washington (Seattle).

His research, at the intersection of AI and neuroscience, addresses questions about the dynamics and computations of neural networks, with some applications to neuroengineering. His recent work includes the development of inductive biases for better information propagation in recurrent networks, as well as the development of algorithms to optimize bidirectional brain-machine interfaces.

Current Students

Research collaborator - Université de Montréal
PhD - Université de Montréal
Co-supervisor:
Postdoctorate - Université de Montréal
Co-supervisor:
PhD - Université de Montréal
Principal supervisor:
PhD - Université de Montréal
Research Master's - Université de Montréal
Principal supervisor:
Research collaborator - Université de Montréal
PhD - Université de Montréal
Principal supervisor:
Research Master's - Polytechnique Montréal
Principal supervisor:
PhD - McGill University
PhD - Université de Montréal
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor:
PhD - Université de Montréal
Co-supervisor:
Postdoctorate - McGill University
Principal supervisor:
Research Master's - Polytechnique Montréal
Principal supervisor:
PhD - Université de Montréal
Co-supervisor:
PhD - Université de Montréal
Co-supervisor:
Independent visiting researcher
Principal supervisor:
Professional Master's - Université de Montréal
Research collaborator - Polytechnique Montréal
Principal supervisor:
Research intern - Western Washington University
Co-supervisor:
PhD - Université de Montréal
Co-supervisor:

Publications

LEAD: Min-Max Optimization from a Physical Perspective
Reyhane Askari Hemmat
Amartya Mitra
Adversarial formulations have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence. In this paper, we show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve optimization dynamics. Inspired by the physical framework, we propose LEAD, an optimizer for min-max games. Next, using Lyapunov stability theory from dynamical systems as well as spectral analysis, we study LEAD’s convergence properties in continuous and discrete time settings for a class of quadratic min-max games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training.
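The rotational obstacle mentioned in this abstract can be seen on a toy problem. The sketch below is not LEAD and is not taken from the paper; it assumes the simplest bilinear game f(x, y) = x * y (x minimizes, y maximizes) and plain simultaneous gradient descent-ascent. The function name `simultaneous_gda` and the step size are illustrative choices. On this game each update rotates the iterate around the Nash equilibrium at (0, 0) and scales its distance by sqrt(1 + lr^2), so the iterates spiral outward instead of converging.

```python
import math

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Simultaneous gradient descent-ascent on the toy game f(x, y) = x * y.

    Returns the distance to the equilibrium (0, 0) after every step.
    """
    radii = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x          # df/dx = y, df/dy = x
        # x descends on f, y ascends on f, both using the old iterate.
        x, y = x - lr * grad_x, y + lr * grad_y
        radii.append(math.hypot(x, y))
    return radii

radii = simultaneous_gda(1.0, 1.0)
growth_per_step = radii[1] / radii[0]  # equals sqrt(1 + lr**2) on this game
```

The distance grows at every step, which is the kind of behaviour that optimizers designed for min-max games aim to damp.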
conn2res: A toolbox for connectome-based reservoir computing
Laura E. Suárez
Agoston Mihalik
Filip Milisav
Kenji Marshall
Mingze Li
Petra E. Vértes
Bratislav Mišić
Autonomous optimization of neuroprosthetic stimulation parameters that drive the motor cortex and spinal cord outputs in rats and monkeys
Rose Guay Hottin
Sandrine L. Côté
Elena Massai
Léo Choinière
Uzay Macar
Samuel Laferrière
Parikshat Sirpal
Stephan Quessy
Marina Martinez
Numa Dancause
Multi-view manifold learning of human brain state trajectories
Erica Lindsey Busch
Je-chun Huang
Andrew Benz
Tom Wallenstein
Smita Krishnaswamy
Nicholas Turk-Browne
Transfer Entropy Bottleneck: Learning Sequence to Sequence Information Transfer
Damjan Kalajdzievski
Ximeng Mao
Pascal Fortier-Poisson
When presented with a data stream of two statistically dependent variables, predicting the future of one of the variables (the target stream) can benefit from information about both its history and the history of the other variable (the source stream). For example, fluctuations in temperature at a weather station can be predicted using both temperatures and barometric readings. However, a challenge when modelling such data is that it is easy for a neural network to rely on the greatest joint correlations within the target stream, which may ignore a crucial but small information transfer from the source to the target stream. As well, there are often situations where the target stream may have previously been modelled independently and it would be useful to use that model to inform a new joint model. Here, we develop an information bottleneck approach for conditional learning on two dependent streams of data. Our method, which we call Transfer Entropy Bottleneck (TEB), allows one to learn a model that bottlenecks the directed information transferred from the source variable to the target variable, while quantifying this information transfer within the model. As such, TEB provides a useful new information bottleneck approach for modelling two statistically dependent streams of data in order to make predictions about one of them.
Use of Invasive Brain-Computer Interfaces in Pediatric Neurosurgery: Technical and Ethical Considerations
David Bergeron
Christian Iorio-Morin
Nathalie Orr Gaucher
Éric Racine
Alexander G. Weil
Steerable Equivariant Representation Learning
Sangnie Bhardwaj
Willie McClinton
Tongzhou Wang
Chen Sun
Phillip Isola
Dilip Krishnan
Pre-trained deep image representations are useful for post-training tasks such as classification through transfer learning, image retrieval, and object detection. Data augmentations are a crucial aspect of pre-training robust representations in both supervised and self-supervised settings. Data augmentations explicitly or implicitly promote invariance in the embedding space to the input image transformations. This invariance reduces generalization to those downstream tasks which rely on sensitivity to these particular data augmentations. In this paper, we propose a method of learning representations that are instead equivariant to data augmentations. We achieve this equivariance through the use of steerable representations. Our representations can be manipulated directly in embedding space via learned linear maps. We demonstrate that our resulting steerable and equivariant representations lead to better performance on transfer learning and robustness: e.g. we improve linear probe top-1 accuracy by between 1% and 3% for transfer, and ImageNet-C accuracy by up to 3.4%. We further show that the steerability of our representations provides significant speedup (nearly 50x) for test-time augmentations; by applying a large number of augmentations for out-of-distribution detection, we significantly improve OOD AUC on the ImageNet-C dataset over an invariant representation.
How gradient estimator variance and bias impact learning in neural networks
Arna Ghosh
Yuhan Helena Liu
Konrad Paul Kording
There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chips. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates would necessarily use gradient estimators that have some variance and/or bias. Therefore, there is a need to understand better how variance and bias in gradient estimators impact learning dependent on network and task properties. Here, we show that variance and bias can impair learning on the training data, but some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator are dependent on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task properties.
Reliability of CKA as a Similarity Measure in Deep Learning
MohammadReza Davari
Stefan Horoi
Amine Natik
Comparing learned neural representations in neural networks is a challenging but important problem, which has been approached in different ways. The Centered Kernel Alignment (CKA) similarity metric, particularly its linear variant, has recently become a popular approach and has been widely used to compare representations of a network's different layers, of architecturally similar networks trained differently, or of models with different architectures trained on the same data. A wide variety of claims about similarity and dissimilarity of these various representations have been made using CKA results. In this work we present analysis that formally characterizes CKA sensitivity to a large class of simple transformations, which can naturally occur in the context of modern machine learning. This provides a concrete explanation to CKA sensitivity to outliers, which has been observed in past works, and to transformations that preserve the linear separability of the data, an important generalization attribute. We empirically investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counterintuitive results. Finally we study approaches for modifying representations to maintain functional behaviour while changing the CKA value. Our results illustrate that, in many cases, the CKA value can be easily manipulated without substantial changes to the functional behaviour of the models, and call for caution when leveraging activation alignment metrics.
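For readers unfamiliar with the metric, a minimal sketch of linear CKA follows. It uses the standard feature-space formula, the squared Frobenius norm of Y^T X divided by the product of the Frobenius norms of X^T X and Y^T Y, computed on column-centered matrices. The synthetic data, the function name `linear_cka`, and the size of the outlier shift are illustrative assumptions and do not reproduce the paper's experiments. The example shows the invariance of linear CKA to orthogonal transformations and isotropic scaling, and how moving a single data point far away can change the score sharply.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representations with one row per example."""
    # Center every feature (column) so the Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))            # synthetic "activations"

# A random orthogonal matrix obtained from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
cka_rotated = linear_cka(X, 3.0 * X @ Q)  # rotation plus isotropic scaling

# Same representation, except that one example is moved far away.
X_outlier = X.copy()
X_outlier[0] += 1000.0
cka_outlier = linear_cka(X, X_outlier)
```

Here `cka_rotated` stays at 1 up to floating-point error, while `cka_outlier` falls far below 1 even though 199 of the 200 rows are unchanged.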
« Que notre cerveau soit constitué de neurones n’est pas un accident »
Roman Ikonicoff
Formalizing locality for normative synaptic plasticity models
Colin Bredenberg
Ezekiel Williams
Cristina Savin
How Gradient Estimator Variance and Bias Could Impact Learning in Neural Circuits
Arna Ghosh
Yuhan Helena Liu
Konrad Körding
There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chips. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates would necessarily use gradient estimators that have some variance and/or bias. Therefore, there is a need to understand better how variance and bias in gradient estimators impact learning dependent on network and task properties. Here, we show that variance and bias can impair learning on the training data, but some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator are dependent on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task properties.