Guillaume Lajoie

Biography

Guillaume Lajoie is an associate professor in the Department of Mathematics and Statistics (DMS) at Université de Montréal and a core academic member of Mila – Quebec Artificial Intelligence Institute. He holds a Canada CIFAR AI Chair (CCAI Canada) and a Canada Research Chair (CRC) in Neural Computation and Interfacing.

His research sits at the intersection of AI and neuroscience, where he develops tools to better understand the mechanisms of intelligence common to biological and artificial systems. His research group's contributions range from advances in multi-scale learning paradigms for large artificial systems to applications in neurotechnology. Dr. Lajoie is actively involved in responsible AI development efforts, seeking to identify guidelines and best practices for the use of AI in research and beyond.

Current Students

Federico Arangath Joseph

Research collaborator - ETH Zurich

Stefan Bauer

Independent visiting researcher

Principal supervisor:

Yoshua Bengio

Sangnie Bhardwaj

PhD - UdeM

Co-supervisor:

Hugo Larochelle

Colin Bredenberg

Postdoc - UdeM

Co-supervisor:

Leo Choiniere

PhD - UdeM

Olivier Codol

Postdoc - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

PhD - UdeM

Principal supervisor:

Leo Gagnon

PhD - UdeM

Skylar Gu

Research intern - McGill

Principal supervisor:

Dhanya Sridhar

Juan Guerra

Research master's - Polytechnique

Principal supervisor:

Nanda Harishankar Krishna

PhD - UdeM

Research collaborator - Western Washington University (faculty; assistant professor)

Principal supervisor:

PhD - UdeM

Co-supervisor:

Research master's - UdeM

Co-supervisor:

Dhanya Sridhar

tejaskasetty@gmail.com

Research intern - Concordia

Co-supervisor:

Matt Perich

Website

Ximeng Mao

PhD - UdeM

Co-supervisor:

Joelle Pineau

Abdel Mfougouon Njupoun

PhD - UdeM

Co-supervisor:

abdelnjupoun@gmail.com

PhD - UdeM

Co-supervisor:

Amine Natik

PhD - UdeM

Co-supervisor:

Guy Wolf

Alexandre Payeur

Research collaborator - UdeM

Mohammad Pezeshki

Research collaborator

Principal supervisor:

Alumni collaborator - McGill

Principal supervisor:

Julia Price

Research master's - UdeM

Avery Ryoo

PhD - UdeM

Principal supervisor:

PhD - UdeM

Co-supervisor:

Lune Bellec

Ayesha Vermani

Independent visiting researcher - Champalimaud Institute for the Unknown

Ryan Vogt

Postdoc - UdeM

Machine learning for segmenting the different nerve fiber activations from neural signals traveling from the brain to the body

Vivian White

Research intern - Western Washington University

Co-supervisor:

PhD - UdeM

Blog Posts

Graphical representation of a vagus nerve

May 21, 2025

by

Param Raval

Olivier Tessier-Larivière

Pascal Fortier-Poisson

Blake Richards

Guillaume Lajoie

Read the article

June 13, 2024

What do synaptic weight distributions tell us about learning in the brain?

by

Roman Pogodin

Jonathan Cornford

Arna Ghosh

Gauthier Gidel

Guillaume Lajoie

Blake Richards

Read the article

Publications

Transfer Entropy Bottleneck: Learning Sequence to Sequence Information Transfer

Damjan Kalajdzievski

Ximeng Mao

Pascal Fortier-Poisson

When presented with a data stream of two statistically dependent variables, predicting the future of one of the variables (the target stream) can benefit from information about both its history and the history of the other variable (the source stream). For example, fluctuations in temperature at a weather station can be predicted using both temperatures and barometric readings. However, a challenge when modelling such data is that it is easy for a neural network to rely on the greatest joint correlations within the target stream, which may ignore a crucial but small information transfer from the source to the target stream. As well, there are often situations where the target stream may have previously been modelled independently and it would be useful to use that model to inform a new joint model. Here, we develop an information bottleneck approach for conditional learning on two dependent streams of data. Our method, which we call Transfer Entropy Bottleneck (TEB), allows one to learn a model that bottlenecks the directed information transferred from the source variable to the target variable, while quantifying this information transfer within the model. As such, TEB provides a useful new information bottleneck approach for modelling two statistically dependent streams of data in order to make predictions about one of them.

2023-03-08

TMLR (accepted)

Use of Invasive Brain-Computer Interfaces in Pediatric Neurosurgery: Technical and Ethical Considerations

David Bergeron

Christian Iorio-Morin

Marco Bonizzato

Nathalie Orr Gaucher

Éric Racine

Alexander G. Weil

2023-03-01

Journal of Child Neurology (published)

Steerable Equivariant Representation Learning

Sangnie Bhardwaj

Willie McClinton

Tongzhou Wang

Chen Sun

Phillip Isola

Dilip Krishnan

Pre-trained deep image representations are useful for post-training tasks such as classification through transfer learning, image retrieval, and object detection. Data augmentations are a crucial aspect of pre-training robust representations in both supervised and self-supervised settings. Data augmentations explicitly or implicitly promote invariance in the embedding space to the input image transformations. This invariance reduces generalization to those downstream tasks which rely on sensitivity to these particular data augmentations. In this paper, we propose a method of learning representations that are instead equivariant to data augmentations. We achieve this equivariance through the use of steerable representations. Our representations can be manipulated directly in embedding space via learned linear maps. We demonstrate that our resulting steerable and equivariant representations lead to better performance on transfer learning and robustness: e.g. we improve linear probe top-1 accuracy by between 1% and 3% for transfer; and ImageNet-C accuracy by up to 3.4%. We further show that the steerability of our representations provides significant speedup (nearly 50x) for test-time augmentations; by applying a large number of augmentations for out-of-distribution detection, we significantly improve OOD AUC on the ImageNet-C dataset over an invariant representation.

2023-02-22

ArXiv (preprint)

Sources of richness and ineffability for phenomenally conscious states

Xu Ji

Eric Elmoznino

George Deane

Axel Constant

Guillaume Dumas

Jonathan Simon

Yoshua Bengio

Conscious states (states that there is something it is like to be in) seem both rich or full of detail and ineffable or hard to fully describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theoretic dynamical systems perspective on the richness and ineffability of consciousness. In our framework, the richness of conscious experience corresponds to the amount of information in a conscious state and ineffability corresponds to the amount of information lost at different stages of processing. We describe how attractor dynamics in working memory would induce impoverished recollections of our original experiences, how the discrete symbolic nature of language is insufficient for describing the rich and high-dimensional structure of experiences, and how similarity in the cognitive function of two individuals relates to improved communicability of their experiences to each other. While our model may not settle all questions relating to the explanatory gap, it makes progress toward a fully physicalist explanation of the richness and ineffability of conscious experience, two important aspects that seem to be part of what makes qualitative character so puzzling.

2023-02-13

ArXiv (preprint)

arxiv.org

How gradient estimator variance and bias impact learning in neural networks

Arna Ghosh

Yuhan Helena Liu

Konrad Paul Kording

There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chips. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates would necessarily use gradient estimators that have some variance and/or bias. Therefore, there is a need to understand better how variance and bias in gradient estimators impact learning dependent on network and task properties. Here, we show that variance and bias can impair learning on the training data, but some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator are dependent on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task properties.

2023-02-01

ICLR.cc/2023/Conference (poster)

Reliability of CKA as a Similarity Measure in Deep Learning

MohammadReza Davari

Comparing learned neural representations in neural networks is a challenging but important problem, which has been approached in different ways. The Centered Kernel Alignment (CKA) similarity metric, particularly its linear variant, has recently become a popular approach and has been widely used to compare representations of a network's different layers, of architecturally similar networks trained differently, or of models with different architectures trained on the same data. A wide variety of claims about similarity and dissimilarity of these various representations have been made using CKA results. In this work we present analysis that formally characterizes CKA sensitivity to a large class of simple transformations, which can naturally occur in the context of modern machine learning. This provides a concrete explanation to CKA sensitivity to outliers, which has been observed in past works, and to transformations that preserve the linear separability of the data, an important generalization attribute. We empirically investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counterintuitive results. Finally we study approaches for modifying representations to maintain functional behaviour while changing the CKA value. Our results illustrate that, in many cases, the CKA value can be easily manipulated without substantial changes to the functional behaviour of the models, and call for caution when leveraging activation alignment metrics.

2023-02-01

ICLR.cc/2023/Conference (poster)

« Que notre cerveau soit constitué de neurones n’est pas un accident »

Roman Ikonicoff

2023-01-02

Pour la science (published)

Formalizing locality for normative synaptic plasticity models

Cristina Savin

HOW GRADIENT ESTIMATOR VARIANCE AND BIAS COULD IMPACT LEARNING IN NEURAL CIRCUITS

Arna Ghosh

Yuhan Helena Liu

Konrad Körding

2023-01-01

(published)

www.semanticscholar.org

LEAD: Min-Max Optimization from a Physical Perspective

Reyhane Askari Hemmat

Amartya Mitra

Ioannis Mitliagkas

Adversarial formulations have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence. In this paper, we show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve optimization dynamics. Inspired by the physical framework, we propose LEAD, an optimizer for min-max games. Next, using Lyapunov stability theory from dynamical systems as well as spectral analysis, we study LEAD’s convergence properties in continuous and discrete time settings for a class of quadratic min-max games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training.

2023-01-01

Trans. Mach. Learn. Res. (published)

NEURAL MANIFOLDS AND GRADIENT-BASED ADAPTATION IN NEURAL-INTERFACE TASKS

Alexandre Payeur

Amy L. Orsborn