Simon Lacoste-Julien

Core Academic Member

Canada CIFAR AI Chair

Associate Scientific Director, Mila, Associate Professor, Université de Montréal, Department of Computer Science and Operations Research

Vice President and Lab Director, Samsung Advanced Institute of Technology (SAIT) AI Lab, Montréal

Website

Google Scholar

Biography

Simon Lacoste-Julien is an associate professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal, a co-founding member of Mila (Quebec Artificial Intelligence Institute) and a Canada CIFAR AI Chair. He also heads the SAIT AI Lab Montréal part-time.

His research focuses on machine learning and applied mathematics, with applications to computer vision and natural language processing. He earned a bachelor's degree in mathematics, physics and computer science from McGill University and a PhD in computer science from the University of California, Berkeley, then completed a postdoctoral fellowship at the University of Cambridge.

He spent a few years as a research professor at the National Institute for Research in Digital Science and Technology (INRIA) and the École normale supérieure in Paris before returning to Montréal in 2016 to answer Yoshua Bengio's call and help grow the city's AI ecosystem.

Current Students

Reza Babanezhad Harikandeh

Independent visiting researcher - Samsung SAIT

babanezr@mila.quebec

Aristide Baratin

Independent visiting researcher - Samsung SAIT

PhD - UdeM

vitoria.barin-pacela@mila.quebec

Postdoctorate - UdeM

Principal supervisor:

Gauthier Gidel

quentin.bertrand@mila.quebec

Website

Github

Google Scholar

Kiho Cho

Independent visiting researcher - Samsung SAIT

kiho.cho@mila.quebec

Marwa El Halabi

Independent visiting researcher - Samsung SAIT

marwa.el-halabi@mila.quebec

PhD - UdeM

PhD - UdeM

antonio-miguel.gois@mila.quebec

Website

Github

Google Scholar

Yash Goyal

Independent visiting researcher - Samsung SAIT

yash.goyal@mila.quebec

Website

Meraj Hashemizadeh

Research collaborator - UdeM

merajhse@mila.quebec

Github

Alexia Jolicoeur-Martineau

Independent visiting researcher - Samsung SAIT

PhD - UdeM

pedram.khorsandi@mila.quebec

Github

Google Scholar

Kwon Kisoo

Independent visiting researcher - Seoul National University, Korea

kwon.kisoo@mila.quebec

Boris Knyazev

Independent visiting researcher - UdeM

PhD - UdeM

Jaewoo Lee

Independent visiting researcher - Pohang University of Science and Technology, Pohang, Korea

jaewoo.lee@mila.quebec

Michelle Liu

Research collaborator

liumiche@mila.quebec

Lucas Maes

PhD - UdeM

lucas.maes@mila.quebec

Website

Github

Rozhin Nobahari

Research Master's - UdeM

rozhin.nobahari@mila.quebec

George Orfanides

PhD - McGill

Principal supervisor:

Adam M. Oberman

george.orfanides@mila.quebec

Juan Ramirez

PhD - UdeM

juan.ramirez@mila.quebec

PhD - UdeM

mansi.rankawat@mila.quebec

Independent visiting researcher - Samsung SAIT

Research Master's - UdeM

motahareh.sohrabi@mila.quebec

Website

Helen Zhang

PhD - UdeM

tianyue.zhang@mila.quebec

Website

Github

Yan Zhang

Independent visiting researcher - Samsung SAIT

yan.zhang@mila.quebec

Website

Github

Google Scholar

Blog Posts

Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation

March 18, 2024

by

Sébastien Lachapelle

Divyat Mahajan

Ioannis Mitliagkas

Simon Lacoste-Julien


Publications

Promoting Exploration in Memory-Augmented Adam using Critical Momenta

Pranshu Malviya

Goncalo Mordido

Aristide Baratin

Reza Babanezhad Harikandeh

Jerry Huang

Simon Lacoste-Julien

Razvan Pascanu

Sarath Chandar Anbil Parthipan

Adaptive gradient-based optimizers, particularly Adam, have left their mark in training large-scale deep learning models. The strength of such optimizers is that they exhibit fast convergence while being more robust to hyperparameter choice. However, they often generalize worse than non-adaptive methods. Recent studies have tied this performance gap to flat minima selection: adaptive methods tend to find solutions in sharper basins of the loss landscape, which in turn hurts generalization. To overcome this issue, we propose a new memory-augmented version of Adam that promotes exploration towards flatter minima by using a buffer of critical momentum terms during training. Intuitively, the use of the buffer makes the optimizer overshoot outside the basin of attraction if it is not wide enough. We empirically show that our method improves the performance of several variants of Adam on standard supervised language modelling and image classification tasks.

2024-06-09

TMLR (accepted)

doi.org

openreview.net

Weight-Sharing Regularization

Mehran Shakerinava

Motahareh Sohrabi

Siamak Ravanbakhsh

Simon Lacoste-Julien

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (published)

doi.org

arxiv.org

PopulAtion Parameter Averaging (PAPA)

Alexia Jolicoeur-Martineau

Emy Gervais

Kilian FATRAS

Yan Zhang

Simon Lacoste-Julien

2024-04-05

TMLR (accepted)

doi.org

openreview.net

On the Identifiability of Quantized Factors

Vitória Barin Pacela

Kartik Ahuja

Simon Lacoste-Julien

Pascal Vincent

Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

2024-03-15

Proceedings of the Third Conference on Causal Learning and Reasoning (published)

proceedings.mlr.press

arxiv.org

Balancing Act: Constraining Disparate Impact in Sparse Models

Meraj Hashemizadeh

Juan Ramirez

Rohan Sukumaran

Golnoosh Farnadi

Simon Lacoste-Julien

Jose Gallego-Posada

Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities. Although sparse models achieve performance comparable to that of their dense counterparts at the level of the entire dataset, they exhibit high accuracy drops for some data sub-groups. Existing methods to mitigate this disparate impact induced by pruning (i) rely on surrogate metrics that address the problem indirectly and have limited interpretability; or (ii) scale poorly with the number of protected sub-groups in terms of computational cost. We propose a constrained optimization approach that directly addresses the disparate impact of pruning: our formulation bounds the accuracy change between the dense and sparse models, for each sub-group. This choice of constraints provides an interpretable success criterion to determine if a pruned model achieves acceptable disparity levels. Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

Sébastien Lachapelle

Pau Rodriguez

Yash Sharma

Katie Everett

Rémi LE PRIOL

Alexandre Lacoste

Simon Lacoste-Julien

2024-01-10

ArXiv (preprint)

doi.org

arxiv.org

Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation

Sébastien Lachapelle

Divyat Mahajan

Ioannis Mitliagkas

Simon Lacoste-Julien

We tackle the problems of latent variables identification and "out-of-support" image generation in representation learning. We show that both are possible for a class of decoders that we call additive, which are reminiscent of decoders used for object-centric representation learning (OCRL) and well suited for images that can be decomposed as a sum of object-specific images. We provide conditions under which exactly solving the reconstruction problem using an additive decoder is guaranteed to identify the blocks of latent variables up to permutation and block-wise invertible transformations. This guarantee relies only on very weak assumptions about the distribution of the latent factors, which might present statistical dependencies and have an almost arbitrarily shaped support. Our result provides a new setting where nonlinear independent component analysis (ICA) is possible and adds to our theoretical understanding of OCRL methods. We also show theoretically that additive decoders can generate novel images by recombining observed factors of variations in novel ways, an ability we refer to as Cartesian-product extrapolation. We show empirically that additivity is crucial for both identifiability and extrapolation on simulated data.

openreview.net

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Boris Knyazev

DOHA HWANG

Simon Lacoste-Julien

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

doi.org

openreview.net

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

Jihye Kim

Aristide Baratin

Yan Zhang

Simon Lacoste-Julien

We approach the problem of improving robustness of deep learning algorithms in the presence of label noise. Building upon existing label correction and co-teaching methods, we propose a novel training procedure to mitigate the memorization of noisy labels, called CrossSplit, which uses a pair of neural networks trained on two disjoint parts of the labeled dataset. CrossSplit combines two main ingredients: (i) Cross-split label correction. The idea is that, since the model trained on one part of the data cannot memorize example-label pairs from the other part, the training labels presented to each network can be smoothly adjusted by using the predictions of its peer network; (ii) Cross-split semi-supervised training. A network trained on one part of the data also uses the unlabeled inputs of the other part. Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios. The project page is at https://rlawlgul.github.io/.

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

doi.org

openreview.net

Unlocking Slot Attention by Changing Optimal Transport Costs

Yan Zhang

David W Zhang

Simon Lacoste-Julien

Gertjan J. Burghouts

Cees G. M. Snoek

Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to handle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose **MESH** (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

doi.org

openreview.net

On the Identifiability of Quantized Factors

Vitória Barin Pacela

Kartik Ahuja

Simon Lacoste-Julien

Pascal Vincent

Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

2023-06-28

ArXiv (preprint)

arxiv.org

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection

Vitória Barin-Pacela

Kartik Ahuja

Simon Lacoste-Julien

Pascal Vincent

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net
