Simon Lacoste-Julien

aristidebaratin@hotmail.com

Biography

Simon Lacoste-Julien is an associate professor at Mila – Quebec Artificial Intelligence Institute and in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Canada CIFAR AI Chair and heads (part-time) the SAIT AI Lab Montréal.

Lacoste-Julien’s research interests are machine learning and applied mathematics, along with their applications to computer vision and natural language processing. He completed a BSc in mathematics, physics and computer science at McGill University, a PhD in computer science at UC Berkeley and a postdoc at the University of Cambridge.

After spending several years as a researcher at INRIA and the École normale supérieure in Paris, he returned to his home city of Montréal in 2016 to answer Yoshua Bengio’s call to help grow the Montréal AI ecosystem.

Current Students

Reza Babanezhad Harikandeh

Independent visiting researcher - Samsung SAIT

Aristide Baratin

Independent visiting researcher - Samsung SAIT

PhD - Université de Montréal

Collaborating Alumni - Université de Montréal

Principal supervisor:

Independent visiting researcher - Samsung

Marwa El Halabi

Independent visiting researcher - Samsung SAIT

Collaborating Alumni - Université de Montréal

PhD - Université de Montréal

Yash Goyal

Independent visiting researcher - Samsung SAIT

yashgoyal.yg1@gmail.com

Alexia Jolicoeur-Martineau

Independent visiting researcher - Samsung SAIT

PhD - Université de Montréal

Boris Knyazev

Independent visiting researcher - Université de Montréal

Independent visiting researcher - Université de Montréal

Lucas Maes

PhD - Université de Montréal

Master's Research - Université de Montréal

Juan Ramirez

PhD - Université de Montréal

PhD - Université de Montréal

Theo Saulus

PhD - Université de Montréal

Co-supervisor:

Dhanya Sridhar

Damien Scieur

Independent visiting researcher - Samsung SAIT

Motahareh Sohrabi

Collaborating Alumni - Université de Montréal

Helen Zhang

PhD - Université de Montréal

Yan Zhang

Independent visiting researcher - Samsung SAIT

Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation

Blog Posts

March 18, 2024

Sébastien Lachapelle

Divyat Mahajan

Ioannis Mitliagkas

Simon Lacoste-Julien


Publications

PopulAtion Parameter Averaging (PAPA)

Alexia Jolicoeur-Martineau

Emy Gervais

Kilian Fatras

Yan Zhang

2024-04-05

TMLR (accepted)

On the Identifiability of Quantized Factors

Vitória Barin Pacela

Kartik Ahuja

Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

2024-03-15

Proceedings of the Third Conference on Causal Learning and Reasoning (published)

proceedings.mlr.press

Balancing Act: Constraining Disparate Impact in Sparse Models

Meraj Hashemizadeh

Juan Ramirez

Rohan Sukumaran

Golnoosh Farnadi

Jose Gallego-Posada

Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities. Although sparse models achieve performance comparable to that of their dense counterparts at the level of the entire dataset, they exhibit high accuracy drops for some data sub-groups. Existing methods to mitigate this disparate impact induced by pruning (i) rely on surrogate metrics that address the problem indirectly and have limited interpretability; or (ii) scale poorly with the number of protected sub-groups in terms of computational cost. We propose a constrained optimization approach that directly addresses the disparate impact of pruning: our formulation bounds the accuracy change between the dense and sparse models, for each sub-group. This choice of constraints provides an interpretable success criterion to determine if a pruned model achieves acceptable disparity levels. Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups.

2024-01-16

ICLR.cc/2024/Conference (poster)

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

Sébastien Lachapelle

Pau Rodriguez

Yash Sharma

Katie Everett

Rémi Le Priol

Alexandre Lacoste

2024-01-10

ArXiv (preprint)

Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation

Sébastien Lachapelle

Divyat Mahajan

Ioannis Mitliagkas

We tackle the problems of latent variables identification and "out-of-support" image generation in representation learning. We show that both are possible for a class of decoders that we call additive, which are reminiscent of decoders used for object-centric representation learning (OCRL) and well suited for images that can be decomposed as a sum of object-specific images. We provide conditions under which exactly solving the reconstruction problem using an additive decoder is guaranteed to identify the blocks of latent variables up to permutation and block-wise invertible transformations. This guarantee relies only on very weak assumptions about the distribution of the latent factors, which might present statistical dependencies and have an almost arbitrarily shaped support. Our result provides a new setting where nonlinear independent component analysis (ICA) is possible and adds to our theoretical understanding of OCRL methods. We also show theoretically that additive decoders can generate novel images by recombining observed factors of variations in novel ways, an ability we refer to as Cartesian-product extrapolation. We show empirically that additivity is crucial for both identifiability and extrapolation on simulated data.

2023-07-05

ArXiv (preprint)

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Boris Knyazev

Doha Hwang

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

Jihye Kim

Aristide Baratin

Yan Zhang

We approach the problem of improving robustness of deep learning algorithms in the presence of label noise. Building upon existing label correction and co-teaching methods, we propose a novel training procedure to mitigate the memorization of noisy labels, called CrossSplit, which uses a pair of neural networks trained on two disjoint parts of the labeled dataset. CrossSplit combines two main ingredients: (i) Cross-split label correction. The idea is that, since the model trained on one part of the data cannot memorize example-label pairs from the other part, the training labels presented to each network can be smoothly adjusted by using the predictions of its peer network; (ii) Cross-split semi-supervised training. A network trained on one part of the data also uses the unlabeled inputs of the other part. Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios. The project page is at https://rlawlgul.github.io/.

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

Unlocking Slot Attention by Changing Optimal Transport Costs

Yan Zhang

David W Zhang

Gertjan J. Burghouts

Cees G. M. Snoek

Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to handle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose MESH (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

On the Identifiability of Quantized Factors

Vitória Barin Pacela

Kartik Ahuja

Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

2023-06-28

ArXiv (preprint)

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection

Vitória Barin-Pacela

Kartik Ahuja

Disentanglement aims to recover meaningful latent ground-truth factors from only the observed distribution. Identifiability provides the theoretical grounding for disentanglement to be well-founded. Unfortunately, unsupervised identifiability of independent latent factors is a theoretically proven impossibility in the i.i.d. setting under a general nonlinear smooth map from factors to observations. In this work, we show that, remarkably, it is possible to recover discretized latent coordinates under a highly generic nonlinear smooth mapping (a diffeomorphism) without any additional inductive bias on the mapping. This holds assuming that the latent density has axis-aligned discontinuity landmarks, but without making the unrealistic assumption of statistical independence of the factors. We introduce this novel form of identifiability, termed quantized coordinate identifiability, and provide a comprehensive proof of the recovery of discretized coordinates.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)
