Pascal Vincent

Core Industry Member
Associate Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Scientist, Facebook AI Research (FAIR) Montréal
Research Topics
Deep Learning
Representation Learning

Biography

Pascal Vincent is a research scientist in the Fundamental AI Research (FAIR) team at Meta and an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal.

He is also a founding member of Mila – Quebec Artificial Intelligence Institute and an associate fellow in CIFAR’s Learning in Machines & Brains program.

Vincent’s research on principles and algorithms in representation learning led him to uncover several ideas that became key enablers for the successes of deep learning methods. Among his most influential contributions is the seminal paper on neural language models, “A Neural Probabilistic Language Model” (Bengio et al. 2003), which laid the foundations on which all artificial-neural-network-based language models are built.

His work on denoising autoencoders (Vincent et al. 2008, 2010) was the first to propose the pretext task of filling in artificially introduced blanks for the sake of learning useful representations in any modality, a precursor of what is today called self-supervised learning.

In another seminal paper, “A Connection Between Score Matching and Denoising Autoencoders” (Vincent 2011), he developed the “denoising score matching” principle, which is now routinely used to train diffusion-based generative models.
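
For readers unfamiliar with the principle, the denoising score matching objective can be written as below. The notation is an assumption of this sketch rather than a quotation from the paper: \psi(\tilde{x}; \theta) denotes the model's score function, x a clean sample, and \tilde{x} its corrupted version under Gaussian noise of variance \sigma^2.

```latex
J_{\mathrm{DSM}}(\theta)
  = \mathbb{E}_{q_\sigma(\tilde{x},\, x)}
    \left[ \tfrac{1}{2}
      \left\| \psi(\tilde{x}; \theta)
        - \nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x) \right\|^2
    \right],
\qquad
\nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x)
  = \frac{x - \tilde{x}}{\sigma^2}
\quad \text{for } q_\sigma(\tilde{x} \mid x) = \mathcal{N}(\tilde{x};\, x,\, \sigma^2 I).
```

In words, the model is trained so that its score at a noisy point matches the direction pointing from that point back toward the clean sample, scaled by the inverse noise variance.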

Vincent’s current research focuses on novel theory and algorithms for representation learning to enable robust generalization out-of-distribution.

Current Students

PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Independent visiting researcher

Publications

Steering Large Language Model Activations in Sparse Spaces
Reza Bayat
Ali Rahimi-Kalahroudi
Mohammad Pezeshki
A key challenge in AI alignment is guiding large language models (LLMs) to follow desired behaviors at test time. Activation steering, which modifies internal model activations during inference, offers a potential solution. However, prior work in dense activation spaces struggles with superposition, wherein multiple features become entangled, limiting interpretability and precise control. In contrast, sparse representations provide an untapped opportunity for more interpretable behavior modulation. In this work, we introduce sparse activation steering (SAS), a method that leverages sparse autoencoders (SAEs) to steer LLM behavior in sparse spaces. By isolating behavior-specific features through a contrastive prompt-pairing approach, we define a set of features that can selectively reinforce or suppress behaviors. Experiments on Gemma 2 LLMs show that SAS vectors enable nuanced behavioral modulation and finer-grained control. Furthermore, scaling SAEs improves monosemanticity of SAS vectors, suggesting more reliable and interpretable interventions.
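The contrastive idea in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`contrastive_steering_vector`, `steer`), the mean-difference rule, the top-k selection, and the clipping at zero are all assumptions made for illustration, and the arrays stand in for SAE feature activations that a real pipeline would extract from an LLM.

```python
import numpy as np

def contrastive_steering_vector(pos_feats, neg_feats, top_k):
    """Hypothetical helper: mean difference of sparse features between
    prompts that show a behavior (pos) and prompts that do not (neg),
    keeping only the top_k entries with the largest magnitude."""
    diff = pos_feats.mean(axis=0) - neg_feats.mean(axis=0)
    keep = np.argsort(np.abs(diff))[-top_k:]
    vec = np.zeros_like(diff)
    vec[keep] = diff[keep]
    return vec

def steer(feats, vec, strength):
    """Hypothetical helper: shift sparse features along vec.
    Clipping at zero assumes non-negative (ReLU-style) SAE features."""
    return np.maximum(feats + strength * vec, 0.0)

# Toy activations: rows are prompts, columns are sparse features.
pos = np.array([[0.0, 2.0, 0.0, 1.0],
                [0.0, 4.0, 0.0, 1.0]])
neg = np.array([[1.0, 0.0, 0.0, 1.0],
                [1.0, 0.0, 0.0, 1.0]])

vec = contrastive_steering_vector(pos, neg, top_k=1)
current = np.array([0.5, 0.0, 0.0, 0.2])
reinforced = steer(current, vec, strength=1.0)   # push toward the behavior
suppressed = steer(current, vec, strength=-1.0)  # push away from it
```

A positive `strength` reinforces the selected feature and a negative one suppresses it, mirroring the "reinforce or suppress" wording of the abstract; how the steered features are mapped back into the model is outside the scope of this sketch.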
MaestroMotif: Skill Design from Artificial Intelligence Feedback
Martin Klissarov
Mikael Henaff
Roberta Raileanu
Shagun Sodhani
Amy Zhang
Marlos C. Machado
Pierluca D'Oro
Describing skills in natural language has the potential to provide an accessible way to inject human knowledge about decision-making into an AI system. We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents. MaestroMotif leverages the capabilities of Large Language Models (LLMs) to effectively create and reuse skills. It first uses an LLM's feedback to automatically design rewards corresponding to each skill, starting from their natural language description. Then, it employs an LLM's code generation abilities, together with reinforcement learning, for training the skills and combining them to implement complex behaviors specified in language. We evaluate MaestroMotif using a suite of complex tasks in the NetHack Learning Environment (NLE), demonstrating that it surpasses existing approaches in both performance and usability.
The Pitfalls of Memorization: When Memorization Hurts Generalization
Reza Bayat
Mohammad Pezeshki
David Lopez-Paz
Neural networks often learn simple explanations that fit the majority of the data while memorizing exceptions that deviate from these explanations. This behavior leads to poor generalization when the learned explanations rely on spurious correlations. In this work, we formalize the interplay between memorization and generalization, showing that spurious correlations lead to particularly poor generalization when they are combined with memorization. Memorization can reduce training loss to zero, leaving no incentive to learn robust, generalizable patterns. To address this, we propose memorization-aware training (MAT), which uses held-out predictions as a signal of memorization to shift a model's logits. MAT encourages learning robust patterns invariant across distributions, improving generalization under distribution shifts.
Compositional Risk Minimization
Divyat Mahajan
Mohammad Pezeshki
Kartik Ahuja
The Pitfalls of Memorization: When Memorization Hinders Generalization
Reza Bayat
Mohammad Pezeshki
David Lopez-Paz
Neural networks often learn simple explanations that fit the majority of the data while memorizing exceptions that deviate from these explanations. This leads to poor generalization when the learned explanations are spurious. In this work, we formalize
Stochastic positional embeddings improve masked image modeling
Amir Bar
Florian Bordes
Assaf Shocher
Mahmoud Assran
Nicolas Ballas
Trevor Darrell
Amir Globerson
Yann LeCun
Masked Image Modeling (MIM) is a promising self-supervised learning approach that enables learning from unlabeled images. Despite its recent success, learning good representations through MIM remains challenging because it requires predicting the right semantic content in accurate locations. For example, given an incomplete picture of a dog, we can guess that there is a tail, but we cannot determine its exact location. In this work, we propose to incorporate location uncertainty into MIM by using stochastic positional embeddings (StoP). Specifically, we condition the model on stochastic masked token positions drawn from a Gaussian distribution. StoP reduces overfitting to location features and guides the model toward learning features that are more robust to location uncertainties. Quantitatively, StoP improves downstream MIM performance on a variety of downstream tasks, including
On the Identifiability of Quantized Factors
Vitória Barin Pacela
Kartik Ahuja
Disentanglement aims to recover meaningful latent ground-truth factors solely from the observed distribution, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.