Portrait of Yoshua Bengio

Yoshua Bengio

Core Academic Member

Canada CIFAR AI Chair

Full Professor, Université de Montréal, Department of Computer Science and Operations Research

Founder and Scientific Advisor, Leadership Team

Research Topics

Medical Machine Learning

Representation Learning

Reinforcement Learning

Deep Learning

Causality

Generative Models

Probabilistic Models

Molecular Modeling

Computational Neuroscience

Reasoning

Graph Neural Networks

Recurrent Neural Networks

Machine Learning Theory

Natural Language Processing

Biography

For media requests, please write to medias@mila.quebec.

For more information, please contact Cassidy MacNeil, Senior Assistant and Operations Lead, at cassidy.macneil@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A. M. Turing Award, the "Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a Full Professor at Université de Montréal, the Founder and Scientific Advisor of Mila – Quebec AI Institute, and, as a Senior Fellow, co-directs the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as Special Advisor and Founding Scientific Director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for a Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Collaborating Alumni - McGill

Collaborating researcher - Cambridge University

Principal supervisor:

PhD - UdeM

Independent visiting researcher

Co-supervisor:

Guillaume Lajoie

Shahana Chatterjee

Collaborating researcher - N/A

Principal supervisor:

PhD - UdeM

Collaborating researcher - KAIST

Aniket Didolkar

PhD - UdeM

Abdessamad EL KABID

Collaborating Alumni - UdeM

Co-supervisor:

Loubna Benabbou

Desmond Elliott

Independent visiting researcher

Principal supervisor:

PhD - UdeM

Co-supervisor:

Guillaume Lajoie

PhD - UdeM

Jean-Pierre Falet

PhD - UdeM

PhD

PhD - UdeM

PhD - UdeM

Thomas Jiralerspong

PhD - UdeM

Principal supervisor:

Guillaume Lajoie

Younesse Kaddar

Collaborating Alumni - UdeM

Postdoctorate - UdeM

Principal supervisor:

Alex Hernández-García

Tabitha Edith Lee

Postdoctorate - UdeM

Principal supervisor:

Collaborating Alumni

Collaborating Alumni - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Guillaume Lajoie

Independent visiting researcher - UdeM

PhD - UdeM

Principal supervisor:

Collaborating researcher - Ying Wu Coll of Computing

Collaborating researcher - University of Waterloo

Principal supervisor:

Collaborating Alumni - Max-Planck-Institute for Intelligent Systems

Collaborating researcher - UdeM

Co-supervisor:

Loubna Benabbou

Jarrid Rector-Brooks

PhD - UdeM

Postdoctorate - UdeM

Postdoctorate - UdeM

Camille Rochefort-Boulanger

PhD - UdeM

Principal supervisor:

Dragos Secrieru

Collaborating Alumni - UdeM

Postdoctorate

Co-supervisor:

Alex Hernández-García

Collaborating Alumni - Polytechnique

Co-supervisor:

Pierre-Luc Bacon

Mélisande Astrid Crystal Teng

PhD - UdeM

Co-supervisor:

Hugo Larochelle

Collaborating researcher

Principal supervisor:

Collaborating Alumni - UdeM

Collaborating Alumni - UdeM

Co-supervisor:

Siddarth Venkatraman

PhD - UdeM

Principal supervisor:

Collaborating researcher

Collaborating researcher - UdeM

PhD - UdeM

PhD - McGill

Principal supervisor:

Mathieu Blanchette

PhD - UdeM

Principal supervisor:

Aaron Courville

Collaborating Alumni - McGill

Principal supervisor:

Blog Posts

February 22, 2024

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

April 4, 2023

Scaling in the service of reasoning & model-based ML

by

March 23, 2022

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

by

Jake P. Taylor-King

March 15, 2022

Generative Flow Networks

by

Publications

A Comparative Study of Molecular Dynamics Approaches for Simulating Ionic Conductivity in Solid Lithium Electrolytes

Dounia Shaaban Kabakibo

Félix Therrien

Hongyu Guo

Homin Shin

Alex Hernández-García

Accurate prediction of ionic conductivity is critical for the design of high-performance solid-state electrolytes in next-generation batteries. We benchmark molecular dynamics (MD) approaches for computing ionic conductivity in 21 lithium solid electrolytes for which experimental ionic conductivity has been previously reported in the literature. Specifically, we compare simulations driven by density functional theory (DFT) and by universal machine-learning interatomic potentials (uMLIPs), namely a MACE foundation model. Our results suggest comparable performance between DFT and MACE, with MACE requiring only a fraction of the computational cost. The framework developed here is designed to enable systematic comparisons with additional uMLIPs and fine-tuned models in future work.

2026-03-01

AI4Mat @ International Conference on Learning Representations (poster)

Navigating ternary doping in Li-ion cathodes with closed-loop multi-objective Bayesian optimization

Nooshin Zeinali Galabi

Cheng-Hao Liu

Marc Kamel

Shipeng Jia

Eric McCalla

To further improve secondary battery materials, we are increasingly exploring highly complex composition spaces in attempts to optimize multiple properties simultaneously. While our past work has done this in systematic manners using high-throughput experimentation, the exponential increase in the search space with triple doping makes grid search prohibitively expensive. Here, we demonstrate a closed-loop, multi-objective machine learning approach to guide the high-throughput workflow to efficiently navigate a space with approximately 14 million unique combinations. The test system is LiCoPO4 which we have previously explored using systematic codoping that was effective in optimizing one property only: energy density. To learn multiple electrochemical metrics, we first pretrain a set transformer on the public Materials Project database as a feature extractor, then attach a multi-task Gaussian process head and finetune the entire model on our high-throughput data. Through 3 rounds of active learning, we demonstrate that with a very small number of samples (as few as 125 random compositions and 63 predicted) we are able to simultaneously optimize four key electrochemical properties. Relative to the undoped system, the best composition raises our composite figure of merit by up to five times. This establishes an end-to-end workflow for accelerated battery materials design to be used in the rapidly growing field of autonomous materials discovery.

2026-02-11

Advances in Materials (published)

Synthesizable Molecular Generation via Soft-constrained GFlowNets with Rich Chemical Priors

D. Biton

Louis Vaillancourt

Yves V. Brun

Alex Hernández-García

2026-02-03

arXiv (preprint)

Divergent creativity in humans and large language models

Antoine Bellemare-Pepin

François Lespinasse

Philipp Thölke

Yann Harel

Kory Mathewson

Jay A. Olson

Affiliations: Psychology Department, U. Montréal, Montreal, QC, Canada; Music Department, C. University; Sociology; Anthropology Department; Mila; Department of Psychology, University of Toronto Mississauga, Mississauga, ON; Department of Computer Science; Operations Research; Unique Center

The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs' semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. These divergence-based measures index associative thinking (the ability to access and combine remote concepts in semantic space), an established facet of creative cognition. We benchmark performance on the Divergent Association Task (DAT) and across multiple creative-writing tasks (haiku, story synopses, and flash fiction), using identical, objective scoring. We found evidence that LLMs can surpass average human performance on the DAT, and approach human creative writing abilities, yet they remain below the mean creativity scores observed among the more creative segment of human participants. Notably, even the top performing LLMs are still largely surpassed by the aggregated top half of human participants, underscoring a ceiling that current LLMs still fail to surpass. We also systematically varied linguistic strategy prompts and temperature, observing reliable gains in semantic divergence for several models. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labor by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures. While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.

2026-01-20

Scientific Reports (published)

Discrete Feynman-Kac Correctors

Viktor Ohanesian

Artem Gazizov

Alán Aspuru-Guzik

Roberto Bondesan

Kirill Neklyudov

Discrete diffusion models have recently emerged as a promising alternative to the autoregressive approach for generating discrete sequences. Sample generation via gradual denoising or demasking processes allows them to capture hierarchical non-sequential interdependencies in the data. These custom processes, however, do not assume a flexible control over the distribution of generated samples. We propose Discrete Feynman-Kac Correctors, a framework that allows for controlling the generated distribution of discrete masked diffusion models at inference time. We derive Sequential Monte Carlo (SMC) algorithms that, given a trained discrete diffusion model, control the temperature of the sampled distribution (i.e. perform annealing), sample from the product of marginals of several diffusion processes (e.g. differently conditioned processes), and sample from the product of the marginal with an external reward function, producing likely samples from the target distribution that also have high reward. Notably, our framework does not require any training of additional models or fine-tuning of the original model. We illustrate the utility of our framework in several applications including: efficient sampling from the annealed Boltzmann distribution of the Ising model, improving the performance of language models for code generation and amortized learning, as well as reward-tilted protein sequence generation.

2026-01-14

arXiv (preprint)

In-Context Reinforcement Learning through Bayesian Fusion of Context and Value Prior

Anaïs Berkes

In-context reinforcement learning (ICRL) promises fast adaptation to unseen environments without parameter updates, but current methods either cannot improve beyond the training distribution or require near-optimal data, limiting practical adoption. We introduce SPICE, a Bayesian ICRL method that learns a prior over Q-values via deep ensemble and updates this prior at test-time using in-context information through Bayesian updates. To recover from poor priors resulting from training on sub-optimal data, our online inference follows an Upper-Confidence Bound rule that favours exploration and adaptation. We prove that SPICE achieves regret-optimal behaviour in both stochastic bandits and finite-horizon MDPs, even when pretrained only on suboptimal trajectories. We validate these findings empirically across bandit and control benchmarks. SPICE achieves near-optimal decisions on unseen tasks, substantially reduces regret compared to prior ICRL and meta-RL approaches while rapidly adapting to unseen tasks and remaining robust under distribution shift.

2025-12-31

arXiv (preprint)

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Johan Obando-Ceron

Brian Bartoldson

Bhavya Kailkhura

Pablo Samuel Castro

Siddarth Venkatraman

Aaron Courville

The reasoning performance of large language models (LLMs) can be substantially improved by training them with reinforcement learning (RL). The RL objective for LLM training involves a regularization term, which is the reverse Kullback-Leibler (KL) divergence between the trained policy and the reference policy. Since computing the KL divergence exactly is intractable, various estimators are used in practice to estimate it from on-policy samples. Despite its wide adoption, including in several open-source libraries, there is no systematic study analyzing the numerous ways of incorporating KL estimators in the objective and their effect on the downstream performance of RL-trained models. Recent works show that prevailing practices for incorporating KL regularization do not provide correct gradients for stated objectives, creating a discrepancy between the objective and its implementation. In this paper, we further analyze these practices and study the gradients of several estimator configurations, revealing how design choices shape gradient bias. We substantiate these findings with empirical observations by RL fine-tuning Qwen2.5-7B, Llama-3.1-8B-Instruct and Qwen3-4B-Instruct-2507 with different configurations and evaluating their performance on both in- and out-of-distribution tasks. Through our analysis, we observe that, in on-policy settings: (1) estimator configurations with biased gradients can result in training instabilities; and (2) using estimator configurations resulting in unbiased gradients leads to better performance on in-domain as well as out-of-domain tasks. We also investigate the performance resulting from different KL configurations in off-policy settings and observe that KL regularization can help stabilize off-policy RL training resulting from asynchronous setups.

2025-12-25

arXiv (preprint)

Hidden sampling biases inflate performance in gene regulatory network inference

Florin Ratajczak

Eva Hoermanseder

Jason Hartford

Pascal Falter-Braun

Matthias Heinig

Antonio Scialdone

Accurate reconstruction of gene regulatory networks (GRNs) from single-cell transcriptomic data remains a major methodological challenge. Recent machine learning approaches, particularly graph neural networks and graph autoencoders, have reported improved performance, yet these gains do not consistently translate to realistic biological settings. Here, we show that a key reason for that is the way negative regulatory interactions are sampled for supervised training and evaluation. We find that widely used sampling strategies introduce node-degree biases that allow models to exploit trivial graph-structural cues rather than biological signals. Across multiple benchmarks, simple degree-based heuristics match or exceed state-of-the-art graph neural network models under these biased evaluation protocols. We further introduce a degree-aware sampling approach that eliminates these artifacts and provides more reliable assessments of GRN inference methods. Our results call for standardized, bias-aware benchmarking practices to ensure meaningful progress in supervised GRN inference from single-cell RNA-seq data.

2025-12-22

bioRxiv (preprint)

A Message from AI Research Leaders: Join Us in Supporting OpenReview

Andrew Y. Ng

Ruslan Salakhutdinov

Fernando Pereira

2025-12-17

OpenReview (unknown)

International AI Safety Report Second Key Update: Technical Safeguards and Risk Management

Stephen Clare

Carina Prunkl

Maksym Andriushchenko

Ben Bucknall

Philip Fox

Nestor Maslej

Conor McGlynn

Malcolm Murray

Shalaleh Rismani

Stephen Casper

Jessica Newman

Daniel Privitera

Sören Mindermann

Daron Acemoglu

Thomas G. Dietterich

Fredrik Heintz

Geoffrey Hinton

Nick Jennings

Susan Leavy

Teresa Ludermir

Vidushi Marda

Helen Margetts

John McDermid

Jane Munga

Arvind Narayanan

Alondra Nelson

Clara Neppel

Sarvapali D. (Gopal) Ramchurn

Stuart Russell

Marietje Schaake

Bernhard Schölkopf

Alvaro Soto

Lee Tiedrich

Andrew Yao

Ya-Qin Zhang

This is the Second Key Update to the 2025 International AI Safety Report. The First Key Update (1) discussed developments in the capabilities of general-purpose AI models and systems and associated risks. This Key Update covers how various actors, including researchers, companies, and governments, are approaching risk management and technical mitigations for AI. The past year has seen important developments in AI risk management, including better techniques for training safer models and monitoring their outputs. While this represents tangible progress, significant gaps remain. It is often uncertain how effective current measures are at preventing harms, and effectiveness varies across time and applications. There are many opportunities to further strengthen existing safeguard techniques and to develop new ones. This Key Update provides a concise overview of critical developments in risk management practices and technical risk mitigation since the publication of the 2025 AI Safety Report in January. It highlights where progress is being made and where gaps remain. Above all, it aims to support policymakers, researchers, and the public in navigating a rapidly changing environment, helping them to make informed and timely decisions about the governance of general-purpose AI. Professor Yoshua Bengio, Université de Montréal / LawZero / Mila – Quebec AI Institute & Chair

2025-12-06

SuperIntelligence - Robotics - Safety & Alignment (published)

Adsorption energies are necessary but not sufficient to identify good catalysts

Shahana Chatterjee

Alexander Davis

Alexandre AGM Duval

Oleksandr Voznyy

Alex Hernández-García

Félix Therrien

2025-12-04

arXiv (preprint)

FALCON: Few-step Accurate Likelihoods for Continuous Flows

Tara Akhound-Sadegh

Artem Gazizov

2025-11-30

arXiv (published)