Yoshua Bengio

Biographie

*Pour toute demande média, veuillez écrire à medias@mila.quebec.

Pour plus d’information, contactez Cassidy MacNeil, adjointe principale et responsable des opérations cassidy.macneil@mila.quebec.

Reconnu comme une sommité mondiale en intelligence artificielle, Yoshua Bengio s’est surtout distingué par son rôle de pionnier en apprentissage profond, ce qui lui a valu le prix A. M. Turing 2018, le « prix Nobel de l’informatique », avec Geoffrey Hinton et Yann LeCun. Il est professeur titulaire à l’Université de Montréal, fondateur et conseiller scientifique de Mila – Institut québécois d’intelligence artificielle, et codirige en tant que senior fellow le programme Apprentissage automatique, apprentissage biologique de l'Institut canadien de recherches avancées (CIFAR). Il occupe également la fonction de conseiller spécial et directeur scientifique fondateur d’IVADO.

En 2018, il a été l’informaticien qui a recueilli le plus grand nombre de nouvelles citations au monde. En 2019, il s’est vu décerner le prestigieux prix Killam. Depuis 2022, il détient le plus grand facteur d’impact (h-index) en informatique à l’échelle mondiale. Il est fellow de la Royal Society de Londres et de la Société royale du Canada, et officier de l’Ordre du Canada.

Soucieux des répercussions sociales de l’IA et de l’objectif que l’IA bénéficie à tous, il a contribué activement à la Déclaration de Montréal pour un développement responsable de l’intelligence artificielle.

Étudiants actuels

Jamal Abou Haibeh

Collaborateur·rice alumni - McGill

Berkes Anaïs

Collaborateur·rice de recherche - Cambridge University

Superviseur⋅e principal⋅e :

Rim Assouel

Doctorat - UdeM

Shahana Chatterjee

Collaborateur·rice de recherche - N/A

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Collaborateur·rice de recherche - KAIST

Doctorat - UdeM

Visiteur de recherche indépendant

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Doctorat - UdeM

Doctorat

Doctorat - UdeM

Moksh Jain

Doctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni - UdeM

Hyeonah Kim

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Alex Hernández-García

Minsu Kim

Collaborateur·rice de recherche - UdeM

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni

Song LIU

Collaborateur·rice de recherche - s.o.

Cristian Dragos Manta

Doctorat - UdeM

Co-superviseur⋅e :

Dhanya Sridhar

Sarthak Mittal

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Visiteur de recherche indépendant - UdeM

Padideh Nouri

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Ali Parviz

Collaborateur·rice de recherche - Ying Wu Coll of Computing

Camille Rochefort-Boulanger

Lena Podina

Collaborateur·rice de recherche - University of Waterloo

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Postdoctorat - UdeM

Postdoctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Julie Hussin

Divya Sharma

Postdoctorat

Co-superviseur⋅e :

Alex Hernández-García

Mélisande Astrid Crystal Teng

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Hugo Larochelle

Ivan Titov

Collaborateur·rice de recherche

Superviseur⋅e principal⋅e :

Siva Reddy

Alex Tong

Collaborateur·rice alumni - UdeM

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice de recherche - UdeM

Collaborateur·rice de recherche

Collaborateur·rice de recherche - UdeM

Skipper : combiner l’abstraction spatiale et temporelle afin d’améliorer la généralisation

Nicole Zhang

Doctorat - McGill

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Harry Zhao

Collaborateur·rice alumni - McGill

Superviseur⋅e principal⋅e :

Billets de blogue

Generic thumbnail for Mila Blog articles.

22 février 2024

par

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Mise à l’échelle au service du raisonnement et de l’apprentissage automatique basé sur un modèle

Scaling in the service of reasoning & model-based ML

4 avril 2023

par

Yoshua Bengio

Edward J. Hu

Une collaboration entre Mila et Relation Therapeutics pour découvrir in vitro de nouvelles associations médicamenteuses synergiques

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

23 mars 2022

par

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Les réseaux de flot génératifs

15 mars 2022

par

Yoshua Bengio

Publications

Discrete Feynman-Kac Correctors

Mohsin Hasan

Viktor Ohanesian

Artem Gazizov

Alán Aspuru-Guzik

Roberto Bondesan

Marta Skreta

Kirill Neklyudov

Discrete diffusion models have recently emerged as a promising alternative to the autoregressive approach for generating discrete sequences.… (voir plus) Sample generation via gradual denoising or demasking processes allows them to capture hierarchical non-sequential interdependencies in the data. These custom processes, however, do not assume a flexible control over the distribution of generated samples. We propose Discrete Feynman-Kac Correctors, a framework that allows for controlling the generated distribution of discrete masked diffusion models at inference time. We derive Sequential Monte Carlo (SMC) algorithms that, given a trained discrete diffusion model, control the temperature of the sampled distribution (i.e. perform annealing), sample from the product of marginals of several diffusion processes (e.g. differently conditioned processes), and sample from the product of the marginal with an external reward function, producing likely samples from the target distribution that also have high reward. Notably, our framework does not require any training of additional models or fine-tuning of the original model. We illustrate the utility of our framework in several applications including: efficient sampling from the annealed Boltzmann distribution of the Ising model, improving the performance of language models for code generation and amortized learning, as well as reward-tilted protein sequence generation.

2026-01-14

arXiv (prépublication)

AI Epistemic Risks: Emerging Mechanisms &amp; Evidence

Mick Yang

Stephen Casper

Jonathan Stray

Jasmine Li

Cameron Jones

Anna Gausen

Natasha Jacques

Brian Christian

Bálint Gyevnár

Hannah Rose Kirk

ZHONGHAO HE

Dan Zhao (285025)

Siao Si Looi

J. Levy

Kobi Hackenburg

Elizabeth Seger

Matt Kowal

Michelle Malonza

Luke Hewitt

Hause Lin … (voir 10 de plus)

Maarten Sap

Dylan Hadfield-Menell

Thomas Costello

Reihaneh Rabbany

Jean-François Godbout

David Rand

Atoosa Kasirzadeh

Gordon Pennycook

Kellin Pelrine

2025-12-31

SSRN Electronic Journal (accepté)

In-Context Reinforcement Learning through Bayesian Fusion of Context and Value Prior

Anaïs Berkes

In-context reinforcement learning (ICRL) promises fast adaptation to unseen environments without parameter updates, but current methods eith… (voir plus)er cannot improve beyond the training distribution or require near-optimal data, limiting practical adoption. We introduce SPICE, a Bayesian ICRL method that learns a prior over Q-values via deep ensemble and updates this prior at test-time using in-context information through Bayesian updates. To recover from poor priors resulting from training on sub-optimal data, our online inference follows an Upper-Confidence Bound rule that favours exploration and adaptation. We prove that SPICE achieves regret-optimal behaviour in both stochastic bandits and finite-horizon MDPs, even when pretrained only on suboptimal trajectories. We validate these findings empirically across bandit and control benchmarks. SPICE achieves near-optimal decisions on unseen tasks, substantially reduces regret compared to prior ICRL and meta-RL approaches while rapidly adapting to unseen tasks and remaining robust under distribution shift.

2025-12-31

arXiv (prépublication)

Integrating Generative and Experimental Platforms for Biomolecular Design

Chenghao Liu

Jarrid Rector-Brooks

Soojung Yang

Sidney Lisanza

Jacob Gershon

Lauren Hong

Pranam Chatterjee

Biomolecular design, through artificial engineering of proteins, ligands, nucleic acids, and cells, holds immense promise in addressing pres… (voir plus)sing medical, industrial, and environmental challenges. While generative machine learning has shown significant potential in this area, a disconnect exists with experimental biology: many ML research efforts prioritize static benchmark performance, potentially sidelining impactful biological applications. This workshop seeks to bridge this gap by bringing computationalists and experimentalists together, catalyzing a deeper interdisciplinary discourse. Together, we will explore the strengths and challenges of generative ML in biology, experimental integration of generative ML, and biological problems ready for ML. To attract high-quality and diverse research, we partnered with Nature Biotechnology for a special collection, and we created dedicated tracks for in-silico ML research and hybrid ML-experimental biology research. Our lineup features emerging leaders as speakers and renowned scientists as panelists, encapsulating a spectrum from high-throughput experimentation and computational biology to generative ML. To catalyze new collaborations, we will host a seed-grant competition for pairs of experimentalists and computationalists proposing fresh joint projects. To connect dry and wet lab practice, a wet-lab challenge sponsored by Adaptyv Bio will empirically evaluate protein design models. With a diverse organizing team and backed by industry sponsors, we dedicate the workshop to pushing the boundaries of ML's role in biology. This will be the third edition of this workshop following the previous versions of it we organized at ICLR 2024 and 2025.

2025-12-31

Workshop Proposals @ International Conference on Learning Representations (publié)

openreview.net

The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning

Donghang Wu

Tianyu Zhang

Yuxin Li

Hexin Liu

Chen Chen

Eng Siong Chng

During conversational interactions, humans subconsciously engage in concurrent thinking while listening to a speaker. Although this internal… (voir plus) cognitive processing may not always manifest as explicit linguistic structures, it is instrumental in formulating high-quality responses. Inspired by this cognitive phenomenon, we propose a novel **F**ull-duplex **LA**tent and **I**nternal **R**easoning method named FLAIR that conducts *latent* thinking simultaneously with speech perception. Unlike conventional "thinking" mechanisms in NLP, which require post-hoc generation, our approach aligns seamlessly with spoken dialogue systems: during the user’s speaking phase, it recursively feeds the latent embedding output from the previous step into the next step, enabling continuous reasoning that strictly adheres to causality without introducing additional latency. To enable this latent reasoning, we design an Evidence Lower Bound-based objective that supports efficient supervised finetuning via teacher forcing, circumventing the need for explicit reasoning annotations. Experiments demonstrate the effectiveness of this think-while-listening design, which achieves competitive results on a range of speech benchmarks. Furthermore, FLAIR robustly handles conversational dynamics and attains competitive performance on full-duplex interaction metrics.

2025-12-31

International Conference on Machine Learning (Accept (regular))

openreview.net

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Vedant Shah

Johan Obando-Ceron

Vineet Jain

Brian Bartoldson

Bhavya Kailkhura

Nikolay Malkin

The reasoning performance of large language models (LLMs) can be substantially improved by training them with reinforcement learning (RL). T… (voir plus)he RL objective for LLM training involves a regularization term, which is the reverse Kullback-Leibler (KL) divergence between the trained policy and the reference policy. Since computing the KL divergence exactly is intractable, various estimators are used in practice to estimate it from on-policy samples. Despite its wide adoption, including in several open-source libraries, there is no systematic study analyzing the numerous ways of incorporating KL estimators in the objective and their effect on the downstream performance of RL-trained models. Recent works show that prevailing practices for incorporating KL regularization do not provide correct gradients for stated objectives, creating a discrepancy between the objective and its implementation. In this paper, we further analyze these practices and study the gradients of several estimators configurations, revealing how design choices shape gradient bias. We substantiate these findings with empirical observations by RL fine-tuning \texttt{Qwen2.5-7B}, \texttt{Llama-3.1-8B-Instruct} and \texttt{Qwen3-4B-Instruct-2507} with different configurations and evaluating their performance on both in- and out-of-distribution tasks. Through our analysis, we observe that, in on-policy settings: (1) estimator configurations with biased gradients can result in training instabilities; and (2) using estimator configurations resulting in unbiased gradients leads to better performance on in-domain as well as out-of-domain tasks. We also investigate the performance resulting from different KL configurations in off-policy settings and observe that KL regularization can help stabilize off-policy RL training resulting from asynchronous setups.

2025-12-25

ArXiv (prépublication)

Hidden sampling biases inflate performance in gene regulatory network inference

Marco Stock

Florin Ratajczak

Paul Bertin

Eva Hoermanseder

Jason Hartford

Pascal Falter-Braun

Matthias Heinig

Alexander Tong

Antonio Scialdone

Accurate reconstruction of gene regulatory networks (GRNs) from single-cell transcriptomic data remains a major methodological challenge. Re… (voir plus)cent machine learning approaches, particularly graph neural networks and graph autoencoders, have reported improved performance, yet these gains do not consistently translate to realistic biological settings. Here, we show that a key reason for that is the way negative regulatory interactions are sampled for supervised training and evaluation. We find that widely used sampling strategies introduce node-degree biases that allow models to exploit trivial graph-structural cues rather than biological signals. Across multiple benchmarks, simple degree-based heuristics match or exceed state-of-the-art graph neural network models under these biased evaluation protocols. We further introduce a degree-aware sampling approach that eliminates these artifacts and provides more reliable assessments of GRN inference methods. Our results call for standardized, bias-aware benchmarking practices to ensure meaningful progress in supervised GRN inference from single-cell RNA-seq data.

2025-12-22

bioRxiv (prépublication)

A Message from AI Research Leaders: Join Us in Supporting OpenReview

Andrew Y. Ng

Ruslan Salakhutdinov

Fernando Pereira

2025-12-17

OpenReview (inconnu)

openreview.net

Sliding Window Recurrences for Sequence Models

Dragos Secrieru

Garyk Brixi

Taiji Suzuki

Michael Poli

Stefano Massaroli

Multi-hybrid architectures are poised to take over language modeling due to better quality and performance. We introduce a hierarchical deco… (voir plus)mposition framework for linear recurrences that allows us to develop algorithms aligned with GPU memory hierarchies, yielding Sliding Window Recurrences. We focus specifically on truncating recurrences to hardware-aligned windows which are naturally jagged, limiting costly inter-warp communication. Using SWR, we develop Phalanx layers that serve as drop-in replacements for windowed attention or linear recurrences. In 1B parameter multi-hybrid models, Phalanx achieves over 10-40% speedup across 4K to 32K context length over optimized Transformers while matching perplexity.

2025-12-14

arXiv (prépublication)

FALCON: Few-step Accurate Likelihoods for Continuous Flows

Artem Gazizov

2025-12-09

ArXiv (prépublication)

International AI Safety Report Second Key Update: Technical Safeguards and Risk Management

Stephen Clare

Carina Prunkl

Maksym Andriushchenko

BEN BUCKNALL

Philip Fox

Nestor Maslej

Conor McGlynn

Malcolm Murray

Shalaleh Rismani

Stephen Casper

Jessica Newman

Daniel Privitera

Sören Mindermann

Daron Acemoglu

Thomas G. Dietterich

Fredrik Heintz

Geoffrey Hinton

Nick Jennings

Susan Leavy … (voir 17 de plus)

Teresa Ludermir

Vidushi Marda

Helen Margetts

John McDermid

Jane Munga

Arvind Narayanan

Alondra Nelson

Clara Neppel

Sarvapali D. (Gopal) Ramchurn

Stuart Russell

Marietje Schaake

Bernhard Schölkopf

Alvaro Soto

Lee Tiedrich

Gael Varoquaux

Andrew Yao

Ya-Qin Zhang

This is the Second Key Update to the 2025 International AI Safety Report. The First Key Update (1) discussed developments in the capabilitie… (voir plus)s of general-purpose AI models and systems and associated risks. This Key Update covers how various actors, including researchers, companies, and governments, are approaching risk management and technical mitigations for AI. The past year has seen important developments in AI risk management, including better techniques for training safer models and monitoring their outputs. While this represents tangible progress, significant gaps remain. It is often uncertain how effective current measures are at preventing harms, and effectiveness varies across time and applications. There are many opportunities to further strengthen existing safeguard techniques and to develop new ones. This Key Update provides a concise overview of critical developments in risk management practices and technical risk mitigation since the publication of the 2025 AI Safety Report in January. It highlights where progress is being made and where gaps remain. Above all, it aims to support policymakers, researchers, and the public in navigating a rapidly changing environment, helping them to make informed and timely decisions about the governance of general-purpose AI. Professor Yoshua BengioUniversité de Montréal / LawZero /Mila – Quebec AI Institute & Chair

2025-12-06

SuperIntelligence - Robotics - Safety & Alignment (publié)