Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information please contact Cassidy MacNeil, Senior Assistant and Operation Lead at cassidy.macneil@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” shared with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as special advisor and founding scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Jamal Abou Haibeh

Collaborating Alumni - McGill University

Berkes Anaïs

Collaborating researcher - Cambridge University

Rim Assouel

PhD - Université de Montréal

Stefan Bauer

Independent visiting researcher

Shahana Chatterjee

Collaborating researcher - N/A

Xiaoyin Chen

PhD - Université de Montréal

Sanghyeok Choi

Collaborating researcher - KAIST

PhD - Université de Montréal

Collaborating Alumni - Université de Montréal

Desmond Elliott

Independent visiting researcher

Eric Elmoznino

PhD - Université de Montréal

PhD - Université de Montréal

Jean-Pierre Falet

PhD - Université de Montréal

PhD

PhD - Université de Montréal

Moksh Jain

PhD - Université de Montréal

PhD - Université de Montréal

Collaborating Alumni - Université de Montréal

Hyeonah Kim

Postdoctorate - Université de Montréal

Tabitha Edith Lee

Postdoctorate - Université de Montréal

Collaborating Alumni

Collaborating Alumni - Université de Montréal

Cristian Dragos Manta

PhD - Université de Montréal

Sarthak Mittal

PhD - Université de Montréal

Independent visiting researcher - Université de Montréal

Padideh Nouri

PhD - Université de Montréal

Ali Parviz

Collaborating researcher - Ying Wu Coll of Computing

Lena Podina

Collaborating researcher - University of Waterloo

Nassim Rahaman

Collaborating Alumni - Max-Planck-Institute for Intelligent Systems

Amine RAZIG

Collaborating researcher - Université de Montréal

Jarrid Rector-Brooks

PhD - Université de Montréal

Danyal REHMAN

Postdoctorate - Université de Montréal

Oli RICHARDSON

Postdoctorate - Université de Montréal

Camille Rochefort-Boulanger

PhD - Université de Montréal

Dragos Secrieru

Collaborating Alumni - Université de Montréal

Postdoctorate

Collaborating Alumni - Polytechnique Montréal

Mélisande Astrid Crystal Teng

PhD - Université de Montréal

Ivan Titov

Collaborating researcher

Alex Tong

Collaborating Alumni - Université de Montréal

Collaborating Alumni - Université de Montréal

PhD - Université de Montréal

Collaborating researcher

Collaborating researcher - Université de Montréal

Tianyu Zhang

PhD - Université de Montréal

PhD - McGill University

PhD - Université de Montréal

Harry Zhao

Collaborating Alumni - McGill University

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

Blog Posts

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the Service of Reasoning & Model-Based ML

April 4, 2023

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

March 15, 2022

Generative Flow Networks

Yoshua Bengio

Publications

MAP: Low-Compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

Zhiqi Bu

Huan He

Yonghui Wu

Jiang Bian

Yong Chen

Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of evaluations needed to estimate the Pareto front by using quadratic approximation surrogate models derived from a pre-selected set of scaling coefficients. Experimental results on vision and natural language processing tasks demonstrate that MAP can accurately identify the Pareto front, providing practitioners with flexible solutions to balance competing task objectives. We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.

2025-01-21

ICLR.cc/2025/Conference (poster)

Structure Language Models for Protein Conformation Generation

Stephen Z. Lu

Hongyu Guo

Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with sampling equilibrium conformations and are computationally expensive. Recently, deep generative models have shown promise in generating protein conformations as a more efficient alternative. However, these methods predominantly rely on the diffusion process within a 3D geometric space, which typically centers around the vicinity of metastable states and is often inefficient in terms of runtime. In this paper, we introduce Structure Language Modeling (SLM) as a novel framework for efficient protein conformation generation. Specifically, the protein structures are first encoded into a compact latent space using a discrete variational auto-encoder, followed by conditional language modeling that effectively captures sequence-specific conformation distributions. This enables a more efficient and interpretable exploration of diverse ensemble modes compared to existing methods. Based on this general framework, we instantiate SLM with various popular LM architectures and propose ESMDiff, a novel BERT-like structure language model fine-tuned from ESM3 with masked diffusion. We verify our approach in various scenarios, including the equilibrium dynamics of BPTI, conformational change pairs, and intrinsically disordered proteins. SLM provides a highly efficient solution, offering a 20-100x speedup over existing methods in generating diverse conformations, shedding light on promising avenues for future research.

2025-01-21

ICLR.cc/2025/Conference (poster)

On the Transfer of Object-Centric Representation Learning

Aniket Rajiv Didolkar

Andrii Zadaianchuk

Anirudh Goyal

Michael Curtis Mozer

Georg Martius

Maximilian Seitzer

2025-01-21

ICLR.cc/2025/Conference (poster)

Towards Improving Exploration Through Sibling Augmented GFlowNets

2025-01-21

ICLR.cc/2025/Conference (poster)

Geometric Signatures of Compositionality Across a Language Model's Lifetime

Jin Hwa Lee

Thomas Jiralerspong

Lei Yu

Emily Cheng

Compositionality, the notion that the meaning of an expression is constructed from the meaning of its parts and syntactic rules, permits the infinite productivity of human language. For the first time, artificial language models (LMs) are able to match human performance in a number of compositional generalization tasks. However, much remains to be understood about the representational mechanisms underlying these abilities. We take a high-level geometric approach to this problem by relating the degree of compositionality in a dataset to the intrinsic dimensionality of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations' intrinsic dimensionality, but that the relationship between compositionality and geometric complexity arises due to learned linguistic features over training. Finally, our analyses reveal a striking contrast between linear and nonlinear dimensionality, showing that they respectively encode formal and semantic aspects of linguistic composition.

2024-12-31

ACL (1) (published)

ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action

Konstantin Klemmer

Melissa Chapman

Lily Xu

Poon Kin Ho

Mélisande Teng

Patrick Emami

Climate change is one of the greatest problems society has ever faced, with increasingly severe consequences for humanity as natural disasters multiply, sea levels rise, and ecosystems falter. While no silver bullet, machine learning can be an invaluable tool in fighting climate change via a wide array of applications and techniques, from designing smart electric grids to tracking greenhouse gas emissions through satellite imagery. These applications require algorithmic innovations in machine learning and close collaboration with diverse fields and practitioners. This workshop is intended as a forum for those in the global machine learning community who wish to help tackle climate change, and is further aimed to help foster cross-pollination between researchers in machine learning and experts in complementary climate-relevant fields. Building on our past workshops on this topic, this workshop particularly aims to explore data-centric ML approaches for climate action. Data-centric ML is not only a timely topic within the ICLR community, as analyzing and engineering (pre)training datasets becomes increasingly important, but holds specific challenges and opportunities in climate-related areas. We also want to take the opportunity of ICLR being hosted in Singapore to engage with local communities and shine a light on work that deploys, analyzes or critiques ML methods and their use for climate change adaptation and mitigation on the Asian continent.

2024-12-31

ICLR.cc/2025/Workshop_Proposals (published)

Integrating Generative and Experimental Platforms for Biomolecular Design

Cheng-Hao Liu

Jarrid Rector-Brooks

Soojung Yang

Sidney L Lisanza

Francesca-Zhoufan Li

Hannes Stärk

Jacob Gershon

Lauren Hong

Pranam Chatterjee

Tommi Jaakkola

Regina Barzilay

David Baker

Frances H. Arnold

Biomolecular design, through artificial engineering of proteins, ligands, and nucleic acids, holds immense promise in addressing pressing medical, industrial, and environmental challenges. While generative machine learning has shown significant potential in this area, a palpable disconnect exists with experimental biology: many ML research efforts prioritize static benchmark performance, potentially sidelining impactful biological applications. This workshop seeks to bridge this gap by bringing computationalists and experimentalists together, catalyzing a deeper interdisciplinary discourse. Together, we will explore the strengths and challenges of generative ML in biology, experimental integration of generative ML, and biological problems ready for ML. To attract high-quality and diverse research, we partnered with Nature Biotechnology for a special collection, and we created dedicated tracks for in-silico ML research and hybrid ML-experimental biology research. Our lineup features emerging leaders as speakers and renowned scientists as panelists, encapsulating a spectrum from high-throughput experimentation and computational biology to generative ML. With a diverse organizing team and backed by industry sponsors, we dedicate the workshop to pushing the boundaries of ML's role in biology.

2024-12-31

ICLR.cc/2025/Workshop_Proposals (published)

International AI Safety Report

Bronwyn Fox

André Carlos Ponce de Leon Ferreira de Carvalho

Mona Nemer

Raquel Pezoa Rivera

Yi Zeng

Juha Heikkilä

Guillaume Avrin

Antonio Krüger

Balaraman Ravindran

Hammam Riza

Ciarán Seoighe

Ziv Katzir

Andrea Monti

Hiroaki Kitano

Nusu Mwamanzi

Fahad Albalawi

José Ramón López Portillo

Haroon Sheikh

Gill Jolly

Olubunmi Ajala

Jerry Sheehan

Dominic Vincent Ligot

Kyoung Mu Lee

Crystal Rugege

Denise Wong

Nuria Oliver

Christian Busch

Ahmet Halit Hatip

Oleksii Molchanovskyi

Marwan Alserkal

Chris Johnson

Amandeep Singh Gill

Saif M. Khan

Sören Mindermann

Daniel Privitera

Tamay Besiroglu

Rishi Bommasani

Stephen Casper

Yejin Choi

Philip Fox

Ben Garfinkel

Danielle Goldfarb

Hoda Heidari

Anson Ho

Sayash Kapoor

Leila Khalatbari

Shayne Longpre

Sam Manning

Vasilios Mavroudis

Mantas Mazeika

Julian Michael

Jessica Newman

Kwan Yee Ng

Chinasa T. Okolo

Deborah Raji

Girish Sastry

Elizabeth Seger

Theodora Skeadas

Tobin South

Daron Acemoglu

Olubayo Adekanmbi

David Dalrymple

Thomas G. Dietterich

Edward W. Felten

Pascale Fung

Pierre-Olivier Gourinchas

Fredrik Heintz

Geoffrey Hinton

Nick Jennings

Andreas Krause

Susan Leavy

Percy Liang

Teresa Ludermir

Vidushi Marda

Emma Strubell

Florian Tramèr

Lucia Velasco

Nicole Wheeler

Helen Margetts

John McDermid

Jane Munga

Arvind Narayanan

Alondra Nelson

Clara Neppel

Alice Oh

Gopal Ramchurn

Stuart Russell

Marietje Schaake

Bernhard Schölkopf

Dawn Song

Alvaro Soto

Lee Tiedrich

Gael Varoquaux

Andrew Yao

Ya-Qin Zhang

Baran Acar

Ben Clifford

Lambrini Das

Claire Dennis

Freya Hempleman

Hannah Merchant

Rian Overy

Ben Snodin

Jonathan Barry

Benjamin Prud’homme

The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.

2024-12-31

arXiv (preprint)

Open Technical Problems in Open-Weight AI Model Risk Management

Stephen Casper

Kyle O'Brien

Shayne Longpre

Elizabeth Seger

Kevin Klyman

Rishi Bommasani

Aniruddha Nrusimha

Ilia Shumailov

Sören Mindermann

Steven Basart

Frank Rudzicz

Kellin Pelrine

Avijit Ghosh

Andrew Strait

Robert Kirk

Dan Hendrycks

Peter Henderson

J. Zico Kolter

Geoffrey Irving

Yarin Gal

Dylan Hadfield-Menell

2024-12-31

SSRN Electronic Journal (accepted)

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Michael Cohen

Joumana Ghosn

Adam Oberman

Jesse Richardson

Oliver Richardson

Marc-Antoine Rondeau

Pierre-Luc St-Charles

David Williams-King

The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.

2024-12-31

arXiv (preprint)

The Singapore Consensus on Global AI Safety Research Priorities

Tegan Maharaj

Luke Ong

Stuart Russell

Dawn Song

Max Tegmark

Lan Xue

Ya-Qin Zhang

Stephen Casper

Wan Sie Lee

Sören Mindermann

Vanessa Wilfred

Vidhisha Balachandran

Fazl Barez

Michael Belinsky

Ima Bello

Malo Bourgon

Mark Brakel

Simeon Campos

Duncan Cass-Beggs

Jiahao Chen

Rumman Chowdhury

Chua Kuan Seah

Jeff Clune

Juntao Dai

Agnes Delaborde

Nouha Dziri

Francisco Eiras

Joshua Engels

Jinyu Fan

Adam Gleave

Noah Goodman

Fynn Heide

Johannes Heidecke

Dan Hendrycks

Cyrus Hodes

Bryan Low

Minlie Huang

Sami Jawhar

Jingyu Wang

Adam Kalai

Meindert Kamphuis

Mohan Kankanhalli

Subhash Kantamneni

Mathias Kirk Bonde

Thomas Kwa

Jeffrey Ladish

Kwok Yan Lam

Wan Sie Lee

Taewhi Lee

Xiaojian Li

Jiajun Liu

Chaochao Lu

Yifan Mai

Richard Mallah

Julian Michael

Nicolas Moës

Simon Moeller

Kihyuk Nam

Kwan Yee Ng

Mark Nitzberg

Besmira Nushi

Seán Ó hÉigeartaigh

Alejandro Ortega

Pierre Peigné

James Petrie

Benjamin Prud'homme

Reihaneh Rabbany

Nayat Sanchez-Pi

Sarah Schwettmann

Buck Shlegeris

Saad Siddiqui

Anu Sinha

Martin Soto

Cheston Tan

Anthony Tung

William Tjhi

Robert Trager

Brian Tse

Anthony Tung

John Willes

Denise Wong

Wei Xu

Rongwu Xu

Yi Zeng

Hongjiang Zhang

Djordje Zikelic

Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential – it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. This requires policymakers, industry, researchers and the broader public to collectively work toward securing positive outcomes from AI’s development. AI safety research is a key dimension. Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities. Goals: The 2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety aims to support research in this important space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. The result, The Singapore Consensus on Global AI Safety Research Priorities, builds on the International AI Safety Report-A (IAISR) chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this document organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control). Through the Singapore Consensus, we hope to globally facilitate meaningful conversations between AI scientists and AI policymakers for maximally beneficial outcomes. Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good.

2024-12-31

arXiv.org (preprint)

In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?

Ben Bucknall

Saad Siddiqui

Lara Thurnherr

Conor McGurk

Ben Harack

Anka Reuel

Patricia Paskov

Casey Mahoney

Sören Mindermann

Scott Singer

Vinay Hiremath

Charbel-Raphael Segerie

Oscar Delaney

Alessandro Abate

Fazl Barez

Michael K. Cohen

Philip Torr

Ferenc Huszár

Anisoara Calinescu

Gabriel Davis Jones

Robert Trager

International cooperation is common in AI research, including between geopolitical rivals. While many experts advocate for greater international cooperation on AI safety to address shared global risks, some view cooperation on AI with suspicion, arguing that it can pose unacceptable risks to national security. However, the extent to which cooperation on AI safety poses such risks, as well as provides benefits, depends on the specific area of cooperation. In this paper, we consider technical factors that impact the risks of international cooperation on AI safety research, focusing on the degree to which such cooperation can advance dangerous capabilities, result in the sharing of sensitive information, or provide opportunities for harm. We begin by examining why nations historically cooperate on strategic technologies and analyse current US-China cooperation in AI as a case study. We further argue that existing frameworks for managing associated risks can be supplemented with consideration of key risks specific to cooperation on technical AI safety research. Through our analysis, we find that research into AI verification mechanisms and shared protocols may be suitable areas for such cooperation. Through this analysis we aim to help researchers and governments identify and mitigate the risks of international cooperation on AI safety research, so that the benefits of cooperation can be fully realised.

2024-12-31

ACM Conference on Fairness, Accountability, and Transparency (published)