Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information, contact Cassidy MacNeil, Senior Assistant and Operations Lead, at cassidy.macneil@mila.quebec.

Recognized as a world-leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, "the Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute, and, as a senior fellow, co-directs the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as special advisor and founding scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Alumni collaborator - McGill

Berkes Anaïs

Research collaborator - Cambridge University

Principal supervisor:

Rim Assouel

PhD - UdeM

Stefan Bauer

Independent visiting researcher

Co-supervisor:

Guillaume Lajoie

Shahana Chatterjee

Research collaborator - N/A

Principal supervisor:

PhD - UdeM

Research collaborator - KAIST

PhD - UdeM

Alumni collaborator - UdeM

Co-supervisor:

Loubna Benabbou

Desmond Elliott

Independent visiting researcher

Principal supervisor:

PhD - UdeM

Co-supervisor:

PhD - UdeM

PhD - UdeM

PhD

PhD - UdeM

Moksh Jain

PhD - UdeM

PhD - UdeM

Principal supervisor:

Alumni collaborator - UdeM

Hyeonah Kim

Postdoctorate - UdeM

Principal supervisor:

Alex Hernandez-Garcia

Tabitha Edith Lee

Postdoctorate - UdeM

Principal supervisor:

Alumni collaborator

Alumni collaborator - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

Dhanya Sridhar

Sarthak Mittal

PhD - UdeM

Principal supervisor:

Independent visiting researcher - UdeM

Padideh Nouri

PhD - UdeM

Principal supervisor:

Ali Parviz

Research collaborator - Ying Wu Coll of Computing

Lena Podina

Research collaborator - University of Waterloo

Principal supervisor:

David Rolnick

Camille Rochefort-Boulanger

Nassim Rahaman

Alumni collaborator - Max-Planck-Institute for Intelligent Systems

Amine Razig

Research collaborator - UdeM

Co-supervisor:

PhD - UdeM

Postdoctorate - UdeM

Postdoctorate - UdeM

PhD - UdeM

Principal supervisor:

Julie Hussin

Dragos Secrieru

Alumni collaborator - UdeM

Divya Sharma

Postdoctorate

Co-supervisor:

Alex Hernandez-Garcia

Mélisande Astrid Crystal Teng

Vincent Taboga

Alumni collaborator - Polytechnique

Co-supervisor:

PhD - UdeM

Co-supervisor:

Hugo Larochelle

Ivan Titov

Research collaborator

Principal supervisor:

Siva Reddy

Alex Tong

Alumni collaborator - UdeM

Alumni collaborator - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Research collaborator

Research collaborator - UdeM

PhD - UdeM

PhD - McGill

Principal supervisor:

PhD - UdeM

Principal supervisor:

Aaron Courville

Harry Zhao

Alumni collaborator - McGill

Principal supervisor:

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the service of reasoning & model-based ML

April 4, 2023

by

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

by

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

by

Yoshua Bengio

Publications

MAP: Low-Compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

Zhiqi Bu

Huan He

Yonghui Wu

Jiang Bian

Yong Chen

Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of evaluations needed to estimate the Pareto front by using quadratic approximation surrogate models derived from a pre-selected set of scaling coefficients. Experimental results on vision and natural language processing tasks demonstrate that MAP can accurately identify the Pareto front, providing practitioners with flexible solutions to balance competing task objectives. We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.

2025-01-21

ICLR.cc/2025/Conference (poster)

Structure Language Models for Protein Conformation Generation

Stephen Z. Lu

Hongyu Guo

Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with sampling equilibrium conformations and are computationally expensive. Recently, deep generative models have shown promise in generating protein conformations as a more efficient alternative. However, these methods predominantly rely on the diffusion process within a 3D geometric space, which typically centers around the vicinity of metastable states and is often inefficient in terms of runtime. In this paper, we introduce Structure Language Modeling (SLM) as a novel framework for efficient protein conformation generation. Specifically, the protein structures are first encoded into a compact latent space using a discrete variational auto-encoder, followed by conditional language modeling that effectively captures sequence-specific conformation distributions. This enables a more efficient and interpretable exploration of diverse ensemble modes compared to existing methods. Based on this general framework, we instantiate SLM with various popular LM architectures and propose ESMDiff, a novel BERT-like structure language model fine-tuned from ESM3 with masked diffusion. We verify our approach in various scenarios, including the equilibrium dynamics of BPTI, conformational change pairs, and intrinsically disordered proteins. SLM provides a highly efficient solution, offering a 20-100x speedup over existing methods in generating diverse conformations, shedding light on promising avenues for future research.

2025-01-21

ICLR.cc/2025/Conference (poster)

On the Transfer of Object-Centric Representation Learning

Aniket Rajiv Didolkar

Andrii Zadaianchuk

Anirudh Goyal

Michael Curtis Mozer

Georg Martius

Maximilian Seitzer

2025-01-21

ICLR.cc/2025/Conference (poster)

Towards Improving Exploration Through Sibling Augmented GFlowNets

2025-01-21

ICLR.cc/2025/Conference (poster)

Geometric Signatures of Compositionality Across a Language Model's Lifetime

Jin Hwa Lee

Thomas Jiralerspong

Lei Yu

Emily Cheng

Compositionality, the notion that the meaning of an expression is constructed from the meaning of its parts and syntactic rules, permits the infinite productivity of human language. For the first time, artificial language models (LMs) are able to match human performance in a number of compositional generalization tasks. However, much remains to be understood about the representational mechanisms underlying these abilities. We take a high-level geometric approach to this problem by relating the degree of compositionality in a dataset to the intrinsic dimensionality of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations' intrinsic dimensionality, but that the relationship between compositionality and geometric complexity arises due to learned linguistic features over training. Finally, our analyses reveal a striking contrast between linear and nonlinear dimensionality, showing that they respectively encode formal and semantic aspects of linguistic composition.

2024-12-31

ACL (1) (published)

ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action

Konstantin Klemmer

Melissa Chapman

Lily Xu

Poon Kin Ho

Mélisande Teng

Patrick Emami

Climate change is one of the greatest problems society has ever faced, with increasingly severe consequences for humanity as natural disasters multiply, sea levels rise, and ecosystems falter. While no silver bullet, machine learning can be an invaluable tool in fighting climate change via a wide array of applications and techniques, from designing smart electric grids to tracking greenhouse gas emissions through satellite imagery. These applications require algorithmic innovations in machine learning and close collaboration with diverse fields and practitioners. This workshop is intended as a forum for those in the global machine learning community who wish to help tackle climate change, and is further aimed to help foster cross-pollination between researchers in machine learning and experts in complementary climate-relevant fields. Building on our past workshops on this topic, this workshop particularly aims to explore data-centric ML approaches for climate action. Data-centric ML is not only a timely topic within the ICLR community, as analyzing and engineering (pre)training datasets becomes increasingly important, but holds specific challenges and opportunities in climate-related areas. We also want to take the opportunity of ICLR being hosted in Singapore to engage with local communities and shine a light on work that deploys, analyzes or critiques ML methods and their use for climate change adaptation and mitigation on the Asian continent.

2024-12-31

ICLR.cc/2025/Workshop_Proposals (published)

Integrating Generative and Experimental Platforms for Biomolecular Design

Cheng-Hao Liu

Jarrid Rector-Brooks

Soojung Yang

Sidney L Lisanza

Francesca-Zhoufan Li

Hannes Stärk

Jacob Gershon

Lauren Hong

Pranam Chatterjee

Tommi Jaakkola

Regina Barzilay

David Baker

Frances H. Arnold

Biomolecular design, through artificial engineering of proteins, ligands, and nucleic acids, holds immense promise in addressing pressing medical, industrial, and environmental challenges. While generative machine learning has shown significant potential in this area, a palpable disconnect exists with experimental biology: many ML research efforts prioritize static benchmark performance, potentially sidelining impactful biological applications. This workshop seeks to bridge this gap by bringing computationalists and experimentalists together, catalyzing a deeper interdisciplinary discourse. Together, we will explore the strengths and challenges of generative ML in biology, experimental integration of generative ML, and biological problems ready for ML. To attract high-quality and diverse research, we partnered with Nature Biotechnology for a special collection, and we created dedicated tracks for in-silico ML research and hybrid ML-experimental biology research. Our lineup features emerging leaders as speakers and renowned scientists as panelists, encapsulating a spectrum from high-throughput experimentation and computational biology to generative ML. With a diverse organizing team and backed by industry sponsors, we dedicate the workshop to pushing the boundaries of ML's role in biology.

2024-12-31

ICLR.cc/2025/Workshop_Proposals (published)

International AI Safety Report

Bronwyn Fox

André Carlos Ponce de Leon Ferreira de Carvalho

Mona Nemer

Raquel Pezoa Rivera

Yi Zeng

Juha Heikkilä

Guillaume Avrin

Antonio Krüger

Balaraman Ravindran

Hammam Riza

Ciarán Seoighe

Ziv Katzir

Andrea Monti

Hiroaki Kitano

Nusu Mwamanzi

Fahad Albalawi

José Ramón López Portillo

Haroon Sheikh

Gill Jolly

Olubunmi Ajala

Jerry Sheehan

Dominic Vincent Ligot

Kyoung Mu Lee

Crystal Rugege

Denise Wong

Nuria Oliver

Christian Busch

Ahmet Halit Hatip

Oleksii Molchanovskyi

Marwan Alserkal

Chris Johnson

Amandeep Singh Gill

Saif M. Khan

Sören Mindermann

Daniel Privitera

Tamay Besiroglu

Rishi Bommasani

Stephen Casper

Yejin Choi

Philip Fox

Ben Garfinkel

Danielle Goldfarb

Hoda Heidari

Anson Ho

Sayash Kapoor

Leila Khalatbari

Shayne Longpre

Sam Manning

Vasilios Mavroudis

Mantas Mazeika

Julian Michael

Jessica Newman

Kwan Yee Ng

Chinasa T. Okolo

Deborah Raji

Girish Sastry

Elizabeth Seger

Theodora Skeadas

Tobin South

Daron Acemoglu

Olubayo Adekanmbi

David Dalrymple

Thomas G. Dietterich

Edward W. Felten

Pascale Fung

Pierre-Olivier Gourinchas

Fredrik Heintz

Geoffrey Hinton

Nick Jennings

Andreas Krause

Susan Leavy

Percy Liang

Teresa Ludermir

Vidushi Marda

Emma Strubell

Florian Tramèr

Lucia Velasco

Nicole Wheeler

Helen Margetts

John McDermid

Jane Munga

Arvind Narayanan

Alondra Nelson

Clara Neppel

Alice Oh

Gopal Ramchurn

Stuart Russell

Marietje Schaake

Bernhard Schölkopf

Dawn Song

Alvaro Soto

Lee Tiedrich

Gael Varoquaux

Andrew Yao

Ya-Qin Zhang

Baran Acar

Ben Clifford

Lambrini Das

Claire Dennis

Freya Hempleman

Hannah Merchant

Rian Overy

Ben Snodin

Jonathan Barry

Benjamin Prud’homme

The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.

2024-12-31

arXiv (preprint)

Open Technical Problems in Open-Weight AI Model Risk Management

Stephen Casper

Kyle O'Brien

Shayne Longpre

Elizabeth Seger

Kevin Klyman

Rishi Bommasani

Aniruddha Nrusimha

Ilia Shumailov

Sören Mindermann

Steven Basart

Frank Rudzicz

Kellin Pelrine

Avijit Ghosh

Andrew Strait

Robert Kirk

Dan Hendrycks

Peter Henderson

J. Zico Kolter

Geoffrey Irving

Yarin Gal

Dylan Hadfield-Menell

2024-12-31

SSRN Electronic Journal (accepted)

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Michael Cohen

Joumana Ghosn

Adam Oberman

Jesse Richardson

Oliver Richardson

Marc-Antoine Rondeau

Pierre-Luc St-Charles

David Williams-King

The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.

2024-12-31

arXiv (preprint)

The Singapore Consensus on Global AI Safety Research Priorities

Tegan Maharaj

Luke Ong

Stuart Russell

Dawn Song

Max Tegmark

Lan Xue

Ya-Qin Zhang

Stephen Casper

Wan Sie Lee

Sören Mindermann

Vanessa Wilfred

Vidhisha Balachandran

Fazl Barez

Michael Belinsky

Ima Bello

Malo Bourgon

Mark Brakel

Simeon Campos

Duncan Cass-Beggs

Jiahao Chen

Rumman Chowdhury

Chua Kuan Seah

Jeff Clune

Juntao Dai

Agnes Delaborde

Nouha Dziri

Francisco Eiras

Joshua Engels

Jinyu Fan

Adam Gleave

Noah Goodman

Fynn Heide

Johannes Heidecke

Dan Hendrycks

Cyrus Hodes

Bryan Low

Minlie Huang

Sami Jawhar

Jingyu Wang

Adam Kalai

Meindert Kamphuis

Mohan Kankanhalli

Subhash Kantamneni

Mathias Kirk Bonde

Thomas Kwa

Jeffrey Ladish

Kwok Yan Lam

Wan Sie Lee

Taewhi Lee

Xiaojian Li

Jiajun Liu

Chaochao Lu

Yifan Mai

Richard Mallah

Julian Michael

Nicolas Moës

Simon Moeller

Kihyuk Nam

Kwan Yee Ng

Mark Nitzberg

Besmira Nushi

Seán Ó hÉigeartaigh

Alejandro Ortega

Pierre Peigné

James Petrie

Benjamin Prud'homme

Reihaneh Rabbany

Nayat Sanchez-Pi

Sarah Schwettmann

Buck Shlegeris

Saad Siddiqui

Anu Sinha

Martin Soto

Cheston Tan

Anthony Tung

William Tjhi

Robert Trager

Brian Tse

Anthony Tung

John Willes

Denise Wong

Wei Xu

Rongwu Xu

Yi Zeng

Hongjiang Zhang

Djordje Zikelic

Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential – it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. This requires policymakers, industry, researchers and the broader public to collectively work toward securing positive outcomes from AI’s development. AI safety research is a key dimension. Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities. Goals: The 2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety aims to support research in this important space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. The result, The Singapore Consensus on Global AI Safety Research Priorities, builds on the International AI Safety Report (IAISR) chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this document organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control). Through the Singapore Consensus, we hope to globally facilitate meaningful conversations between AI scientists and AI policymakers for maximally beneficial outcomes. Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good.

2024-12-31

arXiv.org (preprint)

In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?

Ben Bucknall

Saad Siddiqui

Lara Thurnherr

Conor McGurk

Ben Harack

Anka Reuel

Patricia Paskov

Casey Mahoney

Sören Mindermann

Scott Singer

Vinay Hiremath

Charbel-Raphael Segerie

Oscar Delaney

Alessandro Abate

Fazl Barez

Michael K. Cohen

Philip Torr

Ferenc Huszár

Anisoara Calinescu

Gabriel Davis Jones

Robert Trager

International cooperation is common in AI research, including between geopolitical rivals. While many experts advocate for greater international cooperation on AI safety to address shared global risks, some view cooperation on AI with suspicion, arguing that it can pose unacceptable risks to national security. However, the extent to which cooperation on AI safety poses such risks, as well as provides benefits, depends on the specific area of cooperation. In this paper, we consider technical factors that impact the risks of international cooperation on AI safety research, focusing on the degree to which such cooperation can advance dangerous capabilities, result in the sharing of sensitive information, or provide opportunities for harm. We begin by examining why nations historically cooperate on strategic technologies and analyse current US-China cooperation in AI as a case study. We further argue that existing frameworks for managing associated risks can be supplemented with consideration of key risks specific to cooperation on technical AI safety research. Through our analysis, we find that research into AI verification mechanisms and shared protocols may be suitable areas for such cooperation. Through this analysis we aim to help researchers and governments identify and mitigate the risks of international cooperation on AI safety research, so that the benefits of cooperation can be fully realised.

2024-12-31

ACM Conference on Fairness, Accountability, and Transparency (published)