
Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Founder and Scientific Advisor, Leadership Team
Research Topics
Medical Machine Learning
Representation Learning
Reinforcement Learning
Deep Learning
Causality
Generative Models
Probabilistic Models
Molecular Modeling
Computational Neuroscience
Reasoning
Graph Neural Networks
Recurrent Neural Networks
Machine Learning Theory
Natural Language Processing

Biography

*For media inquiries, please write to medias@mila.quebec.

For more information, contact Cassidy MacNeil, Senior Assistant and Operations Lead, at cassidy.macneil@mila.quebec.

Recognized worldwide as a leading authority on artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, for which he received the 2018 A. M. Turing Award, often called the "Nobel Prize of computing," together with Geoffrey Hinton and Yann LeCun. He is a Full Professor at Université de Montréal, Founder and Scientific Advisor of Mila – Quebec Artificial Intelligence Institute, and, as a senior fellow, co-directs the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as Special Advisor and Founding Scientific Director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and the goal of ensuring that AI benefits everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Publications

MAP: Low-Compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Zhiqi Bu
Huan He
Yonghui Wu
Jiang Bian
Yong Chen
Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of evaluations needed to estimate the Pareto front by using quadratic approximation surrogate models derived from a pre-selected set of scaling coefficients. Experimental results on vision and natural language processing tasks demonstrate that MAP can accurately identify the Pareto front, providing practitioners with flexible solutions to balance competing task objectives. We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
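The amortization idea in the abstract above — fit cheap quadratic surrogates to a handful of costly merged-model evaluations, then scan scaling coefficients through the surrogates to recover a Pareto front — can be illustrated with a toy sketch. This is not the authors' implementation; the two task losses and all names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-task losses of a merged model as a function of the scaling
# coefficients c = (c1, c2) applied to two task vectors. Stand-ins for costly
# real evaluations; the conflicting optima force a trade-off.
def task_losses(c):
    l1 = (c[0] - 0.8) ** 2 + 0.1 * (c[1] - 0.2) ** 2
    l2 = 0.1 * (c[0] - 0.2) ** 2 + (c[1] - 0.9) ** 2
    return np.array([l1, l2])

# 1) Evaluate a small pre-selected set of scaling coefficients (the costly step).
samples = rng.uniform(0, 1, size=(20, 2))
evals = np.array([task_losses(c) for c in samples])

# 2) Fit one quadratic surrogate per task by least squares.
def features(C):
    c1, c2 = C[:, 0], C[:, 1]
    return np.stack([np.ones_like(c1), c1, c2, c1**2, c2**2, c1 * c2], axis=1)

coef, *_ = np.linalg.lstsq(features(samples), evals, rcond=None)

# 3) Amortize: scan a dense grid of coefficients through the cheap surrogates.
grid = np.array([[a, b] for a in np.linspace(0, 1, 50)
                        for b in np.linspace(0, 1, 50)])
pred = features(grid) @ coef  # predicted per-task losses, shape (2500, 2)

# 4) Keep only Pareto-optimal points (no other point is better on every task).
def pareto_mask(losses):
    mask = np.ones(len(losses), dtype=bool)
    for i, p in enumerate(losses):
        if mask[i]:
            dominated = np.all(losses >= p, axis=1) & np.any(losses > p, axis=1)
            mask &= ~dominated
    return mask

front = grid[pareto_mask(pred)]
print(f"{len(front)} Pareto-optimal scaling coefficients found")
```

Because the toy losses are themselves quadratic, the surrogates here fit them exactly; with real merged-model metrics the quadratic fit is an approximation whose quality MAP's pre-selected sample set is designed to support.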
Structure Language Models for Protein Conformation Generation
Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with sampling equilibrium conformations and are computationally expensive. Recently, deep generative models have shown promise in generating protein conformations as a more efficient alternative. However, these methods predominantly rely on the diffusion process within a 3D geometric space, which typically centers around the vicinity of metastable states and is often inefficient in terms of runtime. In this paper, we introduce Structure Language Modeling (SLM) as a novel framework for efficient protein conformation generation. Specifically, the protein structures are first encoded into a compact latent space using a discrete variational auto-encoder, followed by conditional language modeling that effectively captures sequence-specific conformation distributions. This enables a more efficient and interpretable exploration of diverse ensemble modes compared to existing methods. Based on this general framework, we instantiate SLM with various popular LM architectures and propose ESMDiff, a novel BERT-like structure language model fine-tuned from ESM3 with masked diffusion. We verify our approach in various scenarios, including the equilibrium dynamics of BPTI, conformational change pairs, and intrinsically disordered proteins. SLM provides a highly efficient solution, offering a 20-100x speedup over existing methods in generating diverse conformations, shedding light on promising avenues for future research.
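The discrete bottleneck at the heart of this framework — mapping continuous per-residue structure features to codebook tokens so a language model can operate on them — can be sketched in a few lines. The encoder features and codebook below are random stand-ins, not the paper's trained models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins: per-residue feature vectors for one conformation (in the paper
# these come from a trained encoder) and a learned codebook (random here).
codebook = rng.normal(size=(32, 8))          # 32 hypothetical codes, dim 8
structure_feats = rng.normal(size=(50, 8))   # 50 residues

def quantize(feats, codebook):
    # Nearest-codebook assignment: the discrete bottleneck of a VQ-style VAE.
    dists = np.linalg.norm(feats[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

tokens = quantize(structure_feats, codebook)
print(tokens[:10])  # one conformation, now a discrete token sequence
```

A conditional language model trained over such token sequences (conditioned on the amino-acid sequence) can then sample diverse conformations cheaply, and the VAE decoder maps sampled tokens back to 3D coordinates.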
On the Transfer of Object-Centric Representation Learning.
Aniket Rajiv Didolkar
Andrii Zadaianchuk
Michael Curtis Mozer
Georg Martius
Maximilian Seitzer
Towards Improving Exploration Through Sibling Augmented GFlowNets
Geometric Signatures of Compositionality Across a Language Model's Lifetime
Jin Hwa Lee
Lei Yu
Emily Cheng
Compositionality, the notion that the meaning of an expression is constructed from the meaning of its parts and syntactic rules, permits the infinite productivity of human language. For the first time, artificial language models (LMs) are able to match human performance in a number of compositional generalization tasks. However, much remains to be understood about the representational mechanisms underlying these abilities. We take a high-level geometric approach to this problem by relating the degree of compositionality in a dataset to the intrinsic dimensionality of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations' intrinsic dimensionality, but that the relationship between compositionality and geometric complexity arises due to learned linguistic features over training. Finally, our analyses reveal a striking contrast between linear and nonlinear dimensionality, showing that they respectively encode formal and semantic aspects of linguistic composition.
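Intrinsic dimensionality of a representation can be estimated directly from samples; one common choice (not necessarily the estimator used in this paper) is TwoNN, which infers dimension from nearest-neighbor distance ratios. A minimal sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(2)

def twonn_id(X):
    # TwoNN estimator (Facco et al., 2017): intrinsic dimension from the ratio
    # of each point's first and second nearest-neighbor distances.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    two_nn = np.partition(dists, 1, axis=1)[:, :2]
    mu = two_nn[:, 1] / two_nn[:, 0]
    return len(mu) / np.log(mu).sum()  # maximum-likelihood estimate

# Points on a 2D plane embedded in 10D: ambient dimension 10, intrinsic ~2.
basis = rng.normal(size=(2, 10))
X = rng.normal(size=(500, 2)) @ basis
print(round(twonn_id(X), 1))
```

Applied to LM hidden states over a dataset, such an estimate is far lower than the ambient hidden size, which is what makes it a usable measure of feature complexity.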
ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action
Konstantin Klemmer
Melissa Chapman
Lily Xu
Poon Kin Ho
Mélisande Teng
Patrick Emami
Climate change is one of the greatest problems society has ever faced, with increasingly severe consequences for humanity as natural disasters multiply, sea levels rise, and ecosystems falter. While no silver bullet, machine learning can be an invaluable tool in fighting climate change via a wide array of applications and techniques, from designing smart electric grids to tracking greenhouse gas emissions through satellite imagery. These applications require algorithmic innovations in machine learning and close collaboration with diverse fields and practitioners. This workshop is intended as a forum for those in the global machine learning community who wish to help tackle climate change, and is further aimed to help foster cross-pollination between researchers in machine learning and experts in complementary climate-relevant fields. Building on our past workshops on this topic, this workshop particularly aims to explore data-centric ML approaches for climate action. Data-centric ML is not only a timely topic within the ICLR community, as analyzing and engineering (pre)training datasets becomes increasingly important, but holds specific challenges and opportunities in climate-related areas. We also want to take the opportunity of ICLR being hosted in Singapore to engage with local communities and shine a light on work that deploys, analyzes or critiques ML methods and their use for climate change adaptation and mitigation on the Asian continent.
Integrating Generative and Experimental Platforms for Biomolecular Design
Cheng-Hao Liu
Soojung Yang
Sidney L Lisanza
Francesca-Zhoufan Li
Hannes Stärk
Jacob Gershon
Lauren Hong
Pranam Chatterjee
Tommi Jaakkola
Regina Barzilay
David Baker
Frances H. Arnold
Biomolecular design, through artificial engineering of proteins, ligands, and nucleic acids, holds immense promise in addressing pressing medical, industrial, and environmental challenges. While generative machine learning has shown significant potential in this area, a palpable disconnect exists with experimental biology: many ML research efforts prioritize static benchmark performance, potentially sidelining impactful biological applications. This workshop seeks to bridge this gap by bringing computationalists and experimentalists together, catalyzing a deeper interdisciplinary discourse. Together, we will explore the strengths and challenges of generative ML in biology, experimental integration of generative ML, and biological problems ready for ML. To attract high-quality and diverse research, we partnered with Nature Biotechnology for a special collection, and we created dedicated tracks for in-silico ML research and hybrid ML-experimental biology research. Our lineup features emerging leaders as speakers and renowned scientists as panelists, encapsulating a spectrum from high-throughput experimentation and computational biology to generative ML. With a diverse organizing team and backed by industry sponsors, we dedicate the workshop to pushing the boundaries of ML's role in biology.
International AI Safety Report
Bronwyn Fox
André Carlos Ponce de Leon Ferreira de Carvalho
Mona Nemer
Raquel Pezoa Rivera
Yi Zeng
Juha Heikkilä
Guillaume Avrin
Antonio Krüger
Balaraman Ravindran
Hammam Riza
Ciarán Seoighe
Ziv Katzir
Andrea Monti
Hiroaki Kitano
Nusu Mwamanzi
Fahad Albalawi
José Ramón López Portillo
Haroon Sheikh
Gill Jolly
Olubunmi Ajala
Jerry Sheehan
Dominic Vincent Ligot
Kyoung Mu Lee
Crystal Rugege
Denise Wong
Nuria Oliver
Christian Busch
Ahmet Halit Hatip
Oleksii Molchanovskyi
Marwan Alserkal
Chris Johnson
Amandeep Singh Gill
Saif M. Khan
Daniel Privitera
Tamay Besiroglu
Rishi Bommasani
Stephen Casper
Yejin Choi
Philip Fox
Ben Garfinkel
Danielle Goldfarb
Hoda Heidari
Anson Ho
Sayash Kapoor
Leila Khalatbari
Shayne Longpre
Sam Manning
Vasilios Mavroudis
Mantas Mazeika
Julian Michael
Jessica Newman
Kwan Yee Ng
Chinasa T. Okolo
Deborah Raji
Girish Sastry
Elizabeth Seger
Theodora Skeadas
Tobin South
Daron Acemoglu
Olubayo Adekanmbi
David Dalrymple
Thomas G. Dietterich
Edward W. Felten
Pascale Fung
Pierre-Olivier Gourinchas
Fredrik Heintz
Geoffrey Hinton
Nick Jennings
Andreas Krause
Susan Leavy
Percy Liang
Teresa Ludermir
Vidushi Marda
Emma Strubell
Florian Tramèr
Lucia Velasco
Nicole Wheeler
Helen Margetts
John McDermid
Jane Munga
Arvind Narayanan
Alondra Nelson
Clara Neppel
Alice Oh
Gopal Ramchurn
Stuart Russell
Marietje Schaake
Bernhard Schölkopf
Dawn Song
Alvaro Soto
Lee Tiedrich
Andrew Yao
Ya-Qin Zhang
Baran Acar
Ben Clifford
Lambrini Das
Claire Dennis
Freya Hempleman
Hannah Merchant
Rian Overy
Ben Snodin
Benjamin Prud’homme
The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.
Open Technical Problems in Open-Weight AI Model Risk Management
Stephen Casper
Kyle O'Brien
Shayne Longpre
Elizabeth Seger
Kevin Klyman
Rishi Bommasani
Aniruddha Nrusimha
Ilia Shumailov
Sören Mindermann
Steven Basart
Frank Rudzicz
Avijit Ghosh
Andrew Strait
Robert Kirk
Dan Hendrycks
J. Zico Kolter
Geoffrey Irving
Yarin Gal
Dylan Hadfield-Menell
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Michael Cohen
Joumana Ghosn
Adam Oberman
Jesse Richardson
Oliver Richardson
Marc-Antoine Rondeau
Pierre-Luc St-Charles
David Williams-King
The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.
The Singapore Consensus on Global AI Safety Research Priorities
Luke Ong
Stuart Russell
Dawn Song
Max Tegmark
Lan Xue
Ya-Qin Zhang
Stephen Casper
Wan Sie Lee
Vanessa Wilfred
Vidhisha Balachandran
Fazl Barez
Michael Belinsky
Ima Bello
Malo Bourgon
Mark Brakel
Simeon Campos
Duncan Cass-Beggs
Jiahao Chen
Rumman Chowdhury
Chua Kuan Seah
Jeff Clune
Juntao Dai
Agnes Delaborde
Francisco Eiras
Joshua Engels
Jinyu Fan
Adam Gleave
Noah Goodman
Fynn Heide
Johannes Heidecke
Dan Hendrycks
Cyrus Hodes
Bryan Low
Minlie Huang
Sami Jawhar
Jingyu Wang
Adam Kalai
Meindert Kamphuis
Mohan Kankanhalli
Subhash Kantamneni
Mathias Kirk Bonde
Thomas Kwa
Jeffrey Ladish
Kwok Yan Lam
Taewhi Lee
Xiaojian Li
Jiajun Liu
Chaochao Lu
Yifan Mai
Richard Mallah
Julian Michael
Nicolas Moës
Simon Moeller
Kihyuk Nam
Kwan Yee Ng
Mark Nitzberg
Besmira Nushi
Seán Ó hÉigeartaigh
Alejandro Ortega
Pierre Peigné
James Petrie
Nayat Sanchez-Pi
Sarah Schwettmann
Buck Shlegeris
Saad Siddiqui
Anu Sinha
Martin Soto
Cheston Tan
Anthony Tung
William Tjhi
Robert Trager
Brian Tse
John Willes
Denise Wong
Wei Xu
Rongwu Xu
Yi Zeng
Hongjiang Zhang
Djordje Zikelic
Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential – it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. This requires policymakers, industry, researchers and the broader public to collectively work toward securing positive outcomes from AI's development. AI safety research is a key dimension. Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities. Goals: The 2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety aims to support research in this important space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. The result, The Singapore Consensus on Global AI Safety Research Priorities, builds on the International AI Safety Report (IAISR) chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this document organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control). Through the Singapore Consensus, we hope to globally facilitate meaningful conversations between AI scientists and AI policymakers for maximally beneficial outcomes. Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good.
In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?
Ben Bucknall
Saad Siddiqui
Lara Thurnherr
Conor McGurk
Ben Harack
Anka Reuel
Patricia Paskov
Casey Mahoney
Scott Singer
Vinay Hiremath
Charbel-Raphael Segerie
Oscar Delaney
Alessandro Abate
Fazl Barez
Michael K. Cohen
Philip Torr
Ferenc Huszár
Anisoara Calinescu
Gabriel Davis Jones
Robert Trager
International cooperation is common in AI research, including between geopolitical rivals. While many experts advocate for greater international cooperation on AI safety to address shared global risks, some view cooperation on AI with suspicion, arguing that it can pose unacceptable risks to national security. However, the extent to which cooperation on AI safety poses such risks, as well as provides benefits, depends on the specific area of cooperation. In this paper, we consider technical factors that impact the risks of international cooperation on AI safety research, focusing on the degree to which such cooperation can advance dangerous capabilities, result in the sharing of sensitive information, or provide opportunities for harm. We begin by examining why nations historically cooperate on strategic technologies and analyse current US-China cooperation in AI as a case study. We further argue that existing frameworks for managing associated risks can be supplemented with consideration of key risks specific to cooperation on technical AI safety research. Through our analysis, we find that research into AI verification mechanisms and shared protocols may be suitable areas for such cooperation. Through this analysis we aim to help researchers and governments identify and mitigate the risks of international cooperation on AI safety research, so that the benefits of cooperation can be fully realised.