Portrait of Yoshua Bengio

Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research Department
Founder and Scientific Advisor, Leadership Team
Research Topics
Causality
Computational Neuroscience
Deep Learning
Generative Models
Graph Neural Networks
Machine Learning Theory
Medical Machine Learning
Molecular Modeling
Natural Language Processing
Probabilistic Models
Reasoning
Recurrent Neural Networks
Reinforcement Learning
Representation Learning

Biography

*For media requests, please write to medias@mila.quebec.

For more information please contact Marie-Josée Beauchamp, Administrative Assistant at marie-josee.beauchamp@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is most known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as special advisor and founding scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Collaborating Alumni - McGill University
Collaborating Alumni - Université de Montréal
Collaborating researcher - Cambridge University
Principal supervisor :
PhD - Université de Montréal
Collaborating Alumni - Université du Québec à Rimouski
Independent visiting researcher
Co-supervisor :
PhD - Université de Montréal
Collaborating Alumni - UQAR
Collaborating researcher - N/A
Principal supervisor :
PhD - Université de Montréal
Collaborating researcher - KAIST
PhD - Université de Montréal
PhD - Université de Montréal
Collaborating Alumni - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Research Intern - Université de Montréal
Research Intern - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
Co-supervisor :
Collaborating Alumni - Université de Montréal
Research Intern - Université de Montréal
Postdoctorate - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating Alumni - Université de Montréal
Postdoctorate - Université de Montréal
Principal supervisor :
Collaborating Alumni
Collaborating Alumni - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating Alumni - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
Postdoctorate - Université de Montréal
Principal supervisor :
Independent visiting researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Ying Wu Coll of Computing
PhD - University of Waterloo
Principal supervisor :
Collaborating Alumni - Max-Planck-Institute for Intelligent Systems
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Independent visiting researcher - Université de Montréal
Postdoctorate - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating Alumni - Université de Montréal
Postdoctorate - Université de Montréal
Master's Research - Université de Montréal
Collaborating Alumni - Université de Montréal
Research Intern - Université de Montréal
Master's Research - Université de Montréal
Postdoctorate
Independent visiting researcher - Technical University of Munich
PhD - Université de Montréal
Co-supervisor :
Collaborating researcher - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)
Principal supervisor :
Postdoctorate - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating researcher
Collaborating researcher - KAIST
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
PhD - McGill University
Principal supervisor :
PhD - McGill University
Principal supervisor :

Publications

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Patrick Mark Butlin
R. Long
Eric Elmoznino
Jonathan C. P. Birch
Axel Constant
George Deane
S. Fleming
C. Frith
Xuanxiu Ji
Ryota Kanai
C. Klein
Grace W. Lindsay
Matthias Michel
Liad Mudrik
Megan A. K. Peters
Eric Schwitzgebel
Jonathan Simon
Rufin Vanrullen
Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argu… (see more)es for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive"indicator properties"of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.
Scientific discovery in the age of artificial intelligence
Hanchen Wang
Tianfan Fu
Yuanqi Du
Wenhao Gao
Kexin Huang
Ziming Liu
Payal Chandak
Shengchao Liu
Peter Van Katwyk
Andreea Deac
Animashree Anandkumar
K. Bergen
Carla P. Gomes
Shirley Ho
Pushmeet Kohli
Joan Lasenby
Jure Leskovec
Tie-Yan Liu
A. Manrai
Debora Susan Marks … (see 10 more)
Bharath Ramsundar
Le Song
Jimeng Sun
Petar Veličković
Max Welling
Linfeng Zhang
Connor Wilson. Coley
Marinka Žitnik
What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context?
Oussama Boussif
Ghait Boukachab
Dan Assouline
Stefano Massaroli
Tianle Yuan
The global integration of solar power into the electrical grid could have a crucial impact on climate change mitigation, yet poses a challen… (see more)ge due to solar irradiance variability. We present a deep learning architecture which uses spatio-temporal context from satellite data for highly accurate day-ahead time-series forecasting, in particular Global Horizontal Irradiance (GHI). We provide a multi-quantile variant which outputs a prediction interval for each time-step, serving as a measure of forecasting uncertainty. In addition, we suggest a testing scheme that separates easy and difficult scenarios, which appears useful to evaluate model performance in varying cloud conditions. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective use of solar power and the resulting reduction of CO
What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context?
Oussama Boussif
Ghait Boukachab
Dan Assouline
Stefano Massaroli
Tianle Yuan
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation
Chris Emezue
Tristan Deleu
Stefan Bauer
The practical utility of causality in decision-making is widespread and brought about by the intertwining of causal discovery and causal inf… (see more)erence. Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Through the implementation of a distribution-level evaluation, we offer valuable and unique insights into the efficacy of these causal discovery methods for treatment effect estimation, considering both synthetic and real-world scenarios, as well as low-data scenarios. The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes, while some tend to learn many low-probability modes which impacts the (unrelaxed) recall and precision.
AI For Global Climate Cooperation 2023 Competition Proceedings
Prateek Arun Gupta
Lu Li
Soham R. Phade
Sunil Srinivasa
andrew williams
Tianyu Zhang
Yangtian Zhang
Stephan Tao Zheng
The international community must collaborate to mitigate climate change and sustain economic growth. However, collaboration is hard to achie… (see more)ve, partly because no global authority can ensure compliance with international climate agreements. Combining AI with climate-economic simulations offers a promising solution to design international frameworks, including negotiation protocols and climate agreements, that promote and incentivize collaboration. In addition, these frameworks should also have policy goals fulfillment, and sustained commitment, taking into account climate-economic dynamics and strategic behaviors. These challenges require an interdisciplinary approach across machine learning, economics, climate science, law, policy, ethics, and other fields. Towards this objective, we organized AI for Global Climate Cooperation, a Mila competition in which teams submitted proposals and analyses of international frameworks, based on (modifications of) RICE-N, an AI-driven integrated assessment model (IAM). In particular, RICE-N supports modeling regional decision-making using AI agents. Furthermore, the IAM then models the climate-economic impact of those decisions into the future. Whereas the first track focused only on performance metrics, the proposals submitted to the second track were evaluated both quantitatively and qualitatively. The quantitative evaluation focused on a combination of (i) the degree of mitigation of global temperature rise and (ii) the increase in economic productivity. On the other hand, an interdisciplinary panel of human experts in law, policy, sociology, economics and environmental science, evaluated the solutions qualitatively. In particular, the panel considered the effectiveness, simplicity, feasibility, ethics, and notions of climate justice of the protocols. In the third track, the participants were asked to critique and improve RICE-N.
AI For Global Climate Cooperation 2023 Competition Proceedings
Prateek Arun Gupta
Lu Li
Soham R. Phade
Sunil Srinivasa
andrew williams
Tianyu Zhang
Yang Zhang
Stephan Tao Zheng
The international community must collaborate to mitigate climate change and sustain economic growth. However, collaboration is hard to achie… (see more)ve, partly because no global authority can ensure compliance with international climate agreements. Combining AI with climate-economic simulations offers a promising solution to design international frameworks, including negotiation protocols and climate agreements, that promote and incentivize collaboration. In addition, these frameworks should also have policy goals fulfillment, and sustained commitment, taking into account climate-economic dynamics and strategic behaviors. These challenges require an interdisciplinary approach across machine learning, economics, climate science, law, policy, ethics, and other fields. Towards this objective, we organized AI for Global Climate Cooperation, a Mila competition in which teams submitted proposals and analyses of international frameworks, based on (modifications of) RICE-N, an AI-driven integrated assessment model (IAM). In particular, RICE-N supports modeling regional decision-making using AI agents. Furthermore, the IAM then models the climate-economic impact of those decisions into the future. Whereas the first track focused only on performance metrics, the proposals submitted to the second track were evaluated both quantitatively and qualitatively. The quantitative evaluation focused on a combination of (i) the degree of mitigation of global temperature rise and (ii) the increase in economic productivity. On the other hand, an interdisciplinary panel of human experts in law, policy, sociology, economics and environmental science, evaluated the solutions qualitatively. In particular, the panel considered the effectiveness, simplicity, feasibility, ethics, and notions of climate justice of the protocols. In the third track, the participants were asked to critique and improve RICE-N.
International Institutions for Advanced AI
Lewis Ho
Joslyn N. Barnhart
Robert Frederic Trager
Miles Brundage
Allison Sovey Carnegie
Rumman Chowdhury
Allan Dafoe
Gillian K. Hadfield
Margaret Levi
D. Snidal
Generative Flow Networks: a Markov Chain Perspective
Tristan Deleu
Better Training of GFlowNets with Local Credit and Incomplete Trajectories
Ling Pan
Nikolay Malkin
Dinghuai Zhang
Generative Flow Networks or GFlowNets are related to Monte-Carlo Markov chain methods (as they sample from a distribution specified by an en… (see more)ergy function), reinforcement learning (as they learn a policy to sample composed objects through a sequence of steps), generative models (as they learn to represent and sample from a distribution) and amortized variational methods (as they can be used to learn to approximate and sample from an otherwise intractable posterior, given a prior and a likelihood). They are trained to generate an object
Discrete Key-Value Bottleneck
Frederik Träuble
Anirudh Goyal
Nasim Rahaman
Michael Curtis Mozer
Kenji Kawaguchi
Bernhard Schölkopf
Equivariance with Learned Canonicalization Functions
Sékou-Oumar Kaba
Arnab Kumar Mondal
Yan Zhang