Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Founder and Scientific Advisor, Leadership Team
Research Topics
Causality
Computational Neuroscience
Deep Learning
Generative Models
Graph Neural Networks
Machine Learning Theory
Medical Machine Learning
Molecular Modeling
Natural Language Processing
Probabilistic Models
Reasoning
Recurrent Neural Networks
Reinforcement Learning
Representation Learning

Biography

For media requests, please write to medias@mila.quebec.

For more information please contact Marie-Josée Beauchamp, Administrative Assistant at marie-josee.beauchamp@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as special advisor and founding scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Collaborating Alumni - McGill University
Collaborating Alumni - Université de Montréal
Collaborating researcher - Cambridge University
Principal supervisor :
PhD - Université de Montréal
Independent visiting researcher
Co-supervisor :
PhD - Université de Montréal
Collaborating researcher - N/A
Principal supervisor :
PhD - Université de Montréal
Collaborating researcher - KAIST
PhD - Université de Montréal
PhD - Université de Montréal
Research Intern - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Research Intern - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating Alumni - Université de Montréal
Research Intern - Université de Montréal
Postdoctorate - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating Alumni - Université de Montréal
Postdoctorate - Université de Montréal
Principal supervisor :
Collaborating Alumni
Collaborating Alumni - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Collaborating Alumni - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
Postdoctorate - Université de Montréal
Principal supervisor :
Independent visiting researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Ying Wu Coll of Computing
PhD - University of Waterloo
Principal supervisor :
Collaborating Alumni - Max-Planck-Institute for Intelligent Systems
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Independent visiting researcher - Université de Montréal
Postdoctorate - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating Alumni - Université de Montréal
Postdoctorate - Université de Montréal
Master's Research - Université de Montréal
Collaborating Alumni - Université de Montréal
Master's Research - Université de Montréal
Postdoctorate
Independent visiting researcher - Technical University of Munich
PhD - Université de Montréal
Co-supervisor :
Postdoctorate - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher - Université de Montréal
Collaborating researcher
Collaborating researcher - KAIST
PhD - McGill University
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
PhD - McGill University
Principal supervisor :

Publications

Neural Causal Structure Discovery from Interventions
Nan Rosemary Ke
Olexa Bilaniuk
Anirudh Goyal
Stefan Bauer
Bernhard Schölkopf
Michael Curtis Mozer
Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Patrick Mark Butlin
R. Long
Eric Elmoznino
Jonathan C. P. Birch
Axel Constant
George Deane
S. Fleming
C. Frith
Xuanxiu Ji
Ryota Kanai
C. Klein
Grace W. Lindsay
Matthias Michel
Liad Mudrik
Megan A. K. Peters
Eric Schwitzgebel
Jonathan Simon
Rufin Vanrullen
Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.
Scientific discovery in the age of artificial intelligence
Hanchen Wang
Tianfan Fu
Yuanqi Du
Wenhao Gao
Kexin Huang
Ziming Liu
Payal Chandak
Shengchao Liu
Peter Van Katwyk
Andreea Deac
Animashree Anandkumar
K. Bergen
Carla P. Gomes
Shirley Ho
Pushmeet Kohli
Joan Lasenby
Jure Leskovec
Tie-Yan Liu
A. Manrai
Debora Susan Marks
Bharath Ramsundar
Le Song
Jimeng Sun
Petar Veličković
Max Welling
Linfeng Zhang
Connor Wilson. Coley
Marinka Žitnik
What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context?
Oussama Boussif
Ghait Boukachab
Dan Assouline
Stefano Massaroli
Tianle Yuan
The global integration of solar power into the electrical grid could have a crucial impact on climate change mitigation, yet poses a challenge due to solar irradiance variability. We present a deep learning architecture which uses spatio-temporal context from satellite data for highly accurate day-ahead time-series forecasting, in particular Global Horizontal Irradiance (GHI). We provide a multi-quantile variant which outputs a prediction interval for each time-step, serving as a measure of forecasting uncertainty. In addition, we suggest a testing scheme that separates easy and difficult scenarios, which appears useful to evaluate model performance in varying cloud conditions. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective use of solar power and the resulting reduction of CO
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation
Chris Emezue
Tristan Deleu
Stefan Bauer
The practical utility of causality in decision-making is widespread and brought about by the intertwining of causal discovery and causal inference. Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Through the implementation of a distribution-level evaluation, we offer valuable and unique insights into the efficacy of these causal discovery methods for treatment effect estimation, considering both synthetic and real-world scenarios, as well as low-data scenarios. The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes, while some tend to learn many low-probability modes which impacts the (unrelaxed) recall and precision.