Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Founder and Scientific Advisor, Leadership Team
Research Topics
Medical Machine Learning
Representation Learning
Reinforcement Learning
Deep Learning
Causality
Generative Models
Probabilistic Models
Molecular Modeling
Computational Neuroscience
Reasoning
Graph Neural Networks
Recurrent Neural Networks
Machine Learning Theory
Natural Language Processing

Biography

*For media requests, please write to medias@mila.quebec.

For more information, contact Marie-Josée Beauchamp, administrative assistant, at marie-josee.beauchamp@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A. M. Turing Award, the "Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute, and co-directs, as a senior fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as special advisor and founding scientific director of IVADO.

In 2018, he was the computer scientist who received the most new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Alumni collaborator - McGill
Alumni collaborator - UdeM
Research collaborator - Cambridge University
Principal supervisor:
PhD - UdeM
Alumni collaborator - Université du Québec à Rimouski
Independent visiting researcher
Co-supervisor:
PhD - UdeM
Alumni collaborator - UQAR
Research collaborator - N/A
Principal supervisor:
PhD - UdeM
Research collaborator - KAIST
Alumni collaborator - UdeM
PhD - UdeM
Co-supervisor:
PhD - UdeM
PhD - UdeM
PhD - UdeM
Co-supervisor:
Research intern - UdeM
Research intern - UdeM
PhD
PhD - UdeM
Research master's - UdeM
Co-supervisor:
Alumni collaborator - UdeM
Research intern - UdeM
Postdoctorate - UdeM
Principal supervisor:
Research collaborator - UdeM
Alumni collaborator - UdeM
Alumni collaborator - UdeM
Alumni collaborator - UdeM
Postdoctorate - UdeM
Principal supervisor:
Alumni collaborator - UdeM
Principal supervisor:
Alumni collaborator
PhD - UdeM
Alumni collaborator - UdeM
Alumni collaborator - UdeM
PhD - UdeM
Co-supervisor:
Research collaborator - UdeM
PhD - UdeM
Principal supervisor:
PhD - UdeM
Principal supervisor:
Postdoctorate - UdeM
Principal supervisor:
Independent visiting researcher - UdeM
PhD - UdeM
Principal supervisor:
Research collaborator - Ying Wu Coll of Computing
PhD - University of Waterloo
Principal supervisor:
Alumni collaborator - Max-Planck-Institute for Intelligent Systems
PhD - UdeM
Postdoctorate - UdeM
Independent visiting researcher - UdeM
PhD - UdeM
Principal supervisor:
Alumni collaborator - UdeM
Research master's - UdeM
Alumni collaborator - UdeM
Research intern - UdeM
Research master's - UdeM
Independent visiting researcher - Technical University of Munich
PhD - UdeM
Co-supervisor:
Research collaborator - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)
Principal supervisor:
Postdoctorate - UdeM
Co-supervisor:
PhD - UdeM
Principal supervisor:
Research collaborator - UdeM
Alumni collaborator - UdeM
Research collaborator
Research collaborator - KAIST
PhD - McGill
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - McGill
Principal supervisor:

Publications

Improving *day-ahead* Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context
Oussama Boussif
Ghait Boukachab
Dan Assouline
Stefano Massaroli
Tianle Yuan
Solar power harbors immense potential in mitigating climate change by substantially reducing CO…
Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network
Tristan Deleu
Mizu Nishikawa-Toomey
Jithendaraa Subramanian
Nikolay Malkin
Generative Flow Networks (GFlowNets), a class of generative models over discrete and structured sample spaces, have been previously applied to the problem of inferring the marginal posterior distribution over the directed acyclic graph (DAG) of a Bayesian Network, given a dataset of observations. Based on recent advances extending this framework to non-discrete sample spaces, we propose in this paper to approximate the joint posterior over not only the structure of a Bayesian Network, but also the parameters of its conditional probability distributions. We use a single GFlowNet whose sampling policy follows a two-phase process: the DAG is first generated sequentially one edge at a time, and then the corresponding parameters are picked once the full structure is known. Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models of the Bayesian Network, making our approach applicable even to non-linear models parametrized by neural networks. We show that our method, called JSP-GFN, offers an accurate approximation of the joint posterior, while comparing favorably against existing methods on both simulated and real data.
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions
Stefano Massaroli
Michael Poli
Daniel Y Fu
Hermann Kumbong
Rom Nishijima Parnichkun
Aman Timalsina
David W. Romero
Quinn McIntyre
Beidi Chen
Atri Rudra
Ce Zhang
Christopher Re
Stefano Ermon
Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads -- naively requiring a full pass (or caching of activations) over the input sequence for each generated token -- similarly to attention-based models. In this paper, we seek to enable
Let the Flows Tell: Solving Graph Combinatorial Problems with GFlowNets
Dinghuai Zhang
Hanjun Dai
Nikolay Malkin
Ling Pan
Reusable Slotwise Mechanisms
Trang Nguyen
Amin Mansouri
Kanika Madan
Khuong N. Nguyen
Nguyen Duy Khuong
Kartik Ahuja
Dianbo Liu
Agents with the ability to comprehend and reason about the dynamics of objects would be expected to exhibit improved robustness and generalization in novel scenarios. However, achieving this capability necessitates not only an effective scene representation but also an understanding of the mechanisms governing interactions among object subsets. Recent studies have made significant progress in representing scenes using object slots. In this work, we introduce Reusable Slotwise Mechanisms, or RSM, a framework that models object dynamics by leveraging communication among slots along with a modular architecture capable of dynamically selecting reusable mechanisms for predicting the future states of each object slot. Crucially, RSM leverages the Central Contextual Information (CCI), enabling selected mechanisms to access the remaining slots through a bottleneck, effectively allowing for modeling of higher order and complex interactions that might require a sparse subset of objects. Experimental results demonstrate the superior performance of RSM compared to state-of-the-art methods across various future prediction and related downstream tasks, including Visual Question Answering and action planning. Furthermore, we showcase RSM's Out-of-Distribution generalization ability to handle scenes in intricate scenarios.
Neural Causal Structure Discovery from Interventions
Nan Rosemary Ke
Olexa Bilaniuk
Anirudh Goyal
Stefan Bauer
Bernhard Schölkopf
Michael Curtis Mozer
Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Patrick Mark Butlin
R. Long
Eric Elmoznino
Jonathan C. P. Birch
Axel Constant
George Deane
S. Fleming
C. Frith
Xuanxiu Ji
Ryota Kanai
C. Klein
Grace W. Lindsay
Matthias Michel
Liad Mudrik
Megan A. K. Peters
Eric Schwitzgebel
Jonathan Simon
Rufin Vanrullen
Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.