Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Scientific Director, Leadership Team
Observer, Board of Directors, Mila

Biography

*For media requests, please write to medias@mila.quebec.

For more information, please contact Julie Mongeau, executive assistant, at julie.mongeau@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, the "Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a Full Professor at Université de Montréal, the founder and scientific director of Mila – Quebec Artificial Intelligence Institute, and co-directs, as a senior fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has had the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Research Intern - Université du Québec à Rimouski
Professional Master's - UdeM
Independent Visiting Researcher
Co-supervisor:
Independent Visiting Researcher - UQAR
Research Intern - UQAR
Independent Visiting Researcher - MIT
Postdoctorate - UdeM
Co-supervisor:
Professional Master's - UdeM
Research Collaborator - Université Paris-Saclay
Principal supervisor:
PhD - Massachusetts Institute of Technology
PhD - Barcelona University
Professional Master's - UdeM
Professional Master's - UdeM
Research Collaborator
Independent Visiting Researcher - Technical University Munich (TUM)
Research Collaborator - UdeM
Alumni Collaborator
Professional Master's - UdeM
PhD - UdeM
Principal supervisor:
Research Intern - Imperial College London
Research Intern - UdeM
Research Collaborator - UdeM
PhD - UdeM
Principal supervisor:
Professional Master's - UdeM
Independent Visiting Researcher - UdeM
Independent Visiting Researcher - Hong Kong University of Science and Technology (HKUST)
Research Collaborator - Ying Wu Coll of Computing
Professional Master's - UdeM
PhD - Max-Planck-Institute for Intelligent Systems
Professional Master's - UdeM
Independent Visiting Researcher - UdeM
Independent Visiting Researcher - UdeM
PhD - UdeM
Principal supervisor:
Research Collaborator
Principal supervisor:
Research Master's - UdeM
Professional Master's - UdeM
Independent Visiting Researcher - Technical University of Munich
PhD - École Polytechnique Fédérale de Lausanne
Research Collaborator
Principal supervisor:
Research Collaborator - Valence
Principal supervisor:
Research Collaborator - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)
Principal supervisor:
Professional Master's - UdeM
Alumni Collaborator - UdeM
PhD - McGill
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - McGill
Principal supervisor:

Publications

Unlearning via Sparse Representations
Vedant Shah
Frederik Träuble
Ashish Malik
Michael Curtis Mozer
Sanjeev Arora
Anirudh Goyal
Mitigating Biases with Diverse Ensembles and Diffusion Models
Luca Scimeca
Alexander Rubinstein
Damien Teney
Seong Joon Oh
Armand Nicolicioiu
Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs) to mitigate this form of bias. We show that at particular training intervals, DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features. We leverage this crucial property to generate synthetic counterfactuals to increase model diversity via ensemble disagreement. We show that DPM-guided diversification is sufficient to remove dependence on primary shortcut cues, without a need for additional supervised signals. We further empirically quantify its efficacy on several diversification objectives, and finally show improved generalization and diversification performance on par with prior work that relies on auxiliary data collection.
Learning from unexpected events in the neocortical microcircuit
Colleen J Gillon
Jason E. Pina
Jérôme A. Lecoq
Ruweida Ahmed
Yazan N. Billeh
Shiella Caldejon
Peter Groblewski
Timothy M. Henley
India Kato
Eric Lee
Jennifer Luviano
Kyla Mace
Chelsea Nayan
Thuyanh V. Nguyen
Kat North
Jed Perkins
Sam Seid
Matthew T. Valley
Ali Williford
Timothy P. Lillicrap
Joel Zylberberg
Responses to Pattern-Violating Visual Stimuli Evolve Differently Over Days in Somata and Distal Apical Dendrites
Colleen J Gillon
Jason E. Pina
Jérôme A. Lecoq
Ruweida Ahmed
Yazan N. Billeh
Shiella Caldejon
Peter Groblewski
Timothy M. Henley
India Kato
Eric Lee
Jennifer Luviano
Kyla Mace
Chelsea Nayan
Thuyanh V. Nguyen
Kat North
Jed Perkins
Sam Seid
Matthew T. Valley
Ali Williford
Timothy P. Lillicrap
Joel Zylberberg
Scientists have long conjectured that the neocortex learns patterns in sensory data to generate top-down predictions of upcoming stimuli. In line with this conjecture, different responses to pattern-matching vs pattern-violating visual stimuli have been observed in both spiking and somatic calcium imaging data. However, it remains unknown whether these pattern-violation signals are different between the distal apical dendrites, which are heavily targeted by top-down signals, and the somata, where bottom-up information is primarily integrated. Furthermore, it is unknown how responses to pattern-violating stimuli evolve over time as an animal gains more experience with them. Here, we address these unanswered questions by analyzing responses of individual somata and dendritic branches of layer 2/3 and layer 5 pyramidal neurons tracked over multiple days in primary visual cortex of awake, behaving female and male mice. We use sequences of Gabor patches with patterns in their orientations to create pattern-matching and pattern-violating stimuli, and two-photon calcium imaging to record neuronal responses. Many neurons in both layers show large differences between their responses to pattern-matching and pattern-violating stimuli. Interestingly, these responses evolve in opposite directions in the somata and distal apical dendrites, with somata becoming less sensitive to pattern-violating stimuli and distal apical dendrites more sensitive. These differences between the somata and distal apical dendrites may be important for hierarchical computation of sensory predictions and learning, since these two compartments tend to receive bottom-up and top-down information, respectively.
SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data
Mélisande Teng
Amna Elmustafa
Benjamin Akera
Hager Radi
Biodiversity is declining at an unprecedented rate, impacting ecosystem services necessary to ensure food, water, and human health and well-being. Understanding the distribution of species and their habitats is crucial for conservation policy planning. However, traditional methods in ecology for species distribution models (SDMs) generally focus either on narrow sets of species or narrow geographical areas and there remain significant knowledge gaps about the distribution of species. A major reason for this is the limited availability of data traditionally used, due to the prohibitive amount of effort and expertise required for traditional field monitoring. The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems. We introduce a novel task for mapping bird species to their habitats by predicting species encounter rates from satellite images, and present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird, considering summer (breeding) and winter seasons. We also provide a dataset in Kenya representing low-data regimes. We additionally provide environmental data and species range maps for each location. We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks. SatBird opens up possibilities for scalably modelling properties of ecosystems worldwide.
Generative AI models should include detection mechanisms as a condition for public release
Alistair Knott
Dino Pedreschi
Raja Chatila
Tapabrata Chakraborti
Susan Leavy
Ricardo Baeza-Yates
D. Eyers
Andrew Trotman
Paul D. Teal
Przemyslaw Biecek
Stuart Russell
OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning
Rim Assouel
Pau Rodriguez
Perouz Taslakian
David Vazquez
Attention Schema in Neural Agents
Dianbo Liu
Samuele Bolotta
Mike He Zhu
Zahra Sheikhbahaee
Attention has become a common ingredient in deep learning architectures. It adds a dynamical selection of information on top of the static selection of information supported by weights. In the same way, we can imagine a higher-order informational filter built on top of attention: an Attention Schema (AS), namely, a descriptive and predictive model of attention. In cognitive neuroscience, Attention Schema Theory (AST) supports this idea of distinguishing attention from AS. A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents. As such, multi-agent reinforcement learning would be an ideal setting to experimentally test the validity of AST. We explore different ways in which attention and AS interact with each other. Our preliminary results indicate that agents that implement the AS as a recurrent internal control achieve the best performance. In general, these exploratory experiments suggest that equipping artificial agents with a model of attention can enhance their social intelligence.
Baking Symmetry into GFlowNets
George Ma
Emmanuel Bengio
Dinghuai Zhang
GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphic states. This lack of symmetry increases the amount of samples required for training GFlowNets and can result in inefficient and potentially incorrect flow functions. As a consequence, the reward and diversity of the generated objects decrease. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results using synthetic data demonstrate the promising performance of our proposed approaches.
Causal Discovery in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems
Trang Nguyen
Alexander Tong
Kanika Madan
Dianbo Liu
Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.
Crystal-GFN: sampling materials with desirable properties and constraints
Mistal
Alex Hernandez-Garcia
Alexandra Volokhova
Alexandre AGM Duval
Divya Sharma
Pierre Luc Carrier
Michał Koziarski
Victor Schmidt