
Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Founder and Scientific Advisor, Leadership Team
Research Topics
Medical Machine Learning
Representation Learning
Reinforcement Learning
Deep Learning
Causality
Generative Models
Probabilistic Models
Molecular Modeling
Computational Neuroscience
Reasoning
Graph Neural Networks
Recurrent Neural Networks
Machine Learning Theory
Natural Language Processing

Biography

For media requests, please write to medias@mila.quebec.

For more information, contact Cassidy MacNeil, Senior Assistant and Operations Lead, at cassidy.macneil@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, "the Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a Full Professor at Université de Montréal, Founder and Scientific Advisor of Mila (Quebec Artificial Intelligence Institute), and co-directs, as a Senior Fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as Special Advisor and Founding Scientific Director of IVADO.

In 2018, he was the computer scientist who received the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index of any computer scientist in the world. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Collaborating Alumni - McGill
Research Collaborator - Cambridge University
Principal supervisor:
PhD - UdeM
Independent Visiting Researcher
Co-supervisor:
Research Collaborator - N/A
Principal supervisor:
PhD - UdeM
Research Collaborator - KAIST
Collaborating Alumni - UdeM
Co-supervisor:
Independent Visiting Researcher
Principal supervisor:
PhD - UdeM
Co-supervisor:
PhD - UdeM
PhD - UdeM
Principal supervisor:
Collaborating Alumni - UdeM
Postdoctorate - UdeM
Principal supervisor:
Postdoctorate - UdeM
Principal supervisor:
Collaborating Alumni
Collaborating Alumni - UdeM
PhD - UdeM
Co-supervisor:
PhD - UdeM
Principal supervisor:
Independent Visiting Researcher - UdeM
PhD - UdeM
Principal supervisor:
Research Collaborator - Ying Wu Coll of Computing
Research Collaborator - University of Waterloo
Principal supervisor:
Collaborating Alumni - Max-Planck-Institute for Intelligent Systems
Research Collaborator - UdeM
Co-supervisor:
PhD - UdeM
Postdoctorate - UdeM
Postdoctorate - UdeM
PhD - UdeM
Principal supervisor:
Collaborating Alumni - UdeM
Postdoctorate
Co-supervisor:
Collaborating Alumni - Polytechnique
Co-supervisor:
PhD - UdeM
Co-supervisor:
Research Collaborator
Principal supervisor:
Collaborating Alumni - UdeM
Collaborating Alumni - UdeM
Co-supervisor:
PhD - UdeM
Principal supervisor:
Research Collaborator
Research Collaborator - UdeM
PhD - McGill
Principal supervisor:
PhD - UdeM
Principal supervisor:
Collaborating Alumni - McGill
Principal supervisor:

Publications

A community effort in SARS-CoV-2 drug discovery.
Johannes Schimunek
Philipp Seidl
Katarina Elez
Tim Hempel
Tuan Le
Frank Noé
Simon Olsson
Lluís Raich
Robin Winter
Hatice Gokcan
Filipp Gusev
Evgeny M. Gutkin
Olexandr Isayev
Maria G. Kurnikova
Chamali H. Narangoda
Roman Zubatyuk
Ivan P. Bosko
Konstantin V. Furs
Anna D. Karpenko
Yury V. Kornoushenko
Mikita Shuldau
Artsemi Yushkevich
Mohammed B. Benabderrahmane
Patrick Bousquet‐Melou
Ronan Bureau
Beatrice Charton
Bertrand C. Cirou
Gérard Gil
William J. Allen
Suman Sirimulla
Stanley Watowich
Nick Antonopoulos
Nikolaos Epitropakis
Agamemnon Krasoulis
Vassilis Pitsikalis
Stavros Theodorakis
Igor Kozlovskii
Anton Maliutin
Alexander Medvedev
Petr Popov
Mark Zaretckii
Hamid Eghbal‐Zadeh
Christina Halmich
Sepp Hochreiter
Andreas Mayr
Peter Ruch
Michael Widrich
Francois Berenger
Ashutosh Kumar
Yoshihiro Yamanishi
Kam Y. J. Zhang
Moksh J. Jain
Cheng-Hao Liu
Gilles Marcou
M. Gilles
Enrico Glaab
Kelly Barnsley
Suhasini M. Iyengar
Mary Jo Ondrechen
V. Joachim Haupt
Florian Kaiser
Michael Schroeder
Luisa Pugliese
Simone Albani
Christina Athanasiou
Andrea Beccari
Paolo Carloni
Giulia D'Arrigo
Eleonora Gianquinto
Jonas Goßen
Anton Hanke
Benjamin P. Joseph
Daria B. Kokh
Sandra Kovachka
Candida Manelfi
Goutam Mukherjee
Abraham Muñiz‐Chicharro
Francesco Musiani
Ariane Nunes‐Alves
Giulia Paiardi
Giulia Rossetti
S. Kashif Sadiq
Francesca Spyrakis
Carmine Talarico
Alexandros Tsengenes
Rebecca C. Wade
Conner Copeland
Jeremiah Gaiser
Daniel R. Olson
Amitava Roy
Vishwesh Venkatraman
Travis J. Wheeler
Haribabu Arthanari
Klara Blaschitz
Marco Cespugli
Vedat Durmaz
Konstantin Fackeldey
Patrick D. Fischer
Christoph Gorgulla
Christian Gruber
Karl Gruber
Michael Hetmann
Jamie E. Kinney
Krishna M. Padmanabha Das
Shreya Pandita
Amit Singh
Georg Steinkellner
Guilhem Tesseyre
Gerhard Wagner
Zi‐Fu Wang
Ryan J. Yust
Dmitry S. Druzhilovskiy
Dmitry A. Filimonov
Pavel V. Pogodin
Vladimir Poroikov
Anastassia V. Rudik
Leonid A. Stolbov
Alexander V. Veselovsky
Maria De Rosa
Giada De Simone
Maria R. Gulotta
Jessica Lombino
Nedra Mekni
Ugo Perricone
Arturo Casini
Amanda Embree
D. Benjamin Gordon
David Lei
Katelin Pratt
Christopher A. Voigt
Kuang‐Yu Chen
Yves Jacob
Tim Krischuns
Pierre Lafaye
Agnès Zettor
M. Luis Rodríguez
Kris M. White
Daren Fearon
Frank Von Delft
Martin A. Walsh
Dragos Horvath
Charles L. Brooks
Babak Falsafi
Bryan Ford
Adolfo García‐Sastre
Sang Yup Lee
Nadia Naffakh
Alexandre Varnek
Günter Klambauer
Thomas M. Hermans
The COVID-19 pandemic continues to pose a substantial threat to human lives and is likely to do so for years to come. Despite the availability of vaccines, searching for efficient small-molecule drugs that are widely available, including in low- and middle-income countries, is an ongoing challenge. In this work, we report the results of an open science community effort, the "Billion molecules against Covid-19 challenge", to identify small-molecule inhibitors against SARS-CoV-2 or relevant human receptors. Participating teams used a wide variety of computational methods to screen a minimum of 1 billion virtual molecules against 6 protein targets. Overall, 31 teams participated, and they suggested a total of 639,024 molecules, which were subsequently ranked to find 'consensus compounds'. The organizing team coordinated with various contract research organizations (CROs) and collaborating institutions to synthesize and test 878 compounds for biological activity against proteases (Nsp5, Nsp3, TMPRSS2), nucleocapsid N, RdRP (only the Nsp12 domain), and (alpha) spike protein S. Overall, 27 compounds with weak inhibition/binding were experimentally identified by binding-, cleavage-, and/or viral suppression assays and are presented here. Open science approaches such as the one presented here contribute to the knowledge base of future drug discovery efforts in finding better SARS-CoV-2 treatments.
SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data
Mélisande Teng
Amna Elmustafa
Benjamin Akera
Hager Radi Abdelwahed
Generative AI models should include detection mechanisms as a condition for public release
Alistair Knott
Dino Pedreschi
Raja Chatila
Tapabrata Chakraborti
Susan Leavy
Ricardo Baeza-Yates
David Eyers
Andrew Trotman
Paul D. Teal
Przemyslaw Biecek
Stuart Russell
The new wave of ‘foundation models’, general-purpose generative AI models for production of text (e.g., ChatGPT) or images (e.g., MidJourney), represents a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliable detection mechanism for the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required.
OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning
A key aspect of human intelligence is the ability to imagine -- composing learned concepts in novel ways -- to make sense of new scenarios. Such capacity is not yet attained for machine learning systems. In this work, in the context of visual reasoning, we show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination. Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language. We show that our modular architectural choices can be used to generate new training tasks that lead to better out-of-distribution generalization. We compare our model to existing and new baselines in a proposed visual reasoning benchmark that consists of applying arithmetic operations to MNIST digits.
Attention Schema in Neural Agents
Dianbo Liu
Samuele Bolotta
Mike He Zhu
Attention has become a common ingredient in deep learning architectures. It adds a dynamical selection of information on top of the static selection of information supported by weights. In the same way, we can imagine a higher-order informational filter built on top of attention: an Attention Schema (AS), namely, a descriptive and predictive model of attention. In cognitive neuroscience, Attention Schema Theory (AST) supports this idea of distinguishing attention from AS. A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents. As such, multi-agent reinforcement learning would be an ideal setting to experimentally test the validity of AST. We explore different ways in which attention and AS interact with each other. Our preliminary results indicate that agents that implement the AS as a recurrent internal control achieve the best performance. In general, these exploratory experiments suggest that equipping artificial agents with a model of attention can enhance their social intelligence.
Baking Symmetry into GFlowNets
GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphic states. This lack of symmetry increases the amount of samples required for training GFlowNets and can result in inefficient and potentially incorrect flow functions. As a consequence, the reward and diversity of the generated objects decrease. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results using synthetic data demonstrate the promising performance of our proposed approaches.
Causal Discovery in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems
Trang Nguyen
Dianbo Liu
Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.
Crystal-GFN: sampling materials with desirable properties and constraints
Discrete, compositional, and symbolic representations through attractor dynamics
Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.
On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions
The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.
Towards equilibrium molecular conformation generation with GFlowNets
Cheng-Hao Liu
Santiago Miret
Luca Thiede
Alán Aspuru-Guzik
Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.
Managing extreme AI risks amid rapid progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Yuval Noah Harari
Trevor Darrell
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
David Krueger
Anca Dragan
Philip Torr
Stuart Russell
Daniel Kahneman
Jan Brauner
Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how exactly such risks arise, and how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems. In this short consensus paper, we describe extreme risks from upcoming, advanced AI systems. Drawing on lessons learned from other safety-critical technologies, we then outline a comprehensive plan combining technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation.