Portrait de Yoshua Bengio

Yoshua Bengio

Membre académique principal

Chaire en IA Canada-CIFAR

Professeur titulaire, Université de Montréal, Département d'informatique et de recherche opérationnelle

Fondateur et Conseiller scientifique, Équipe de direction

Sujets de recherche

Apprentissage automatique médical

Apprentissage de représentations

Apprentissage par renforcement

Apprentissage profond

Causalité

Modèles génératifs

Modèles probabilistes

Modélisation moléculaire

Neurosciences computationnelles

Raisonnement

Réseaux de neurones en graphes

Réseaux de neurones récurrents

Théorie de l'apprentissage automatique

Traitement du langage naturel

Biographie

*Pour toute demande média, veuillez écrire à medias@mila.quebec.

Pour plus d’information, contactez Cassidy MacNeil, adjointe principale et responsable des opérations cassidy.macneil@mila.quebec.

Reconnu comme une sommité mondiale en intelligence artificielle, Yoshua Bengio s’est surtout distingué par son rôle de pionnier en apprentissage profond, ce qui lui a valu le prix A. M. Turing 2018, le « prix Nobel de l’informatique », avec Geoffrey Hinton et Yann LeCun. Il est professeur titulaire à l’Université de Montréal, fondateur et conseiller scientifique de Mila – Institut québécois d’intelligence artificielle, et codirige en tant que senior fellow le programme Apprentissage automatique, apprentissage biologique de l'Institut canadien de recherches avancées (CIFAR). Il occupe également la fonction de conseiller spécial et directeur scientifique fondateur d’IVADO.

En 2018, il a été l’informaticien qui a recueilli le plus grand nombre de nouvelles citations au monde. En 2019, il s’est vu décerner le prestigieux prix Killam. Depuis 2022, il détient le plus grand facteur d’impact (h-index) en informatique à l’échelle mondiale. Il est fellow de la Royal Society de Londres et de la Société royale du Canada, et officier de l’Ordre du Canada.

Soucieux des répercussions sociales de l’IA et de l’objectif que l’IA bénéficie à tous, il a contribué activement à la Déclaration de Montréal pour un développement responsable de l’intelligence artificielle.

Étudiants actuels

Jamal Abou Haibeh

Collaborateur·rice alumni - McGill

Collaborateur·rice de recherche - Cambridge University

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Visiteur de recherche indépendant

Co-superviseur⋅e :

Guillaume Lajoie

Shahana Chatterjee

Collaborateur·rice de recherche - N/A

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Collaborateur·rice de recherche - KAIST

Aniket Didolkar

Doctorat - UdeM

Abdessamad EL KABID

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Loubna Benabbou

Desmond Elliott

Visiteur de recherche indépendant

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Co-superviseur⋅e :

Guillaume Lajoie

Doctorat - UdeM

Jean-Pierre Falet

Doctorat - UdeM

Doctorat

Doctorat - UdeM

Doctorat - UdeM

Thomas Jiralerspong

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Guillaume Lajoie

Younesse Kaddar

Collaborateur·rice alumni - UdeM

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Alex Hernandez-Garcia

Tabitha Edith Lee

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni

Collaborateur·rice alumni - UdeM

Cristian Dragos Manta

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Guillaume Lajoie

Visiteur de recherche indépendant - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice de recherche - Ying Wu Coll of Computing

Collaborateur·rice de recherche - University of Waterloo

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni - Max-Planck-Institute for Intelligent Systems

Collaborateur·rice de recherche - UdeM

Co-superviseur⋅e :

Loubna Benabbou

Jarrid Rector-Brooks

Doctorat - UdeM

Postdoctorat - UdeM

Postdoctorat - UdeM

Camille Rochefort-Boulanger

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Dragos Secrieru

Collaborateur·rice alumni - UdeM

Postdoctorat

Co-superviseur⋅e :

Alex Hernandez-Garcia

Collaborateur·rice alumni - Polytechnique

Co-superviseur⋅e :

Pierre-Luc Bacon

Mélisande Astrid Crystal Teng

Doctorat - UdeM

Co-superviseur⋅e :

Hugo Larochelle

Collaborateur·rice de recherche

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni - UdeM

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Siddarth Venkatraman

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice de recherche

Collaborateur·rice de recherche - UdeM

Doctorat - UdeM

Doctorat - McGill

Superviseur⋅e principal⋅e :

Mathieu Blanchette

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Aaron Courville

Collaborateur·rice alumni - McGill

Superviseur⋅e principal⋅e :

Billets de blogue

Generic thumbnail for Mila Blog articles.

22 février 2024

Skipper : combiner l’abstraction spatiale et temporelle afin d’améliorer la généralisation

par

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Scaling in the service of reasoning & model-based ML

4 avril 2023

Mise à l’échelle au service du raisonnement et de l’apprentissage automatique basé sur un modèle

par

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

23 mars 2022

Une collaboration entre Mila et Relation Therapeutics pour découvrir in vitro de nouvelles associations médicamenteuses synergiques

par

Jake P. Taylor-King

Generative Flow Networks

15 mars 2022

Les réseaux de flot génératifs

par

Publications

Discrete, compositional, and symbolic representations through attractor dynamics

Andrew Nam

Chen Sun

Guillaume Lajoie

Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite ca… (voir plus)pacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.

2023-10-26

NeurIPS.cc/2023/Workshop/InfoCog (présentation orale)

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre Duval

Santiago Miret

Alex Hernandez-Garcia

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorpor… (voir plus)ate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g.~when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

Towards equilibrium molecular conformation generation with GFlowNets

Alexandra Volokhova

Michał Koziarski

Alex Hernandez-Garcia

Cheng-Hao Liu

Santiago Miret

Luca Thiede

Alán Aspuru-Guzik

Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this pa… (voir plus)per we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

Managing extreme AI risks amid rapid progress

Geoffrey Hinton

Andrew Yao

Dawn Song

Pieter Abbeel

Yuval Noah Harari

Trevor Darrell

Ya-Qin Zhang

Lan Xue

Shai Shalev-Shwartz

Gillian Hadfield

Jeff Clune

Frank Hutter

Atilim Güneş Baydin

Sheila McIlraith

Qiqi Gao

Ashwin Acharya

David Krueger

Anca Dragan … (voir 5 de plus)

Philip Torr

Stuart Russell

Daniel Kahneman

Jan Brauner

Sören Mindermann

Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can aut… (voir plus)onomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how exactly such risks arise, and how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems. In this short consensus paper, we describe extreme risks from upcoming, advanced AI systems. Drawing on lessons learned from other safety-critical technologies, we then outline a comprehensive plan combining technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation.

2023-10-25

Science (publié)

Causal machine learning for single-cell genomics

Alejandro Tejada-Lapuerta

Hananeh Aliee

Fabian J. Theis

2023-10-22

ArXiv (prépublication)

A cry for help: Early detection of brain injury in newborns

Samantha Latremouille

Arsenii Gorin

Junhao Wang

Uchenna Ekwochi

P. Ubuane

O. Kehinde

Muhammad A. Salisu

Datonye Briggs

2023-10-11

ArXiv (prépublication)

Crystal-GFN: sampling crystals with desirable properties and constraints

Alex Hernandez-Garcia

Alexandre AGM Duval

Alexandra Volokhova

Pierre Luc Carrier

Michał Koziarski

Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such … (voir plus)as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials, namely the space group, composition and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and structural hard constraints, as well as the use of any available predictive model of a desired physicochemical property as an objective function. To design stable materials, one must target the candidates with the lowest formation energy. Here, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench. The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.

2023-10-06

ArXiv (prépublication)

Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Trang Nguyen

Dianbo Liu

Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular proc… (voir plus)esses. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.

2023-10-04

ArXiv (prépublication)

Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks

Alexander Rubinstein

Armand Mihai Nicolicioiu

Damien Teney

Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where… (voir plus) a model may rely on erroneous, easy-to-learn, cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs). We discover that DPMs have the inherent capability to represent multiple visual cues independently, even when they are largely correlated in the training data. We leverage this characteristic to encourage model diversity and empirically show the efficacy of the approach with respect to several diversification objectives. We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.

2023-10-02

ArXiv (prépublication)

AI and Catastrophic Risk

2023-09-30

Journal of Democracy (publié)

RECOVER identifies synergistic drug combinations in vitro through sequential model optimization

Jarrid Rector-Brooks

Thomas Gaudelet

Andrew Anighoro

Torsten Gross

Francisco Martínez-Peña

Eileen L. Tang

M.S. Suraj

Cristian Regep

Jeremy B.R. Hayter

Maksym Korablyov

Nicholas Valiante

Almer Van Der Sloot

Mike Tyers

Charles E.S. Roberts

Michael M. Bronstein

Luke L. Lairson

Jake P. Taylor-King

2023-09-30

Cell Reports Methods (publié)

GEO-Bench: Toward Foundation Models for Earth Monitoring

Alexandre Lacoste

Nils Lehmann

Pau Rodríguez

Evan David Sherwin

Hannah Kerner

Björn Lütjens

Jeremy Irvin

David Dao

Hamed Alemohammad

Alexandre Drouin

Mehmet Gunturkun

Dava Newman

Stefano Ermon

Xiao Xiang Zhu

Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to subst… (voir plus)antial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.

2023-09-24

NeurIPS.cc/2023/Track/Datasets_and_Benchmarks (poster)