Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information, contact Julie Mongeau, executive assistant, at julie.mongeau@mila.quebec.

Recognized worldwide as one of the leading experts in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A. M. Turing Award, "the Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, founder and scientific director of Mila – Quebec Artificial Intelligence Institute, and co-directs, as a senior fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Research Intern - McGill

Mohammed Abukalam

Research Intern - UdeM

Rim Assouel

PhD - UdeM

Dan Assouline

Alumni Collaborator

Ayoub Atanane

Research Intern - Université du Québec à Rimouski

Stefan Bauer

Independent Visiting Researcher

Co-supervisor:

Guillaume Lajoie

Paul Bertin

PhD - UdeM

Ghait Boukachab

Research Intern - UQAR

PhD - UdeM

Independent Visiting Researcher - MIT

Shahana Chatterjee

Research Collaborator - N/A

Principal supervisor:

Chen Chen

Postdoctorate - UdeM

Co-supervisor:

Blake Richards

Xiaoyin Chen

PhD - UdeM

Pierre-Paul De Breuck

Alumni Collaborator - UdeM

PhD - UdeM

PhD - UdeM

Research Collaborator - Université Paris-Saclay

Principal supervisor:

Eric Elmoznino

PhD - UdeM

Co-supervisor:

PhD - UdeM

PhD - Massachusetts Institute of Technology

Léna Nehale Ezzine

PhD - UdeM

Jean-Pierre Falet

PhD - UdeM

Co-supervisor:

Leo Feng

PhD - UdeM

Research Intern - Barcelona University

Piotr Gainski

Research Intern - UdeM

Ivan Grega

Research Collaborator - UdeM

Pietro Greiner

Research Intern

Mohsin Hasan

PhD - UdeM

mohsin.hasan@mila.quebec

Alex Hernandez-Garcia

Postdoctorate - UdeM

Co-supervisor:

Leon Hetzel

Independent Visiting Researcher - Technical University Munich (TUM)

Edward Hu

PhD - UdeM

Moksh Jain

PhD - UdeM

moksh.jain@mila.quebec

Research Intern - UdeM

Research Master's - UdeM

Co-supervisor:

Research Intern - UdeM

Minsu Kim

Research Collaborator - UdeM

PhD - UdeM

Postdoctorate - UdeM

PhD - UdeM

Alumni Collaborator

Seanie Lee

Alumni Collaborator - UdeM

Zhen Liu

PhD - UdeM

Principal supervisor:

Liam Paull

Chenghao Liu

Alumni Collaborator

Research Intern - Imperial College London

PhD - UdeM

Research Intern - UdeM

Nikolay Malkin

Alumni Collaborator - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

Postdoctorate - UdeM

Alumni Collaborator

Sören Mindermann

Research Collaborator - UdeM

Sarthak Mittal

PhD - UdeM

Principal supervisor:

PhD - UdeM

Principal supervisor:

Postdoctorate - UdeM

Principal supervisor:

Independent Visiting Researcher - UdeM

Ling Pan

Independent Visiting Researcher - Hong Kong University of Science and Technology (HKUST)

Ali Parviz

Research Collaborator - Ying Wu College of Computing

Lena Podina

PhD - University of Waterloo

Principal supervisor:

Nassim Rahaman

PhD - Max-Planck-Institute for Intelligent Systems

Jarrid Rector-Brooks

PhD - UdeM

Co-supervisor:

Sarath Chandar

Danyal Rehman

Postdoctorate - UdeM

James Requeima

Independent Visiting Researcher - UdeM

Postdoctorate - UdeM

Jessie Richter-Powell

Independent Visiting Researcher - UdeM

Camille Rochefort-Boulanger

PhD - UdeM

Principal supervisor:

Julie Hussin

agassoussisalwane2@gmail.com

Salwane Salwane

Research Intern - UdeM

Theo Saulus

Research Collaborator

Principal supervisor:

PhD - UdeM

Postdoctorate - UdeM

Research Master's - UdeM

Marcin Sendera

Research Intern - UdeM

Dounia Shaaban Kabakibo

Research Intern - UdeM

Vedant Shah

Research Master's - UdeM

Alumni Collaborator

Marco Stock

Independent Visiting Researcher - Technical University of Munich

marco.stock@tum.de

Anja Surina

PhD - École Polytechnique Fédérale de Lausanne

Vincent Taboga

Postdoctorate - Polytechnique

Co-supervisor:

Pierre-Luc Bacon

Mélisande Astrid Crystal Teng

PhD - UdeM

Co-supervisor:

Research Collaborator

Principal supervisor:

alexander.tong@mila.quebec

Alex Tong

Postdoctorate - UdeM

Research Collaborator - Valence

Principal supervisor:

Dominique Beaini

Donna Vakalis

Postdoctorate - UdeM

Co-supervisor:

Viktor Viktor Todosijevic

Research Collaborator - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)

Principal supervisor:

Sasha Volokhova

PhD - UdeM

Zichao Yan

Alumni Collaborator - UdeM

Kyle Yun

Research Collaborator - KAIST

Elmimouni Zakaria

Research Intern - UdeM

Nicole Zhang

PhD - McGill

Principal supervisor:

Mathieu Blanchette

Dinghuai Zhang

PhD - UdeM

Principal supervisor:

Aaron Courville

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

Ruixiang Zhang

PhD - UdeM

Principal supervisor:

PhD - UdeM

Harry Zhao

PhD - McGill

Principal supervisor:

Blog Posts

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the service of reasoning & model-based ML

April 4, 2023

by

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

by

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

by

Yoshua Bengio

Publications

SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

Hager Radi

Hugo Larochelle

Biodiversity is declining at an unprecedented rate, impacting ecosystem services necessary to ensure food, water, and human health and well-being. Understanding the distribution of species and their habitats is crucial for conservation policy planning. However, traditional methods in ecology for species distribution models (SDMs) generally focus either on narrow sets of species or narrow geographical areas and there remain significant knowledge gaps about the distribution of species. A major reason for this is the limited availability of data traditionally used, due to the prohibitive amount of effort and expertise required for traditional field monitoring. The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems. We introduce a novel task for mapping bird species to their habitats by predicting species encounter rates from satellite images, and present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird, considering summer (breeding) and winter seasons. We also provide a dataset in Kenya representing low-data regimes. We additionally provide environmental data and species range maps for each location. We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks. SatBird opens up possibilities for scalably modelling properties of ecosystems worldwide.

2023-11-02

arXiv (preprint)

Generative AI models should include detection mechanisms as a condition for public release

Alistair Knott

Dino Pedreschi

Raja Chatila

Tapabrata Chakraborti

Susan Leavy

Ricardo Baeza-Yates

D. Eyers

Andrew Trotman

Paul D. Teal

Przemyslaw Biecek

Stuart Russell

2023-10-28

Ethics and Information Technology (published)

OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning

Rim Assouel

Pau Rodriguez

Perouz Taslakian

David Vazquez

2023-10-28

arXiv (preprint)

Attention Schema in Neural Agents

Dianbo Liu

Samuele Bolotta

Mike He Zhu

Zahra Sheikhbahaee

Guillaume Dumas

Attention has become a common ingredient in deep learning architectures. It adds a dynamical selection of information on top of the static selection of information supported by weights. In the same way, we can imagine a higher-order informational filter built on top of attention: an Attention Schema (AS), namely, a descriptive and predictive model of attention. In cognitive neuroscience, Attention Schema Theory (AST) supports this idea of distinguishing attention from AS. A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents. As such, multi-agent reinforcement learning would be an ideal setting to experimentally test the validity of AST. We explore different ways in which attention and AS interact with each other. Our preliminary results indicate that agents that implement the AS as a recurrent internal control achieve the best performance. In general, these exploratory experiments suggest that equipping artificial agents with a model of attention can enhance their social intelligence.

2023-10-27

NeurIPS.cc/2023/Workshop/InfoCog (poster)

Baking Symmetry into GFlowNets

George Ma

Emmanuel Bengio

Dinghuai Zhang

GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphic states. This lack of symmetry increases the amount of samples required for training GFlowNets and can result in inefficient and potentially incorrect flow functions. As a consequence, the reward and diversity of the generated objects decrease. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results using synthetic data demonstrate the promising performance of our proposed approaches.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Science (oral presentation)

Causal Discovery in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Trang Nguyen

Alexander Tong

Kanika Madan

Dianbo Liu

Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.

2023-10-27

NeurIPS.cc/2023/Workshop/GenBio (poster)

Crystal-GFN: sampling materials with desirable properties and constraints

Mistal

Alex Hernandez-Garcia

Alexandra Volokhova

Alexandre AGM Duval

Divya Sharma

Pierre Luc Carrier

Michał Koziarski

Victor Schmidt

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (spotlight)

Discrete, compositional, and symbolic representations through attractor dynamics

Andrew Nam

Eric Elmoznino

Nikolay Malkin

Chen Sun

Guillaume Lajoie

Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.

2023-10-27

NeurIPS.cc/2023/Workshop/InfoCog (oral presentation)

Learning to Scale Logits for Temperature-Conditional GFlowNets

Minsu Kim

Joohwan Ko

Dinghuai Zhang

Ling Pan

Taeyoung Yun

Woo Chang Kim

Jinkyoo Park

GFlowNets are probabilistic models that learn a stochastic policy that sequentially generates compositional structures, such as molecular graphs. They are trained with the objective of sampling such objects with probability proportional to the object's reward. Among GFlowNets, the temperature-conditional GFlowNets represent a family of policies indexed by temperature, and each is associated with the correspondingly tempered reward function. The major benefit of temperature-conditional GFlowNets is the controllability of GFlowNets' exploration and exploitation through adjusting temperature. We propose Learning to Scale Logits for temperature-conditional GFlowNets (LSL-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional GFlowNets. It is based on the idea that previously proposed temperature-conditioning approaches introduced numerical challenges in the training of the deep network because different temperatures may give rise to very different gradient profiles and ideal scales of the policy's logits. We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly. We empirically show that our strategy dramatically improves the performances of GFlowNets, outperforming other baselines, including reinforcement learning and sampling methods, in terms of discovering diverse modes in multiple biochemical tasks.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Science (poster)

Multi-Fidelity Active Learning with GFlowNets

Alex Hernandez-Garcia

Nikita Saxena

Moksh J. Jain

Cheng-Hao Liu

In the last decades, the capacity to generate large amounts of data in science and engineering applications has been growing steadily. Meanwhile, the progress in machine learning has turned it into a suitable tool to process and utilise the available data. Nonetheless, many relevant scientific and engineering problems present challenges where current machine learning methods cannot yet efficiently leverage the available data and resources. For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high fidelity, black-box objective function is very expensive. Progress in machine learning methods that can efficiently tackle such problems would help accelerate currently crucial areas such as drug and materials discovery. In this paper, we propose the use of GFlowNets for multi-fidelity active learning, where multiple approximations of the black-box function are available at lower fidelity and cost. GFlowNets are recently proposed methods for amortised probabilistic inference that have proven efficient for exploring large, high-dimensional spaces and can hence be practical in the multi-fidelity setting too. Here, we describe our algorithm for multi-fidelity active learning with GFlowNets and evaluate its performance in both well-studied synthetic tasks and practically relevant applications of molecular discovery. Our results show that multi-fidelity active learning with GFlowNets can efficiently leverage the availability of multiple oracles with different costs and fidelities to accelerate scientific discovery and engineering design.

2023-10-27

NeurIPS.cc/2023/Workshop/ReALML (published)

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre AGM Duval

Victor Schmidt

Santiago Miret

Alex Hernandez-Garcia

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (poster)