Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Scientific Director, Leadership Team
Observer, Board of Directors, Mila


*For media requests, please write to medias@mila.quebec.

For more information, contact Julie Mongeau, executive assistant, at julie.mongeau@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, and the founder and scientific director of Mila – Quebec Artificial Intelligence Institute. As a senior fellow, he co-directs the Learning in Machines & Brains program of CIFAR (Canadian Institute for Advanced Research). He also serves as scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has had the highest h-index of any computer scientist in the world. He is a fellow of both the Royal Society of London and the Royal Society of Canada, and an officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Research Intern - Université du Québec à Rimouski
Professional Master's - UdeM
Independent visiting researcher
Co-supervisor:
Independent visiting researcher - UQAR
Research Intern - UQAR
Independent visiting researcher - MIT
Postdoctorate - UdeM
Co-supervisor:
Professional Master's - UdeM
Collaborating researcher - Université Paris-Saclay
Principal supervisor:
PhD - Massachusetts Institute of Technology
PhD - Barcelona University
Professional Master's - UdeM
Professional Master's - UdeM
Collaborating researcher
Independent visiting researcher - Technical University Munich (TUM)
Collaborating researcher - UdeM
Collaborating Alumni
Professional Master's - UdeM
PhD - UdeM
Principal supervisor:
Research Intern - Imperial College London
Research Intern - UdeM
Collaborating researcher - UdeM
PhD - UdeM
Principal supervisor:
Professional Master's - UdeM
Independent visiting researcher - UdeM
Independent visiting researcher - Hong Kong University of Science and Technology (HKUST)
Collaborating researcher - Ying Wu Coll of Computing
Professional Master's - UdeM
PhD - Max-Planck-Institute for Intelligent Systems
Professional Master's - UdeM
Independent visiting researcher - UdeM
Independent visiting researcher - UdeM
PhD - UdeM
Principal supervisor:
Collaborating researcher
Principal supervisor:
Research Master's - UdeM
Professional Master's - UdeM
Independent visiting researcher - Technical University of Munich
PhD - École Polytechnique Fédérale de Lausanne
Collaborating researcher
Principal supervisor:
Collaborating researcher - Valence
Principal supervisor:
Collaborating researcher - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)
Principal supervisor:
Professional Master's - UdeM
Collaborating Alumni - UdeM
PhD - UdeM
Principal supervisor:
PhD - McGill
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - McGill
Principal supervisor:


Managing AI Risks in an Era of Rapid Progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Yuval Noah Harari
Trevor Darrell
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian K. Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
Anca Dragan
Philip Torr
Stuart Russell
Daniel Kahneman
Jan Brauner
Sören Mindermann
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Tara Akhound-Sadegh
Jarrid Rector-Brooks
Joey Bose
Sarthak Mittal
Pablo Lemos
Cheng-Hao Liu
Marcin Sendera
Nikolay Malkin
Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient (and no data samples) to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective is *simulation-free* and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant
Learning to Scale Logits for Temperature-Conditional GFlowNets
Minsu Kim
Joohwan Ko
Dinghuai Zhang
Ling Pan
Taeyoung Yun
Woo Chang Kim
Jinkyoo Park
Emmanuel Bengio
GFlowNets are probabilistic models that sequentially generate compositional structures through a stochastic policy. Among GFlowNets, temperature-conditional GFlowNets can introduce temperature-based controllability for exploration and exploitation. We propose *Logit-scaling GFlowNets* (Logit-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional GFlowNets. It is based on the idea that previously proposed approaches introduced numerical challenges in deep network training, since different temperatures may give rise to very different gradient profiles as well as magnitudes of the policy's logits. We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly. Logit-GFN also improves the generalization of GFlowNets in offline learning and their mode discovery in online learning, which is empirically verified on various biological and chemical tasks. Our code is available at https://github.com/dbsxodud-11/logit-gfn
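The logit-scaling idea in this abstract can be illustrated with a minimal sketch. It assumes a policy whose logits do not depend on temperature, plus a scalar scaling function of the temperature. The function `logit_scale` and its parameters `w` and `b` are hypothetical stand-ins for the learned function mentioned in the abstract; they are not taken from the paper or its repository.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_scale(temperature, w=1.0, b=0.0):
    """Hypothetical stand-in for the learned scaling function.
    With w=1 and b=0 it reduces to 1 / temperature."""
    return math.exp(-w * math.log(temperature) + b)

def temperature_conditional_policy(base_logits, temperature, w=1.0, b=0.0):
    """Scale temperature-independent logits, then normalize into a policy."""
    scale = logit_scale(temperature, w, b)
    return softmax([scale * z for z in base_logits])

base_logits = [2.0, 1.0, 0.0]  # illustrative logits for three actions
cold = temperature_conditional_policy(base_logits, temperature=0.5)
hot = temperature_conditional_policy(base_logits, temperature=2.0)
```

In this sketch a low temperature sharpens the policy toward the highest-logit action and a high temperature flattens it, while the base logits are computed once and shared across temperatures.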
Memory Efficient Neural Processes via Constant Memory Attention Block
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Mohamed Osama Ahmed
Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty. Recent state-of-the-art methods, however, leverage expensive attention mechanisms, limiting their applications, particularly in low-resource settings. In this work, we propose Constant Memory Attention Block (CMAB), a novel general-purpose attention block that (1) is permutation invariant, (2) computes its output in constant memory, and (3) performs updates in constant computation. Building on CMAB, we propose Constant Memory Attentive Neural Processes (CMANPs), an NP variant which only requires constant memory. Empirically, we show CMANPs achieve state-of-the-art results on popular NP benchmarks (meta-regression and image completion) while being significantly more memory efficient than prior methods.
Discrete Probabilistic Inference as Control in Multi-path Environments
Tristan Deleu
Padideh Nouri
Nikolay Malkin
We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward. While we could use maximum entropy Reinforcement Learning (MaxEnt RL) to solve this problem for some distributions, it has been shown that in general, the distribution over states induced by the optimal policy may be biased in cases where there are multiple ways to generate the same object. To address this issue, Generative Flow Networks (GFlowNets) learn a stochastic policy that samples objects proportionally to their reward by approximately enforcing a conservation of flows across the whole Markov Decision Process (MDP). In this paper, we extend recent methods correcting the reward in order to guarantee that the marginal distribution induced by the optimal MaxEnt RL policy is proportional to the original reward, regardless of the structure of the underlying MDP. We also prove that some flow-matching objectives found in the GFlowNet literature are in fact equivalent to well-established MaxEnt RL algorithms with a corrected reward. Finally, we study empirically the performance of multiple MaxEnt RL and GFlowNet algorithms on multiple problems involving sampling from discrete distributions.
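The multi-path bias described in this abstract can be seen in a toy worked example. The sketch assumes deterministic transitions and that trajectories are sampled in proportion to the terminal reward, so an object reachable by two trajectories receives twice the mass. The correction shown, which weights each trajectory by a product of uniform backward probabilities, is one illustrative choice and is not claimed to be the paper's exact formulation. The DAG, state names, and rewards are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy DAG: every trajectory starts at s0 and ends in an object.
TRAJECTORIES = [
    ["s0", "a", "X"],
    ["s0", "b", "X"],
    ["s0", "c", "Y"],
]
REWARD = {"X": 1.0, "Y": 1.0}

def parents(state):
    """Set of states with an edge into `state` in the toy DAG."""
    return {t[i] for t in TRAJECTORIES
            for i in range(len(t) - 1) if t[i + 1] == state}

def uniform_backward_weight(trajectory):
    """Product of uniform backward probabilities along the trajectory."""
    weight = 1.0
    for child in trajectory[1:]:
        weight /= len(parents(child))
    return weight

def object_marginal(corrected):
    """Marginal over objects when trajectories are sampled proportionally
    to the reward, optionally multiplied by the backward weight."""
    mass = defaultdict(float)
    for traj in TRAJECTORIES:
        w = REWARD[traj[-1]]
        if corrected:
            w *= uniform_backward_weight(traj)
        mass[traj[-1]] += w
    total = sum(mass.values())
    return {obj: m / total for obj, m in mass.items()}

biased = object_marginal(corrected=False)    # X is over-sampled
unbiased = object_marginal(corrected=True)   # proportional to REWARD
```

With equal rewards, the uncorrected marginal gives X two thirds of the mass because two trajectories lead to it, while the corrected marginal gives each object one half.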
Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Usman Anwar
Abulhair Saparov
Javier Rando
Daniel Paleka
Miles Turpin
Peter Hase
Ekdeep Singh Lubana
Erik Jenner
Stephen Casper
Oliver Sourbut
Benjamin L. Edelman
Zhaowei Zhang
Mario Gunther
Anton Korinek
Jose Hernandez-Orallo
Lewis Hammond
Eric J Bigelow
Alexander Pan
Lauro Langosco
Tomasz Korbak
Heidi Zhang
Ruiqi Zhong
Seán Ó hÉigeartaigh
Gabriel Recchia
Giulio Corsi
Alan Chan
Markus Anderljung
Lilian Edwards
Danqi Chen
Samuel Albanie
Jakob Nicolaus Foerster
Florian Tramèr
He He
Atoosa Kasirzadeh
Yejin Choi
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose
Government Interventions to Avert Future Catastrophic AI Risks
Regulating advanced artificial agents
Michael K. Cohen
Noam Kolt
Gillian K. Hadfield
Stuart Russell
Language Models Can Reduce Asymmetry in Information Markets
Nasim Rahaman
Martin Weiss
Manuel Wüthrich
Erran L. Li
Bernhard Schölkopf
Ant Colony Sampling with GFlowNets for Combinatorial Optimization
Minsu Kim
Sanghyeok Choi
Jiwoo Son
Hyeon-Seob Kim
Jinkyoo Park
Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport
Alexander Tong
Nikolay Malkin
Guillaume Huguet
Yanlei Zhang
Jarrid Rector-Brooks
Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce the generalized *conditional flow matching* (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference.
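The regression objective mentioned in this abstract can be sketched for a single training pair. The sketch assumes the simplest noise-free variant: a straight-line path between a source sample `x0` and a data sample `x1`, with the constant velocity `x1 - x0` as the regression target. The function names, the sample values, and the `oracle` model are hypothetical. The minibatch optimal transport pairing of OT-CFM is not shown.

```python
import random

def cfm_training_pair(x0, x1, t):
    """Point on the straight-line path from x0 to x1 at time t,
    together with the velocity of that path."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    return x_t, target

def cfm_loss(predict_velocity, x0, x1, t):
    """Squared error between a model's velocity and the conditional target."""
    x_t, target = cfm_training_pair(x0, x1, t)
    predicted = predict_velocity(x_t, t)
    return sum((p - u) ** 2 for p, u in zip(predicted, target))

rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(2)]  # source sample (need not be Gaussian)
x1 = [3.0, -1.0]                              # hypothetical data sample
t = rng.random()                              # time drawn uniformly from [0, 1)

def oracle(x_t, t):
    """Hypothetical model that is exact for this one pair."""
    return [b - a for a, b in zip(x0, x1)]

loss = cfm_loss(oracle, x0, x1, t)
```

No ODE is simulated to form the loss: each training step needs only a sampled pair, a sampled time, and one evaluation of the velocity model, which is what makes the objective simulation-free.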
Integrating Generative and Experimental Platforms for Biomolecular Design
Cheng-Hao Liu
Jarrid Rector-Brooks
Jason Yim
Soojung Yang
Sidney Lisanza
Francesca-Zhoufan Li
Pranam Chatterjee
Tommi Jaakkola
Regina Barzilay
David Baker
Frances H. Arnold