Yoshua Bengio

aasheesh.singh@mila.quebec

Biography

For media requests, please write to medias@mila.quebec.

For more information please contact Julie Mongeau, executive assistant at julie.mongeau@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific director of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Singh Aasheesh

Professional Master's

Research Intern - McGill University

jamal.abouhaibeh@mila.quebec

mohammed.abukalam@mila.quebec

Mohammed Abukalam

Research Intern - Université de Montréal

dan.assouline@mila.quebec

Rim Assouel

PhD - Université de Montréal

assouelr@mila.quebec

Dan Assouline

Collaborating Alumni

ayoub.atanane@mila.quebec

Ayoub Atanane

Research Intern - Université du Québec à Rimouski

Aayush Bajaj

Professional Master's - Université de Montréal

aayush.bajaj@mila.quebec

Stefan Bauer

Independent visiting researcher

Co-supervisor :

Guillaume Lajoie

stefan.bauer@mila.quebec

loubna.benabbou@mila.quebec

Loubna Benabbou

Independent visiting researcher - UQAR

ghait.boukachab@mila.quebec

Paul Bertin

PhD - Université de Montréal

bertinpa@mila.quebec

Ghait Boukachab

Research Intern - UQAR

oussama.boussif@mila.quebec

Oussama Boussif

PhD - Université de Montréal

Independent visiting researcher - MIT

andres.campero@mila.quebec

subhrajyoti.dasgupta@mila.quebec

Xiaoyin Chen

PhD - Université de Montréal

xiaoyin.chen@mila.quebec

Chen Chen

Postdoctorate - Université de Montréal

Co-supervisor :

Blake Richards

chen.sun@mila.quebec

Aman Dalmia

Professional Master's - Université de Montréal

aman.dalmia@mila.quebec

Subhrajyoti Dasgupta

Professional Master's - Université de Montréal

pierre-paul.de-breuck@mila.quebec

Pierre-Paul De Breuck

Collaborating Alumni - Université de Montréal

PhD - Université de Montréal

PhD - Université de Montréal

aniket.didolkar@mila.quebec

Collaborating researcher - Université Paris-Saclay

Principal supervisor :

alexandre.duval@mila.quebec

eric.elmoznino@mila.quebec

Eric Elmoznino

PhD - Université de Montréal

Co-supervisor :

Guillaume Lajoie

PhD - Université de Montréal

akram.erraqabi@mila.quebec

Katie Everett

PhD - Massachusetts Institute of Technology

katie-elizabeth.everett@mila.quebec

Léna Nehale Ezzine

PhD - Université de Montréal

lena-nehale.ezzine@mila.quebec

Jean-pierre Falet

PhD - Université de Montréal

Co-supervisor :

Guillaume Lajoie

jean-pierre.falet@mila.quebec

damiano.fornasiere@mila.quebec

Leo Feng

PhD - Université de Montréal

PhD - Barcelona University

Jerome Francis

Professional Master's - Université de Montréal

jerome.francis@mila.quebec

piotr.gainski@mila.quebec

Piotr Gainski

Research Intern - Université de Montréal

ahmad.ghawanmeh@mila.quebec

Ahmad Ghawanmeh

Professional Master's - Université de Montréal

Clemence Granade

Professional Master's - Université de Montréal

clemence.granade@mila.quebec

pietro.greiner@mila.quebec

Pietro Greiner

Collaborating researcher

Mohsin Hasan

PhD - Université de Montréal

mohsin.hasan@mila.quebec

Alex Hernandez-Garcia

Postdoctorate - Université de Montréal

Co-supervisor :

Leon Hetzel

Independent visiting researcher - Technical University Munich (TUM)

leon.hetzel@mila.quebec

thomas.jiralerspong@mila.quebec

Edward Hu

PhD - Université de Montréal

edward.hu@mila.quebec

Moksh Jain

PhD - Université de Montréal

moksh.jain@mila.quebec

Research Intern - Université de Montréal

jiangyan.ma@mila.quebec

Master's Research - Université de Montréal

Co-supervisor :

Doina Precup

Research Intern - Université de Montréal

younesse.kaddar@mila.quebec

michal.koziarski@mila.quebec

Minsu Kim

Collaborating researcher - Université de Montréal

minsu.kim@mila.quebec

PhD - Université de Montréal

korablym@mila.quebec

Michał Koziarski

Postdoctorate - Université de Montréal

Salem Lahlou

PhD - Université de Montréal

lahlosal@mila.quebec

Hae-Beom Lee

Collaborating Alumni

hae-beom.lee@mila.quebec

Seanie Lee

Research Intern - Université de Montréal

seanie.lee@mila.quebec

matthew.macdermott@mila.quebec

Mingze Li

Professional Master's - Université de Montréal

mingze2.li@mila.quebec

Collaborating Alumni

Zhen Liu

PhD - Université de Montréal

Principal supervisor :

Liam Paull

liuzhen@mila.quebec

Stephen Lu

Research Intern - McGill University

stephen.lu@mila.quebec

Research Intern - Imperial College London

PhD - Université de Montréal

madankan@mila.quebec

Mohammed Mahfoud

Research Intern - Université de Montréal

mohammed.mahfoud@mila.quebec

Nikolay Malkin

Collaborating Alumni - Université de Montréal

nikolay.malkin@mila.quebec

DESS - Université de Montréal

loic.mandine@mila.quebec

Cristian Dragos Manta

PhD - Université de Montréal

Co-supervisor :

Dhanya Sridhar

cristian-dragos.manta@mila.quebec

stefano.massaroli@mila.quebec

Stefano Massaroli

Postdoctorate - Université de Montréal

Cristian Meo

Collaborating Alumni

cristian.meo@mila.quebec

soren.mindermann@mila.quebec

Sören Mindermann

Collaborating researcher - Université de Montréal

hussein-mohamu.jama@mila.quebec

Sarthak Mittal

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

Mirco Ravanelli

Professional Master's - Université de Montréal

priya.nama@mila.quebec

Phong Nguyen

Independent visiting researcher - Université de Montréal

nguyenph@mila.quebec

Ling Pan

Independent visiting researcher - Hong Kong University of Science and Technology (HKUST)

ling.pan@mila.quebec

Ali Parviz

Collaborating researcher - Ying Wu Coll of Computing

ali.parviz@mila.quebec

yashaswi.pupneja@mila.quebec

Yashaswi Pupneja

Professional Master's - Université de Montréal

vincent.quirion@mila.quebec

Vincent Quirion

Undergraduate - Université de Montréal

Nassim Rahaman

PhD - Max-Planck-Institute for Intelligent Systems

rahamann@mila.quebec

Param Raval

Professional Master's - Université de Montréal

param.raval@mila.quebec

Jarrid Rector-Brooks

PhD - Université de Montréal

Co-supervisor :

Sarath Chandar Anbil Parthipan

jarrid.rector-brooks@mila.quebec

James Requeima

Independent visiting researcher - Université de Montréal

james.requeima@mila.quebec

Jessie Richter-Powell

Independent visiting researcher - Université de Montréal

jack.richter-powell@mila.quebec

Camille Rochefort-Boulanger

PhD - Université de Montréal

Principal supervisor :

Julie Hussin

rochefoc@mila.quebec

Theo Saulus

Collaborating researcher

Principal supervisor :

dragos.secrieru@mila.quebec

theo.saulus@mila.quebec

Victor Schmidt

PhD - Université de Montréal

Postdoctorate - Université de Montréal

luca.scimeca@mila.quebec

Master's Research - Université de Montréal

Marcin Sendera

Research Intern - Université de Montréal

marcin.sendera@mila.quebec

Vedant Shah

Master's Research - Université de Montréal

vedant.shah@mila.quebec

Zibo Shang

Professional Master's - Université de Montréal

zibo.shang@mila.quebec

Divya Sharma

Collaborating Alumni

divya.sharma@mila.quebec

Marco Stock

Independent visiting researcher - Technical University of Munich

marco.stock@mila.quebec

Mélisande Astrid Crystal Teng

Anja Surina

PhD - École Polytechnique Fédérale de Lausanne

anja.surina@mila.quebec

PhD - Université de Montréal

Co-supervisor :

Collaborating researcher

Principal supervisor :

basile.terver@mila.quebec

Alexander Tong

Postdoctorate - Université de Montréal

alexander.tong@mila.quebec

Prudencio Tossou

Collaborating researcher - Valence

Principal supervisor :

Dominique Beaini

prudencio.tossou@mila.quebec

Donna Vakalis

Postdoctorate - Université de Montréal

Co-supervisor :

donna.vakalis@mila.quebec

Viktor Todosijevic

Collaborating researcher - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)

Principal supervisor :

viktor.todosijevic@mila.quebec

alexandra.volokhova@mila.quebec

Sasha Volokhova

PhD - Université de Montréal

Yizhao Wang

Professional Master's - Université de Montréal

yizhao.wang@mila.quebec

Zichao Yan

Collaborating Alumni - Université de Montréal

yanzicha@mila.quebec

Elmimouni Zakaria

Research Intern - Université de Montréal

zakarya.elmimouni@mila.quebec

dinghuai.zhang@mila.quebec

Dinghuai Zhang

PhD - Université de Montréal

Principal supervisor :

Aaron Courville

Ruixiang Zhang

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

tianyu.zhang@mila.quebec

PhD - McGill University

Principal supervisor :

Mathieu Blanchette

xi.zhang@mila.quebec

Harry Zhao

PhD - McGill University

Principal supervisor :

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the Service of Reasoning & Model-Based ML

April 4, 2023

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

Yoshua Bengio

Publications

Managing AI Risks in an Era of Rapid Progress

Geoffrey Hinton

Andrew Yao

Dawn Song

Pieter Abbeel

Yuval Noah Harari

Trevor Darrell

Ya-Qin Zhang

Lan Xue

Shai Shalev-Shwartz

Gillian K. Hadfield

Jeff Clune

Tegan Maharaj

Frank Hutter

Atilim Güneş Baydin

Sheila McIlraith

Qiqi Gao

Ashwin Acharya

David Scott Krueger

Anca Dragan

Philip Torr

Stuart Russell

Daniel Kahneman

Jan Brauner

Sören Mindermann

2024-05-24

Science (published)

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

Tara Akhound-Sadegh

Jarrid Rector-Brooks

Joey Bose

Sarthak Mittal

Pablo Lemos

Cheng-Hao Liu

Marcin Sendera

Siamak Ravanbakhsh

Gauthier Gidel

Nikolay Malkin

Alexander Tong

Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient, and no data samples, to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant

2024-05-01

ICML.cc/2024/Conference (poster)

Learning to Scale Logits for Temperature-Conditional GFlowNets

Minsu Kim

Joohwan Ko

Dinghuai Zhang

Ling Pan

Taeyoung Yun

Woo Chang Kim

Jinkyoo Park

Emmanuel Bengio

GFlowNets are probabilistic models that sequentially generate compositional structures through a stochastic policy. Among GFlowNets, temperature-conditional GFlowNets can introduce temperature-based controllability for exploration and exploitation. We propose Logit-scaling GFlowNets (Logit-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional GFlowNets. It is based on the idea that previously proposed approaches introduced numerical challenges in the deep network training, since different temperatures may give rise to very different gradient profiles as well as magnitudes of the policy's logits. We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly. Also, using Logit-GFN, GFlowNets can be improved by having better generalization capabilities in offline learning and mode discovery capabilities in online learning, which is empirically verified in various biological and chemical tasks. Our code is available at https://github.com/dbsxodud-11/logit-gfn

2024-05-01

ICML.cc/2024/Conference (poster)
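The logit-scaling idea described in the Logit-GFN abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: `logit_scale` is a hypothetical two-parameter stand-in for the learned function of temperature, and with its default values it reduces to the familiar 1/T softmax temperature.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_scale(temperature, weight=1.0, bias=0.0):
    """Hypothetical stand-in for a learned function of temperature.

    Returns exp(bias) * temperature ** (-weight), which is always positive.
    With weight=1 and bias=0 this is exactly 1 / temperature.
    """
    return math.exp(bias - weight * math.log(temperature))


def tempered_policy(logits, temperature, weight=1.0, bias=0.0):
    """Scale the policy logits directly by a function of the temperature."""
    s = logit_scale(temperature, weight, bias)
    return softmax([s * x for x in logits])


logits = [1.0, 2.0, 3.0]
cold = tempered_policy(logits, temperature=0.5)  # sharper, more exploitative
hot = tempered_policy(logits, temperature=2.0)   # flatter, more exploratory
```

In a trained model, `weight` and `bias` would be replaced by a small network whose parameters are learned jointly with the policy; the fixed values here are assumptions made only so the sketch runs on its own.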

Memory Efficient Neural Processes via Constant Memory Attention Block

Leo Feng

Frederick Tung

Hossein Hajimirsadeghi

Mohamed Osama Ahmed

Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty. Recent state-of-the-art methods, however, leverage expensive attention mechanisms, limiting their applications, particularly in low-resource settings. In this work, we propose Constant Memory Attention Block (CMAB), a novel general-purpose attention block that (1) is permutation invariant, (2) computes its output in constant memory, and (3) performs updates in constant computation. Building on CMAB, we propose Constant Memory Attentive Neural Processes (CMANPs), an NP variant which only requires constant memory. Empirically, we show CMANPs achieve state-of-the-art results on popular NP benchmarks (meta-regression and image completion) while being significantly more memory efficient than prior methods.

2024-05-01

ICML.cc/2024/Conference (poster)

Discrete Probabilistic Inference as Control in Multi-path Environments

Tristan Deleu

Padideh Nouri

Nikolay Malkin

Doina Precup

We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward. While we could use maximum entropy Reinforcement Learning (MaxEnt RL) to solve this problem for some distributions, it has been shown that in general, the distribution over states induced by the optimal policy may be biased in cases where there are multiple ways to generate the same object. To address this issue, Generative Flow Networks (GFlowNets) learn a stochastic policy that samples objects proportionally to their reward by approximately enforcing a conservation of flows across the whole Markov Decision Process (MDP). In this paper, we extend recent methods correcting the reward in order to guarantee that the marginal distribution induced by the optimal MaxEnt RL policy is proportional to the original reward, regardless of the structure of the underlying MDP. We also prove that some flow-matching objectives found in the GFlowNet literature are in fact equivalent to well-established MaxEnt RL algorithms with a corrected reward. Finally, we study empirically the performance of multiple MaxEnt RL and GFlowNet algorithms on multiple problems involving sampling from discrete distributions.

2024-04-26

auai.org/UAI/2024/Conference (poster)

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Usman Anwar

Abulhair Saparov

Javier Rando

Daniel Paleka

Miles Turpin

Peter Hase

Ekdeep Singh Lubana

Erik Jenner

Stephen Casper

Oliver Sourbut

Benjamin L. Edelman

Zhaowei Zhang

Mario Gunther

Anton Korinek

Jose Hernandez-Orallo

Lewis Hammond

Eric J Bigelow

Alexander Pan

Lauro Langosco

Tomasz Korbak

Heidi Zhang

Ruiqi Zhong

Seán Ó hÉigeartaigh

Gabriel Recchia

Giulio Corsi

Alan Chan

Markus Anderljung

Lilian Edwards

Danqi Chen

Samuel Albanie

Tegan Maharaj

Jakob Nicolaus Foerster

Florian Tramèr

He He

Atoosa Kasirzadeh

Yejin Choi

David Scott Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose

2024-04-15

ArXiv (preprint)

Government Interventions to Avert Future Catastrophic AI Risks

2024-04-15

Special Issue 5: Grappling With the Generative AI Revolution (published)

Regulating advanced artificial agents

Michael K. Cohen

Noam Kolt

Gillian K. Hadfield

Stuart Russell

2024-04-05

Science (published)

Language Models Can Reduce Asymmetry in Information Markets

Nasim Rahaman

Martin Weiss

Manuel Wüthrich

Erran L. Li

Chris Pal

Bernhard Schölkopf

2024-03-21

ArXiv (preprint)

Ant Colony Sampling with GFlowNets for Combinatorial Optimization

Minsu Kim

Sanghyeok Choi

Jiwoo Son

Hyeon-Seob Kim

Jinkyoo Park

2024-03-11

ArXiv (preprint)

Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport

Alexander Tong

Nikolay Malkin

Guillaume Huguet

Yanlei Zhang

Jarrid Rector-Brooks

Kilian Fatras

Guy Wolf

Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce the generalized conditional flow matching (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference.

2024-03-11

TMLR (accepted)
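The simulation-free regression objective mentioned in the conditional flow matching abstract above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's code: it pairs source and target samples independently, uses straight-line paths with no added noise, and all function names are hypothetical. The OT-CFM variant would additionally re-pair each minibatch with an optimal transport plan, which is not shown here.

```python
import random


def sample_conditional_pair(x0, x1, t):
    """Point on the straight path from x0 to x1 at time t, plus its velocity."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]  # constant velocity along the path
    return x_t, u_t


def cfm_loss(vector_field, source_batch, target_batch, rng):
    """Mean squared error between a candidate field and the path velocities.

    No ODE is simulated: each term only needs one sampled time and one
    interpolated point, which is what makes the objective simulation-free.
    """
    total = 0.0
    for x0, x1 in zip(source_batch, target_batch):
        t = rng.random()
        x_t, u_t = sample_conditional_pair(x0, x1, t)
        v = vector_field(t, x_t)
        total += sum((vi - ui) ** 2 for vi, ui in zip(v, u_t))
    return total / len(source_batch)


rng = random.Random(0)
source = [[0.0, 0.0] for _ in range(8)]  # toy source: a point mass at the origin
target = [[1.0, 2.0] for _ in range(8)]  # toy target: a point mass at (1, 2)

perfect_loss = cfm_loss(lambda t, x: [1.0, 2.0], source, target, rng)
zero_field_loss = cfm_loss(lambda t, x: [0.0, 0.0], source, target, rng)
```

For this toy pair of point masses, the constant field (1, 2) matches every path velocity, so its loss is zero, while the zero field incurs the squared length of the displacement. In practice `vector_field` would be a neural network trained by gradient descent on this loss.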

Integrating Generative and Experimental Platforms for Biomolecular Design

Cheng-Hao Liu

Jarrid Rector-Brooks

Jason Yim

Soojung Yang

Sidney Lisanza

Francesca-Zhoufan Li

Pranam Chatterjee

Tommi Jaakkola

Regina Barzilay

David Baker

Frances H. Arnold

2024-03-08

ICLR.cc/2024/Workshop_Proposals (published)