Yoshua Bengio

Biography

For media requests, please write to medias@mila.quebec.

For more information please contact Cassidy MacNeil, Senior Assistant and Operation Lead at cassidy.macneil@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as special advisor and founding scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Jamal Abou Haibeh

Collaborating Alumni - McGill University

Mohammed Abukalam

Collaborating Alumni - Université de Montréal

Berkes Anaïs

Collaborating researcher - Cambridge University

Principal supervisor :

Rim Assouel

PhD - Université de Montréal

Stefan Bauer

Independent visiting researcher

Co-supervisor :

Guillaume Lajoie

Paul Bertin

PhD - Université de Montréal

Joyce Chai

Independent visiting researcher

Principal supervisor :

Siva Reddy

Shahana Chatterjee

Collaborating researcher - N/A

Principal supervisor :

David Rolnick

Xiaoyin Chen

PhD - Université de Montréal

Sanghyeok Choi

Collaborating researcher - KAIST

Collaborating Alumni - Université de Montréal

PhD - Université de Montréal

Collaborating Alumni - Université de Montréal

Co-supervisor :

Loubna Benabbou

Desmond Elliott

Independent visiting researcher

Principal supervisor :

PhD - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Jean-Pierre Falet

PhD - Université de Montréal

Leo Feng

PhD - Université de Montréal

PhD

PhD - Université de Montréal

Edward Hu

PhD - Université de Montréal

Moksh Jain

PhD - Université de Montréal

PhD - Université de Montréal

Principal supervisor :

Collaborating Alumni - Université de Montréal

Hyeonah Kim

Postdoctorate - Université de Montréal

Principal supervisor :

Alex Hernandez-Garcia

Salem Lahlou

Collaborating Alumni - Université de Montréal

Tabitha Edith Lee

Postdoctorate - Université de Montréal

Principal supervisor :

Collaborating Alumni

Zhen Liu

Collaborating Alumni - Université de Montréal

Principal supervisor :

Liam Paull

Kanika Madan

PhD - Université de Montréal

Nikolay Malkin

Collaborating Alumni - Université de Montréal

Cristian Dragos Manta

PhD - Université de Montréal

Co-supervisor :

Dhanya Sridhar

Sarthak Mittal

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

Postdoctorate - Université de Montréal

Principal supervisor :

Independent visiting researcher - Université de Montréal

Padideh Nouri

PhD - Université de Montréal

Principal supervisor :

Ali Parviz

Collaborating researcher - Ying Wu College of Computing

Lena Podina

Collaborating researcher - University of Waterloo

Principal supervisor :

David Rolnick

Nassim Rahaman

Collaborating Alumni - Max-Planck-Institute for Intelligent Systems

Amine Razig

Collaborating researcher - Université de Montréal

Co-supervisor :

Loubna Benabbou

Jarrid Rector-Brooks

PhD - Université de Montréal

Danyal Rehman

Postdoctorate - Université de Montréal

James Requeima

Independent visiting researcher - Université de Montréal

Oli Richardson

Postdoctorate - Université de Montréal

Camille Rochefort-Boulanger

PhD - Université de Montréal

Principal supervisor :

Julie Hussin

Abhik Roychoudhury

Independent visiting researcher

Principal supervisor :

Siva Reddy

Luca Scimeca

Postdoctorate - Université de Montréal

Collaborating Alumni - Université de Montréal

Marcin Sendera

Collaborating Alumni - Université de Montréal

Divya Sharma

Postdoctorate

Co-supervisor :

Alex Hernandez-Garcia

Mélisande Astrid Crystal Teng

PhD - Université de Montréal

Co-supervisor :

Hugo Larochelle

Ivan Titov

Independent visiting researcher

Principal supervisor :

Siva Reddy

Alex Tong

Collaborating Alumni - Université de Montréal

Postdoctorate - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Principal supervisor :

Collaborating researcher

Collaborating researcher - Université de Montréal

Nicole Zhang

PhD - McGill University

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

Aaron Courville

Tianyu Zhang

PhD - Université de Montréal

Harry Zhao

Collaborating Alumni - McGill University

Principal supervisor :

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the Service of Reasoning & Model-Based ML

April 4, 2023

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

Yoshua Bengio

Laurence Perreault-Levasseur

Publications

Causal Discovery in Astrophysics: Unraveling Supermassive Black Hole and Galaxy Coevolution

Zehao Jin

Mario Pasquato

Benjamin L. Davis

Tristan Deleu

Yu Luo

Changhyun Cho

Pablo Lemos

Xi 熙 Kang 康

Andrea Maccio

Yashar Hezaveh

Correlation does not imply causation, but patterns of statistical association between variables can be exploited to infer a causal structure (even with purely observational data) with the burgeoning field of causal discovery. As a purely observational science, astrophysics has much to gain by exploiting these new methods. The supermassive black hole (SMBH)–galaxy interaction has long been constrained by observed scaling relations, which are low-scatter correlations between variables such as SMBH mass and the central velocity dispersion of stars in a host galaxy's bulge. This study, using advanced causal discovery techniques and an up-to-date data set, reveals a causal link between galaxy properties and dynamically measured SMBH masses. We apply a score-based Bayesian framework to compute the exact conditional probabilities of every causal structure that could possibly describe our galaxy sample. With the exact posterior distribution, we determine the most likely causal structures and notice a probable causal reversal when separating galaxies by morphology. In elliptical galaxies, bulge properties (built from major mergers) tend to influence SMBH growth, while, in spiral galaxies, SMBHs are seen to affect host galaxy properties, potentially through feedback in gas-rich environments. For spiral galaxies, SMBHs progressively quench star formation, whereas, in elliptical galaxies, quenching is complete, and the causal connection has reversed. Our findings support theoretical models of hierarchical assembly of galaxies and active galactic nuclei feedback regulating galaxy evolution. Our study suggests the potential for further exploration of causal links in astrophysical and cosmological scaling relations, as well as in any other observational science.

2024-10-01

ArXiv (preprint)

arxiv.org

MAP: Model Merging with Amortized Pareto Front Using Limited Computation

Li Li

Tianyu Zhang

Zhiqi Bu

Suyuchen Wang

Huan He

Jie Fu

Yonghui Wu

Jiang Bian

Yong Chen

2024-10-01

NeurIPS.cc/2024/Workshop/Federated_Learning (oral)

Amortizing intractable inference in diffusion models for vision, language, and control

Moksh J. Jain

Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data,

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Improved off-policy training of diffusion samplers

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at [this link](https://github.com/GFNOrg/gfn-diffusion) as a base for future work on diffusion models for amortized inference.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

Aniket Rajiv Didolkar

Anirudh Goyal

Nan Rosemary Ke

Siyuan Guo

Michal Valko

Timothy P Lillicrap

Danilo Jimenez Rezende

Michael Curtis Mozer

Sanjeev Arora

2024-09-25

NeurIPS.cc/2024/Conference (poster)

RGFN: Synthesizable Molecular Generation Using GFlowNets

Michał Koziarski

Andrei Rekesh

Dmytro Shevchuk

Almer M. van der Sloot

Piotr Gainski

Cheng-Hao Liu

Mike Tyers

Robert A. Batey

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Trajectory Flow Matching with Applications to Clinical Time Series Modelling

Xi Zhang

Yuan Pu

Yuki Kawamura

Andrew Loza

Dennis Shung

Alexander Tong

Modeling stochastic and irregularly sampled time series is a challenging problem found in a wide range of applications, especially in medicine. Neural stochastic differential equations (Neural SDEs) are an attractive modeling technique for this problem, which parameterize the drift and diffusion terms of an SDE with neural networks. However, current algorithms for training Neural SDEs require backpropagation through the SDE dynamics, greatly limiting their scalability and stability. To address this, we propose **Trajectory Flow Matching** (TFM), which trains a Neural SDE in a *simulation-free* manner, bypassing backpropagation through the dynamics. TFM leverages the flow matching technique from generative modeling to model time series. In this work we first establish necessary conditions for TFM to learn time series data. Next, we present a reparameterization trick which improves training stability. Finally, we adapt TFM to the clinical time series setting, demonstrating improved performance on three clinical time series datasets both in terms of absolute performance and uncertainty prediction.

2024-09-25

NeurIPS.cc/2024/Conference (spotlight)

A neuronal least-action principle for real-time learning in cortical circuits

Walter Senn

Dominik Dold

Akos F. Kungl

Benjamin Ellenberger

Jakob Jordan

João Sacramento

Mihai A. Petrovici

One of the most fundamental laws of physics is the principle of least action. Motivated by its predictive power, we introduce a neuronal least-action principle for cortical processing of sensory streams to produce appropriate behavioural outputs in real time. The principle postulates that the voltage dynamics of cortical pyramidal neurons prospectively minimizes the local somato-dendritic mismatch error within individual neurons. For output neurons, the principle implies minimizing an instantaneous behavioural error. For deep network neurons, it implies the prospective firing to overcome integration delays and correct for possible output errors right in time. The neuron-specific errors are extracted in the apical dendrites of pyramidal neurons through a cortical microcircuit that tries to explain away the feedback from the periphery, and correct the trajectory on the fly. Any motor output is in a moving equilibrium with the sensory input and the motor feedback during the ongoing sensory-motor transform. Online synaptic plasticity reduces the somato-dendritic mismatch error within each cortical neuron and performs gradient descent on the output cost at any moment in time. The neuronal least-action principle offers an axiomatic framework to derive local neuronal and synaptic laws for global real-time computation and learning in the brain.

2024-09-23

bioRxiv (preprint)

AI content detection in the emerging information ecosystem: new obligations for media and tech companies

Alistair Knott

Dino Pedreschi

Toshiya Jitsuzumi

Susan Leavy

D. Eyers

Tapabrata Chakraborti

Andrew Trotman

Sundar Sundareswaran

Ricardo Baeza-Yates

Przemyslaw Biecek

Adrian Weller

Paul D. Teal

Subhadip Basu

Mehmet Haklidir

Virginia Morini

Stuart Russell

2024-09-21

Ethics and Information Technology (published)
