Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information please contact Marie-Josée Beauchamp, Administrative Assistant at marie-josee.beauchamp@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as special advisor and founding scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize, and in 2022 he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Jamal Abou Haibeh

Collaborating Alumni - McGill University

Mohammed Abukalam

Collaborating Alumni - Université de Montréal

Berkes Anaïs

Collaborating researcher - Cambridge University

Principal supervisor :

Rim Assouel

PhD - Université de Montréal

Stefan Bauer

Independent visiting researcher

Co-supervisor :

Guillaume Lajoie

Paul Bertin

PhD - Université de Montréal

Shahana Chatterjee

Collaborating researcher - N/A

Principal supervisor :

David Rolnick

Xiaoyin Chen

PhD - Université de Montréal

Sanghyeok Choi

Collaborating researcher - KAIST

PhD - Université de Montréal

PhD - Université de Montréal

Research Intern - Université de Montréal

Co-supervisor :

Loubna Benabbou

Eric Elmoznino

PhD - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Jean-Pierre Falet

PhD - Université de Montréal

Co-supervisor :

Leo Feng

PhD - Université de Montréal

leo.feng@mila.quebec

Ivan Grega

Research Intern - Université de Montréal

PhD

PhD - Université de Montréal

mohsin.hasan@mila.quebec

Edward Hu

PhD - Université de Montréal

Moksh Jain

PhD - Université de Montréal

moksh.jain@mila.quebec

PhD - Université de Montréal

Principal supervisor :

Minsu Kim

Research Intern - Université de Montréal

Hyeonah Kim

Postdoctorate - Université de Montréal

Principal supervisor :

Alex Hernandez

Yaroslav Kivva

Collaborating researcher - Université de Montréal

Salem Lahlou

Collaborating Alumni - Université de Montréal

Tabitha Edith Lee

Postdoctorate - Université de Montréal

Principal supervisor :

Seanie Lee

Collaborating Alumni - Université de Montréal

Collaborating Alumni

Zhen Liu

Collaborating Alumni - Université de Montréal

Principal supervisor :

Liam Paull

Kanika Madan

PhD - Université de Montréal

Nikolay Malkin

Collaborating Alumni - Université de Montréal

Cristian Dragos Manta

PhD - Université de Montréal

Co-supervisor :

Dhanya Sridhar

Sören Mindermann

Collaborating researcher - Université de Montréal

Sarthak Mittal

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

Postdoctorate - Université de Montréal

Principal supervisor :

Independent visiting researcher - Université de Montréal

Padideh Nouri

PhD - Université de Montréal

Principal supervisor :

Ali Parviz

Collaborating researcher - Ying Wu Coll of Computing

Lena Podina

PhD - University of Waterloo

Principal supervisor :

Collaborating Alumni - Max-Planck-Institute for Intelligent Systems

Jarrid Rector-Brooks

PhD - Université de Montréal

Danyal Rehman

Postdoctorate - Université de Montréal

James Requeima

Independent visiting researcher - Université de Montréal

Oli Richardson

Postdoctorate - Université de Montréal

Camille Rochefort-Boulanger

PhD - Université de Montréal

Principal supervisor :

Julie Hussin

Victor Schmidt

Collaborating Alumni - Université de Montréal

Postdoctorate - Université de Montréal

Master's Research - Université de Montréal

Marcin Sendera

Collaborating Alumni - Université de Montréal

Vedant Shah

Master's Research - Université de Montréal

Postdoctorate

Marco Stock

Independent visiting researcher - Technical University of Munich

marco.stock@tum.de

Mélisande Astrid Crystal Teng

PhD - Université de Montréal

Co-supervisor :

Hugo Larochelle

alexander.tong@mila.quebec

Alex Tong

Postdoctorate - Université de Montréal

Postdoctorate - Université de Montréal

Co-supervisor :

PhD - Université de Montréal

Principal supervisor :

Collaborating researcher - Université de Montréal

Omar G. Younis

Collaborating researcher

Collaborating researcher - KAIST

Nicole Zhang

PhD - McGill University

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Harry Zhao

PhD - McGill University

Principal supervisor :

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the Service of Reasoning & Model-Based ML

April 4, 2023

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

Yoshua Bengio

Publications

Were RNNs All We Needed?

Leo Feng

Frederick Tung

Mohamed Osama Ahmed

Hossein Hajimirsadeghi

The introduction of Transformers in 2017 reshaped the landscape of deep learning. Originally proposed for sequence modelling, Transformers have since achieved widespread success across various domains. However, the scalability limitations of Transformers - particularly with respect to sequence length - have sparked renewed interest in novel recurrent models that are parallelizable during training, offer comparable performance, and scale more effectively. In this work, we revisit sequence modelling from a historical perspective, focusing on Recurrent Neural Networks (RNNs), which dominated the field for two decades before the rise of Transformers. Specifically, we examine LSTMs (1997) and GRUs (2014). We demonstrate that by simplifying these models, we can derive minimal versions (minLSTMs and minGRUs) that (1) use fewer parameters than their traditional counterparts, (2) are fully parallelizable during training, and (3) achieve surprisingly competitive performance on a range of tasks, rivalling recent models including Transformers.

2024-10-02

ArXiv (preprint)

Laurence Perreault-Levasseur

Causal Discovery in Astrophysics: Unraveling Supermassive Black Hole and Galaxy Coevolution

Zehao Jin

Mario Pasquato

Benjamin L. Davis

Tristan Deleu

Yu Luo

Changhyun Cho

Pablo Lemos

Xi 熙 Kang 康

Andrea Maccio

Yashar Hezaveh

Correlation does not imply causation, but patterns of statistical association between variables can be exploited to infer a causal structure (even with purely observational data) with the burgeoning field of causal discovery. As a purely observational science, astrophysics has much to gain by exploiting these new methods. The supermassive black hole (SMBH)–galaxy interaction has long been constrained by observed scaling relations, that is, low-scatter correlations between variables such as SMBH mass and the central velocity dispersion of stars in a host galaxy's bulge. This study, using advanced causal discovery techniques and an up-to-date data set, reveals a causal link between galaxy properties and dynamically measured SMBH masses. We apply a score-based Bayesian framework to compute the exact conditional probabilities of every causal structure that could possibly describe our galaxy sample. With the exact posterior distribution, we determine the most likely causal structures and notice a probable causal reversal when separating galaxies by morphology. In elliptical galaxies, bulge properties (built from major mergers) tend to influence SMBH growth, while, in spiral galaxies, SMBHs are seen to affect host galaxy properties, potentially through feedback in gas-rich environments. For spiral galaxies, SMBHs progressively quench star formation, whereas, in elliptical galaxies, quenching is complete, and the causal connection has reversed. Our findings support theoretical models of hierarchical assembly of galaxies and active galactic nuclei feedback regulating galaxy evolution. Our study suggests the potentiality for further exploration of causal links in astrophysical and cosmological scaling relations, as well as any other observational science.

2024-10-01

ArXiv (preprint)

MAP: Model Merging with Amortized Pareto Front Using Limited Computation

Lu Li

Tianyu Zhang

Zhiqi Bu

Suyuchen Wang

Huan He

Jie Fu

Yonghui Wu

Jiang Bian

Yong Chen

2024-10-01

NeurIPS.cc/2024/Workshop/Federated_Learning (oral)

openreview.net

Amortizing intractable inference in diffusion models for vision, language, and control

Siddarth Venkatraman

Moksh J. Jain

Luca Scimeca

Minsu Kim

Marcin Sendera

Mohsin Hasan

Luke Rowe

Sarthak Mittal

Pablo Lemos

Emmanuel Bengio

Alexandre Adam

Jarrid Rector-Brooks

Glen Berseth

Nikolay Malkin

Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data,

2024-09-25

NeurIPS.cc/2024/Conference (poster)