Yoshua Bengio

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research
Scientific Director, Leadership Team
Observer, Board of Directors, Mila

Biography

*For media requests, please write to medias@mila.quebec.

For more information, please contact Julie Mongeau, executive assistant, at julie.mongeau@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific director of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize and in 2022, he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Research Intern - Université de Montréal
PhD - Université de Montréal
Research Intern - Université du Québec à Rimouski
Professional Master's - Université de Montréal
Independent visiting researcher
Co-supervisor :
Independent visiting researcher - UQAR
PhD - Université de Montréal
Independent visiting researcher - MIT
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
Professional Master's - Université de Montréal
Professional Master's - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating researcher - Université Paris-Saclay
Principal supervisor :
PhD - Université de Montréal
PhD - Massachusetts Institute of Technology
PhD - Université de Montréal
PhD - Université de Montréal
Professional Master's - Université de Montréal
Professional Master's - Université de Montréal
Professional Master's - Université de Montréal
Collaborating researcher
Postdoctorate - Université de Montréal
Co-supervisor :
Independent visiting researcher - Technical University Munich (TUM)
PhD - Université de Montréal
Research Intern - Université de Montréal
Master's Research - Université de Montréal
Co-supervisor :
Research Intern - Université de Montréal
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Postdoctorate - Université de Montréal
PhD - Université de Montréal
Collaborating Alumni
Research Intern - Université de Montréal
Professional Master's - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Research Intern - McGill University
Research Intern - Imperial College London
PhD - Université de Montréal
Research Intern - Université de Montréal
Collaborating Alumni - Université de Montréal
DESS - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
Postdoctorate - Université de Montréal
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
Professional Master's - Université de Montréal
Independent visiting researcher - Université de Montréal
Independent visiting researcher - Hong Kong University of Science and Technology (HKUST)
Collaborating researcher - Ying Wu College of Computing
Professional Master's - Université de Montréal
Undergraduate - Université de Montréal
PhD - Max-Planck-Institute for Intelligent Systems
Professional Master's - Université de Montréal
Independent visiting researcher - Université de Montréal
Independent visiting researcher - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
Collaborating researcher
Principal supervisor :
Postdoctorate - Université de Montréal
Master's Research - Université de Montréal
Research Intern - Université de Montréal
Master's Research - Université de Montréal
Professional Master's - Université de Montréal
Independent visiting researcher - Technical University of Munich
PhD - École Polytechnique Fédérale de Lausanne
PhD - Université de Montréal
Co-supervisor :
Collaborating researcher
Principal supervisor :
Postdoctorate - Université de Montréal
Collaborating researcher - Valence
Principal supervisor :
Postdoctorate - Université de Montréal
Co-supervisor :
Collaborating researcher - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)
Principal supervisor :
PhD - Université de Montréal
Professional Master's - Université de Montréal
Collaborating Alumni - Université de Montréal
Research Intern - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - McGill University
Principal supervisor :
PhD - McGill University
Principal supervisor :

Publications

Cycle Consistency Driven Object Discovery
Aniket Rajiv Didolkar
Anirudh Goyal
Developing deep learning models that effectively learn object-centric representations, akin to human cognition, remains a challenging task. Existing approaches facilitate object discovery by representing objects as fixed-size vectors, called “slots” or “object files”. While these approaches have shown promise in certain scenarios, they still exhibit certain limitations. First, they rely on architectural priors which can be unreliable and usually require meticulous engineering to identify the correct objects. Second, there has been a notable gap in investigating the practical utility of these representations in downstream tasks. To address the first limitation, we introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot. We formalize this constraint by introducing consistency objectives which are cyclic in nature. By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance. These enhancements consistently hold true across both synthetic and real-world scenes, underscoring the effectiveness and adaptability of the proposed approach. To tackle the second limitation, we apply the learned object-centric representations from the proposed method to two downstream reinforcement learning tasks, demonstrating considerable performance enhancements compared to conventional slot-based and monolithic representation learning methods. Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.
Delta-AI: Local objectives for amortized inference in sparse graphical models
Jean-Pierre R. Falet
Hae Beom Lee
Nikolay Malkin
Chen Sun
Dragos Secrieru
Dinghuai Zhang
We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call …
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization
Dinghuai Zhang
Ricky T. Q. Chen
Cheng-Hao Liu
Expected flow networks in stochastic environments and two-player zero-sum games
Marco Jiralerspong
Bilun Sun
Danilo Vucetic
Tianyu Zhang
Nikolay Malkin
Local Search GFlowNets
Minsu Kim
Taeyoung Yun
Emmanuel Bengio
Dinghuai Zhang
Sungsoo Ahn
Jinkyoo Park
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration of a wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-reward regions of the sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.
Object centric architectures enable efficient causal representation learning
Amin Mansouri
Jason Hartford
Yan Zhang
Causal representation learning has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that (1) the latent variables are represented as
PhyloGFN: Phylogenetic inference with generative flow networks
Ming Yang Zhou
Zichao Yan
Elliot Layne
Nikolay Malkin
Dinghuai Zhang
Moksh J. Jain
Pre-Training and Fine-Tuning Generative Flow Networks
Ling Pan
Moksh J. Jain
Kanika Madan
Generative Flow Networks (GFlowNets) are amortized samplers that learn stochastic policies to sequentially generate compositional objects from a given unnormalized reward distribution. They can generate diverse sets of high-reward objects, which is an important consideration in scientific discovery tasks. However, as they are typically trained from a given extrinsic reward function, it remains an important open challenge about how to leverage the power of pre-training and train GFlowNets in an unsupervised fashion for efficient adaptation to downstream tasks. Inspired by recent successes of unsupervised pre-training in various domains, we introduce a novel approach for reward-free pre-training of GFlowNets. By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet (OC-GFN) that learns to explore the candidate space. Specifically, OC-GFN learns to reach any targeted outcomes, akin to goal-conditioned policies in reinforcement learning. We show that the pre-trained OC-GFN model can allow for a direct extraction of a policy capable of sampling from any new reward functions in downstream tasks. Nonetheless, adapting OC-GFN on a downstream task-specific reward involves an intractable marginalization over possible outcomes. We propose a novel way to approximate this marginalization by learning an amortized predictor enabling efficient fine-tuning. Extensive experimental results validate the efficacy of our approach, demonstrating the effectiveness of pre-training the OC-GFN, and its ability to swiftly adapt to downstream tasks and discover modes more efficiently. This work may serve as a foundation for further exploration of pre-training strategies in the context of GFlowNets.
Tree Cross Attention
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Mohamed Osama Ahmed
Cross Attention is a popular method for retrieving information from a set of context tokens for making predictions. At inference time, for each prediction, Cross Attention scans the full set of
Simulation-Free Schrödinger Bridges via Score and Flow Matching
Alexander Tong
Nikolay Malkin
Kilian Fatras
Lazar Atanackovic
Yanlei Zhang
Guillaume Huguet
We present simulation-free score and flow matching ([SF]…
A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems
Alexandre AGM Duval
Simon V. Mathis
Chaitanya K. Joshi
Victor Schmidt
Santiago Miret
Fragkiskos D. Malliaros
Taco Cohen
Pietro Liò
Michael M. Bronstein
Improving Gradient-guided Nested Sampling for Posterior Inference
Pablo Lemos
Will Handley
Nikolay Malkin
We present a performant, general-purpose gradient-guided nested sampling algorithm, …