Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information, contact Marie-Josée Beauchamp, administrative assistant, at marie-josee.beauchamp@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, "the Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute, and co-directs, as a senior fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as special advisor and founding scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has held the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Alumni collaborator - McGill

Mohammed Abukalam

Alumni collaborator - UdeM

Berkes Anaïs

Research collaborator - Cambridge University

Principal supervisor:

Rim Assouel

PhD - UdeM

Junyeob BAEK

Independent visiting researcher - KAIST

Independent visiting researcher

Co-supervisor:

Guillaume Lajoie

Paul Bertin

PhD - UdeM

Shahana Chatterjee

Research collaborator - N/A

Principal supervisor:

PhD - UdeM

Research collaborator - KAIST

PhD - UdeM

PhD - UdeM

Research intern - UdeM

Co-supervisor:

Loubna Benabbou

Eric Elmoznino

PhD - UdeM

Co-supervisor:

PhD - UdeM

PhD - UdeM

Co-supervisor:

Leo Feng

PhD - UdeM

leo.feng@mila.quebec

Ivan Grega

Research intern - UdeM

PhD

PhD - UdeM

mohsin.hasan@mila.quebec

Edward Hu

PhD - UdeM

Moksh Jain

PhD - UdeM

moksh.jain@mila.quebec

PhD - UdeM

Principal supervisor:

Alumni collaborator - UdeM

Hyeonah Kim

Postdoctorate - UdeM

Principal supervisor:

Alex Hernandez

Yaroslav KIVVA

Research collaborator - UdeM

Salem Lahlou

Alumni collaborator - UdeM

Tabitha Edith Lee

Postdoctorate - UdeM

Principal supervisor:

Seanie Lee

Alumni collaborator - UdeM

Zhen Liu

Alumni collaborator - UdeM

Principal supervisor:

Liam Paull

Chenghao Liu

Alumni collaborator

PhD - UdeM

Alumni collaborator - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

Dhanya Sridhar

Sören Mindermann

Research collaborator - UdeM

Sarthak Mittal

PhD - UdeM

Principal supervisor:

PhD - UdeM

Principal supervisor:

Postdoctorate - UdeM

Principal supervisor:

Independent visiting researcher - UdeM

Padideh Nouri

PhD - UdeM

Principal supervisor:

Ali Parviz

Research collaborator - Ying Wu Coll of Computing

Camille Rochefort-Boulanger

Lena Podina

PhD - University of Waterloo

Principal supervisor:

Alumni collaborator - Max-Planck-Institute for Intelligent Systems

Amine RAZIG

Research intern - UdeM

Co-supervisor:

PhD - UdeM

Postdoctorate - UdeM

Independent visiting researcher - UdeM

Postdoctorate - UdeM

PhD - UdeM

Principal supervisor:

Julie Hussin

Victor Schmidt

Alumni collaborator - UdeM

Postdoctorate - UdeM

Research Master's - UdeM

Marcin Sendera

Alumni collaborator - UdeM

Vedant Shah

Research Master's - UdeM

Postdoctorate

Marco Stock

Independent visiting researcher - Technical University of Munich

marco.stock@tum.de

Mélisande Astrid Crystal Teng

PhD - UdeM

Co-supervisor:

Hugo Larochelle

alexander.tong@mila.quebec

Alex Tong

Postdoctorate - UdeM

Postdoctorate - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Research collaborator - UdeM

Omar G. Younis

Research collaborator

Research intern - UdeM

PhD - UdeM

PhD - McGill

Principal supervisor:

PhD - UdeM

Principal supervisor:

Aaron Courville

Harry Zhao

PhD - McGill

Principal supervisor:

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the service of reasoning & model-based ML

April 4, 2023

by

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

by

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

by

Yoshua Bengio

Publications

Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion

Chang Chen

Hany Hamed

Doojin Baek

Taegu Kang

Sungjin Ahn

2025-03-25

arXiv (preprint)

A scalable gene network model of regulatory dynamics in single cells

Paul Bertin

Joseph D Viviano

Alejandro Tejada-Lapuerta

Weixu Wang

Stefan Bauer

Fabian J. Theis

2025-03-25

arXiv (preprint)

Offline Model-Based Optimization: Comprehensive Review

Minsu Kim

Jiayao Gu

Ye Yuan

Taeyoung Yun

Zixuan Liu

Can Chen

2025-03-21

arXiv (preprint)

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Nikolay Malkin

2025-03-10

arXiv (preprint)

Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles

Luca Scimeca

Alexander Rubinstein

Damien Teney

Seong Joon Oh

Armand Mihai Nicolicioiu

Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose

2025-03-06

ICLR.cc/2025/Workshop/SCSL (published)

Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control

Thomas Jiralerspong

Berton Earnshaw

Jason Hartford

Luca Scimeca

Diffusion Probabilistic Models (DPMs) are powerful generative models that have achieved unparalleled success in a number of generative tasks. In this work, we aim to build inductive biases into the training and sampling of diffusion models to better accommodate the target distribution of the data to model. For topologically structured data, we devise a frequency-based noising operator to purposefully manipulate, and set, these inductive biases. We first show that appropriate manipulations of the noising forward process can lead DPMs to focus on particular aspects of the distribution to learn. We show that different datasets necessitate different inductive biases, and that appropriate frequency-based noise control induces increased generative performance compared to standard diffusion. Finally, we demonstrate the possibility of ignoring information at particular frequencies while learning. We show this in an image corruption and recovery task, where we train a DPM to recover the original target distribution after severe noise corruption.

2025-03-06

ICLR.cc/2025/Workshop/DeLTa (poster)

Laurence Perreault-Levasseur

Solving Bayesian inverse problems with diffusion priors and off-policy RL

Luca Scimeca

Siddarth Venkatraman

Moksh J. Jain

Minsu Kim

Marcin Sendera

Mohsin Hasan

Luke Rowe

Sarthak Mittal

Pablo Lemos

Emmanuel Bengio

Alexandre Adam

Jarrid Rector-Brooks

Yashar Hezaveh

Glen Berseth

Nikolay Malkin

This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems in vision and science. We use the objective alongside techniques such as off-policy backtracking exploration to improve training. Importantly, our results show that existing training-free diffusion posterior methods struggle to perform effective posterior inference in latent space due to inherent biases.

2025-03-06

ICLR.cc/2025/Workshop/DeLTa (poster)

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Ahmed Masry

Juan A. Rodriguez

Tianyu Zhang

Suyuchen Wang

Chao Wang

Aarash Feizi

Akshay Kalkunte Suresh

Abhay Puri

Xiangru Jian

Pierre-Andre Noel

Sathwik Tejaswi Madhusudhan

Enamul Hoque

Issam Hadj Laradji

David Vazquez

Perouz Taslakian

Spandana Gella

Sai Rajeswar

Aligning visual features with language embeddings is a key challenge in vision-language models (VLMs). The performance of such models hinges on having a good connector that maps visual features generated by a vision encoder to a shared embedding space with the LLM while preserving semantic similarity. Existing connectors, such as multilayer perceptrons (MLPs), often produce out-of-distribution or noisy inputs, leading to misalignment between the modalities. In this work, we propose a novel vision-text alignment method, AlignVLM, that maps visual features to a weighted average of LLM text embeddings. Our approach leverages the linguistic priors encoded by the LLM to ensure that visual features are mapped to regions of the space that the LLM can effectively interpret. AlignVLM is particularly effective for document understanding tasks, where scanned document images must be accurately mapped to their textual content. Our extensive experiments show that AlignVLM achieves state-of-the-art performance compared to prior alignment methods. We provide further analysis demonstrating improved vision-text feature alignment and robustness to noise.

2025-03-05

ICLR.cc/2025/Workshop/Re-Align (poster)

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Nikolay Malkin

2025-03-05

ICLR.cc/2025/Workshop/FPI (poster)