Yoshua Bengio

Biography

*For media requests, please write to medias@mila.quebec.

For more information, please contact Marie-Josée Beauchamp, Administrative Assistant, at marie-josee.beauchamp@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A. M. Turing Award, "the Nobel Prize of computing," with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, and the founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute. As a senior fellow, he co-directs the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as special advisor and founding scientific director of IVADO.

In 2018, he was the computer scientist who collected the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has had the highest h-index of any computer scientist in the world. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Alumni collaborator - McGill

Mohammed Abukalam

Alumni collaborator - UdeM

agassoussisalwane2@gmail.com

Salwane Agassoussi

UdeM

Berkes Anaïs

Research collaborator - Cambridge University

Principal supervisor:

Rim Assouel

PhD - UdeM

Ayoub Atanane

Alumni collaborator - Université du Québec à Rimouski

Stefan Bauer

Independent visiting researcher

Co-supervisor:

Guillaume Lajoie

Paul Bertin

PhD - UdeM

Ghait Boukachab

Alumni collaborator - UQAR

Shahana Chatterjee

Research collaborator - N/A

Principal supervisor:

PhD - UdeM

Research collaborator - KAIST

PhD - UdeM

PhD - UdeM

Alumni collaborator - UdeM

Eric Elmoznino

PhD - UdeM

Co-supervisor:

PhD - UdeM

PhD - UdeM

PhD - UdeM

Co-supervisor:

Leo Feng

PhD - UdeM

Research intern - UdeM

Ivan Grega

Research intern - UdeM

Pietro Greiner

PhD

Mohsin Hasan

PhD - UdeM

mohsin.hasan@mila.quebec

Edward Hu

PhD - UdeM

Moksh Jain

PhD - UdeM

moksh.jain@mila.quebec

Research Master's - UdeM

Co-supervisor:

Alumni collaborator - UdeM

Minsu Kim

Research intern - UdeM

Hyeonah Kim

Postdoctorate - UdeM

Principal supervisor:

Alex Hernandez

Yaroslav KIVVA

Research collaborator - UdeM

Michał Koziarski

Alumni collaborator - UdeM

Salem Lahlou

Alumni collaborator - UdeM

Tabitha Edith Lee

Postdoctorate - UdeM

Principal supervisor:

Seanie Lee

Alumni collaborator - UdeM

Alumni collaborator

Zhen Liu

Alumni collaborator - UdeM

Principal supervisor:

Liam Paull

Kanika Madan

PhD - UdeM

Mohammed Mahfoud

Alumni collaborator - UdeM

Nikolay Malkin

Alumni collaborator - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

Dhanya Sridhar

Sören Mindermann

Research collaborator - UdeM

Sarthak Mittal

PhD - UdeM

Principal supervisor:

PhD - UdeM

Principal supervisor:

Postdoctorate - UdeM

Principal supervisor:

Independent visiting researcher - UdeM

Padideh Nouri

PhD - UdeM

Principal supervisor:

Ali Parviz

Research collaborator - Ying Wu Coll of Computing

Lena Podina

PhD - University of Waterloo

Principal supervisor:

David Rolnick

Camille Rochefort-Boulanger

Nassim Rahaman

Alumni collaborator - Max-Planck-Institute for Intelligent Systems

PhD - UdeM

Postdoctorate - UdeM

Independent visiting researcher - UdeM

Postdoctorate - UdeM

PhD - UdeM

Principal supervisor:

Julie Hussin

Victor Schmidt

Alumni collaborator - UdeM

Postdoctorate - UdeM

Research Master's - UdeM

Marcin Sendera

Alumni collaborator - UdeM

Vedant Shah

Research Master's - UdeM

Postdoctorate

Marco Stock

Independent visiting researcher - Technical University of Munich

marco.stock@tum.de

Mélisande Astrid Crystal Teng

PhD - UdeM

Co-supervisor:

Research collaborator - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)

Principal supervisor:

David Rolnick

alexander.tong@mila.quebec

Alex Tong

Postdoctorate - UdeM

Postdoctorate - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Research collaborator - UdeM

Zichao Yan

Alumni collaborator - UdeM

Omar G. Younis

Research collaborator

Research collaborator - KAIST

PhD - UdeM

PhD - McGill

Principal supervisor:

PhD - UdeM

Principal supervisor:

Aaron Courville

Harry Zhao

PhD - McGill

Principal supervisor:

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the service of reasoning & model-based ML

April 4, 2023

by

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

by

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

by

Yoshua Bengio

Publications

Boundary Seeking GANs

(Rex) Devon Hjelm

Athul Jacob

Adam Trischler

Gerry Che

Kyunghyun Cho

2018-02-15

International Conference on Learning Representations (published)

dblp.uni-trier.de

Combining Model-based and Model-free RL via Multi-step Control Variates

Tong Che

Yuchen Lu

George Tucker

Surya Bhupatiraju

Shane Gu

Sergey Levine

2018-02-15

(published)

Learning Generative Models with Locally Disentangled Latent Factors

Brady Neal

Alex Lamb

Sherjil Ozair

One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks. For example, generating an image at a low resolution and then learning to refine that into a high resolution image often improves results substantially. Here we explore a novel strategy for decomposing generation for complicated objects in which we first generate latent variables which describe a subset of the observed variables, and then map from these latent variables to the observed space. We show that this allows us to achieve decoupled training of complicated generative models and present both theoretical and experimental results supporting the benefit of such an approach.

2018-02-15

(published)

Finding Flatter Minima with SGD

Stanisław Jastrzębski

Zac Kenton

Devansh Arpit

Nicolas Ballas

Asja Fischer

Amos Storkey

2018-02-12

International Conference on Learning Representations (published)

dblp.uni-trier.de

Graph Priors for Deep Neural Networks

Francis Dutil

Joseph Paul Cohen

Martin Weiss

Georgy Derevyanko

In this work we explore how gene-gene interaction graphs can be used as a prior for the representation of a model to construct features based on known interactions between genes. Most existing machine learning work on graphs focuses on building models when data is confined to a graph structure. In this work we focus on using the information from a graph to build better representations in our models. We use the percolate task, determining if a path exists across a grid for a set of node values, as a proxy for gene pathways. We create variants of the percolate task to explore where existing methods fail. We test the limits of existing methods in order to determine what can be improved when applying these methods to a real task. This leads us to propose new methods based on Graph Convolutional Networks (GCN) that use pooling and dropout to deal with noise in the graph prior.

2018-02-12

(published)

SGD Smooths the Sharpest Directions

Stanisław Jastrzębski

Zac Kenton

Nicolas Ballas

Asja Fischer

Amos Storkey

Stochastic gradient descent (SGD) is able to find regions that generalize well, even in drastically over-parametrized models such as deep neural networks. We observe that noise in SGD controls the spectral norm and conditioning of the Hessian throughout the training. We hypothesize the cause of this phenomenon is due to the dynamics of neurons saturating their non-linearity along the largest curvature directions, thus leading to improved conditioning.

2018-02-12

(published)

Extending the Framework of Equilibrium Propagation to General Dynamics

Benjamin Scellier

Anirudh Goyal

Jonathan Binas

Thomas Mesnard

2018-02-11

International Conference on Learning Representations (published)

A Deep Reinforcement Learning Chatbot (Short Version)

Iulian V. Serban

Chinnadhurai Sankar

Mathieu Germain

Saizheng Zhang

Zhouhan Lin

Sandeep Subramanian

Taesup Kim

Michael Pieper

Sarath Chandar

Nan Rosemary Ke

Sai Rajeswar

Alexandre De Brébisson

Jose Sotelo

Dendi Suhubdy

Vincent Michalski

Alexandre Nguyen

Joelle Pineau

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.

2018-01-20

ArXiv (preprint)

arxiv.org

A3T: Adversarially Augmented Adversarial Training

Akram Erraqabi

Aristide Baratin

Simon Lacoste-Julien

Recent research showed that deep neural networks are highly sensitive to so-called adversarial perturbations, which are tiny perturbations of the input data purposely designed to fool a machine learning classifier. Most classification models, including deep learning models, are highly vulnerable to adversarial attacks. In this work, we investigate a procedure to improve adversarial robustness of deep neural networks through enforcing representation invariance. The idea is to train the classifier jointly with a discriminator attached to one of its hidden layer and trained to filter the adversarial noise. We perform preliminary experiments to test the viability of the approach and to compare it to other standard adversarial training methods.

2018-01-12

ArXiv (preprint)

arxiv.org

Bayesian Model-Agnostic Meta-Learning

Taesup Kim

Jaesik Yoon

Ousmane Dia

Sungwoong Kim

Sungjin Ahn

Learning to infer Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experiment results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.

arxiv.org

BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers

Konrad Wagstyl

Claude Lepage

Karl Zilles

Sebastian Bludau

G. Cucurul

Alan C. Evans

Paul C Fletcher

Adriana Romero Soriano

Joseph Paul Cohen

Stéphanie Larocque

Thomas Funck

Katrin Amunts

Boundary Seeking GANs

(Rex) Devon Hjelm

Athul Jacob

Adam Trischler

Gerry Che

Kyunghyun Cho

Generative adversarial networks are a learning framework that rely on training a discriminator to estimate a measure of difference between a target and generated distributions. GANs, as normally formulated, rely on the generated samples being completely differentiable w.r.t. the generative parameters, and thus do not work for discrete data. We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator. The importance weights have a strong connection to the decision boundary of the discriminator, and we call our method boundary-seeking GANs (BGANs). We demonstrate the effectiveness of the proposed algorithm with discrete image and character-based natural language generation. In addition, the boundary-seeking objective extends to continuous data, which can be used to improve stability of training, and we demonstrate this on Celeba, Large-scale Scene Understanding (LSUN) bedrooms, and Imagenet without conditioning.

2018-01-01

ICLR.cc/2018/Conference (poster)