Yoshua Bengio

Biographie

*Pour toute demande média, veuillez écrire à medias@mila.quebec.

Pour plus d’information, contactez Cassidy MacNeil, adjointe principale et responsable des opérations cassidy.macneil@mila.quebec.

Reconnu comme une sommité mondiale en intelligence artificielle, Yoshua Bengio s’est surtout distingué par son rôle de pionnier en apprentissage profond, ce qui lui a valu le prix A. M. Turing 2018, le « prix Nobel de l’informatique », avec Geoffrey Hinton et Yann LeCun. Il est professeur titulaire à l’Université de Montréal, fondateur et conseiller scientifique de Mila – Institut québécois d’intelligence artificielle, et codirige en tant que senior fellow le programme Apprentissage automatique, apprentissage biologique de l'Institut canadien de recherches avancées (CIFAR). Il occupe également la fonction de conseiller spécial et directeur scientifique fondateur d’IVADO.

En 2018, il a été l’informaticien qui a recueilli le plus grand nombre de nouvelles citations au monde. En 2019, il s’est vu décerner le prestigieux prix Killam. Depuis 2022, il détient le plus grand facteur d’impact (h-index) en informatique à l’échelle mondiale. Il est fellow de la Royal Society de Londres et de la Société royale du Canada, et officier de l’Ordre du Canada.

Soucieux des répercussions sociales de l’IA et de l’objectif que l’IA bénéficie à tous, il a contribué activement à la Déclaration de Montréal pour un développement responsable de l’intelligence artificielle.

Étudiants actuels

Jamal Abou Haibeh

Collaborateur·rice alumni - McGill

Berkes Anaïs

Collaborateur·rice de recherche - Cambridge University

Superviseur⋅e principal⋅e :

Rim Assouel

Doctorat - UdeM

Shahana Chatterjee

Collaborateur·rice de recherche - N/A

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Collaborateur·rice de recherche - KAIST

Doctorat - UdeM

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Loubna Benabbou

Desmond Elliott

Visiteur de recherche indépendant

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Doctorat - UdeM

Doctorat

Doctorat - UdeM

Moksh Jain

Doctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni - UdeM

Hyeonah Kim

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Alex Hernández-García

Minsu Kim

Stagiaire de recherche - UdeM

Postdoctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice alumni

Song LIU

Collaborateur·rice de recherche - s.o.

Nikolay Malkin

Collaborateur·rice de recherche - UdeM

Cristian Dragos Manta

Doctorat - UdeM

Co-superviseur⋅e :

Dhanya Sridhar

Sarthak Mittal

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Visiteur de recherche indépendant - UdeM

Padideh Nouri

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Ali Parviz

Collaborateur·rice de recherche - Ying Wu Coll of Computing

Lena Podina

Collaborateur·rice de recherche - University of Waterloo

Superviseur⋅e principal⋅e :

David Rolnick

Camille Rochefort-Boulanger

Nassim Rahaman

Collaborateur·rice alumni - Max-Planck-Institute for Intelligent Systems

Doctorat - UdeM

Postdoctorat - UdeM

Postdoctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Julie Hussin

Dragos Secrieru

Collaborateur·rice alumni - UdeM

Divya Sharma

Postdoctorat

Co-superviseur⋅e :

Alex Hernández-García

Mélisande Astrid Crystal Teng

Vincent Taboga

Collaborateur·rice alumni - Polytechnique

Co-superviseur⋅e :

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Hugo Larochelle

Ivan Titov

Collaborateur·rice de recherche

Superviseur⋅e principal⋅e :

Siva Reddy

Skipper : combiner l’abstraction spatiale et temporelle afin d’améliorer la généralisation

Alex Tong

Collaborateur·rice alumni - UdeM

Collaborateur·rice alumni - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Collaborateur·rice de recherche

Collaborateur·rice de recherche - UdeM

Doctorat - UdeM

Doctorat - McGill

Superviseur⋅e principal⋅e :

Harry Zhao

Collaborateur·rice alumni - McGill

Superviseur⋅e principal⋅e :

Billets de blogue

Generic thumbnail for Mila Blog articles.

22 février 2024

par

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Mise à l’échelle au service du raisonnement et de l’apprentissage automatique basé sur un modèle

Scaling in the service of reasoning & model-based ML

4 avril 2023

par

Yoshua Bengio

Edward J. Hu

Une collaboration entre Mila et Relation Therapeutics pour découvrir in vitro de nouvelles associations médicamenteuses synergiques

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

23 mars 2022

par

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Les réseaux de flot génératifs

15 mars 2022

par

Yoshua Bengio

Publications

Recurrent Independent Mechanisms

Sergey Levine

Bernhard Schölkopf

Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which … (voir plus)only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the bottleneck of attention, and are only updated at time steps where they are most relevant. We show that this leads to specialization amongst the RIMs, which in turn allows for dramatically improved generalization on tasks where some factors of variation differ systematically between training and evaluation.

2021-01-11

ICLR.cc/2021/Conference (spotlight)

RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs

Meng Qu

Junkun Chen

Louis-Pascal Xhonneux

Jian Tang

This paper studies learning logic rules for reasoning on knowledge graphs. Logic rules provide interpretable explanations when used for pred… (voir plus)iction as well as being able to generalize to other tasks, and hence are critical to learn. Existing methods either suffer from the problem of searching in a large search space (e.g., neural logic programming) or ineffective optimization due to sparse rewards (e.g., techniques based on reinforcement learning). To address these limitations, this paper proposes a probabilistic model called RNNLogic. RNNLogic treats logic rules as a latent variable, and simultaneously trains a rule generator as well as a reasoning predictor with logic rules. We develop an EM-based algorithm for optimization. In each iteration, the reasoning predictor is first updated to explore some generated logic rules for reasoning. Then in the E-step, we select a set of high-quality rules from all generated rules with both the rule generator and reasoning predictor via posterior inference; and in the M-step, the rule generator is updated with the rules selected in the E-step. Experiments on four datasets prove the effectiveness of RNNLogic.

2021-01-11

ICLR.cc/2021/Conference (poster)

Spatially Structured Recurrent Modules

Nasim Rahaman

Anirudh Goyal

Muhammad Waleed Gondal

Manuel Wuthrich

Stefan Bauer

Yash Sharma

Bernhard Schölkopf

Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalise we… (voir plus)ll and are robust to changes in the input distribution. While methods that harness spatial and temporal structures find broad application, recent work has demonstrated the potential of models that leverage sparse and modular structure using an ensemble of sparingly interacting modules. In this work, we take a step towards dynamic models that are capable of simultaneously exploiting both modular and spatiotemporal structures. To this end, we model the dynamical system as a collection of autonomous but sparsely interacting sub-systems that interact according to a learned topology which is informed by the spatial structure of the underlying system. This gives rise to a class of models that are well suited for capturing the dynamics of systems that only offer local views into their state, along with corresponding spatial locations of those views. On the tasks of video prediction from cropped frames and multi-agent world modelling from partial observations in the challenging Starcraft2 domain, we find our models to be more robust to the number of available views and better capable of generalisation to novel tasks without additional training than strong baselines that perform equally well or better on the training distribution.

2021-01-11

ICLR.cc/2021/Conference (poster)

Attention Based Pruning for Shift Networks

Ghouthi Boukli hacene

Carlos Lassance

Vincent Gripon

Matthieu Courbariaux

In many application domains such as computer vision, Convolutional Layers (CLs) are key to the accuracy of deep learning methods. However, i… (voir plus)t is often required to assemble a large number of CLs, each containing thousands of parameters, in order to reach state-of-the-art accuracy, thus resulting in complex and demanding systems that are poorly fitted to resource-limited devices. Recently, methods have been proposed to replace the generic convolution operator by the combination of a shift operation and a simpler 1x1 convolution. The resulting block, called Shift Layer (SL), is an efficient alternative to CLs in the sense it allows to reach similar accuracies on various tasks with faster computations and fewer parameters. In this contribution, we introduce Shift Attention Layers (SALs), which extend SLs by using an attention mechanism that learns which shifts are the best at the same time the network function is trained. We demonstrate SALs are able to outperform vanilla SLs (and CLs) on various object recognition benchmarks while significantly reducing the number of float operations and parameters for the inference.

2021-01-09

2020 25th International Conference on Pattern Recognition (ICPR) (publié)

Reza Babanezhad Harikandeh

arxiv.org

An Analysis of the Adaptation Speed of Causal Models

Rémi Le Priol

Simon Lacoste-Julien

Consider a collection of datasets generated by unknown interventions on an unknown structural causal model …

2020-12-31

AISTATS (publié)

proceedings.mlr.press

C AUSAL R: Causal Reasoning over Natural Language Rulebases

Jason Weston

Antoine Bordes

Sumit Chopra

Thomas Wolf

Lysandre Debut

Julien Victor Sanh

Clement Chaumond

Anthony Delangue

Pier-339 Moi

Tim ric Cistac

R´emi Rault

Morgan Louf

Funtow-900 Joe

Sam Davison

Patrick Shleifer

Von Platen

Clara Ma

Yacine Jernite

Julien Plu

Canwen Xu … (voir 6 de plus)

Zhilin Yang

Peng Qi

Saizheng Zhang

William W Cohen

Russ Salakhutdinov

Transformers have been shown to be able to 001 perform deductive reasoning on a logical rule-002 base containing rules and statements writte… (voir plus)n 003 in natural language. Recent works show that 004 such models can also produce the reasoning 005 steps (i.e., the proof graph ) that emulate the 006 model’s logical reasoning process. But these 007 models behave as a black-box unit that emu-008 lates the reasoning process without any causal 009 constraints in the reasoning steps, thus ques-010 tioning the faithfulness. In this work, we frame 011 the deductive logical reasoning task as a causal 012 process by defining three modular components: 013 rule selection, fact selection, and knowledge 014 composition. The rule and fact selection steps 015 select the candidate rule and facts to be used 016 and then the knowledge composition combines 017 them to generate new inferences. This ensures 018 model faithfulness by assured causal relation 019 from the proof step to the inference reasoning. 020 To test our causal reasoning framework, we 021 propose C AUSAL R where the above three com-022 ponents are independently modeled by trans-023 formers. We observe that C AUSAL R is robust 024 to novel language perturbations, and is com-025 petitive with previous works on existing rea-026 soning datasets. Furthermore, the errors made 027 by C AUSAL R are more interpretable due to 028 the multi-modular approach compared to black-029 box generative models. 1 030

2020-12-31

(publié)

www.semanticscholar.org

BabyAI 1.1

David Y. T. Hui

Maxime Chevalier-Boisvert

Dzmitry Bahdanau

The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 p… (voir plus)resents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77 % to 90.4 %. We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.

2020-12-31

(publié)

arxiv.org

CAMAP: Artificial neural networks unveil the role of 1 codon arrangement in modulating MHC-I peptides 2 presentation

Tariq Daouda

Maude Dumont-Lagacé

Albert Feghaly

Yahya Benslimane

6. Rébecca

Panes

Mathieu Courcelles

Mohamed Benhammadi

Lea Harrington

Pierre Thibault

François Major

Étienne Gagnon

S. Lemieux

Claude Perreault

30 MHC-I associated peptides (MAPs) play a central role in the elimination of virus-infected and 31 neoplastic cells by CD8 T cells. However… (voir plus), accurately predicting the MAP repertoire remains 32 difficult, because only a fraction of the transcriptome generates MAPs. In this study, we 33 investigated whether codon arrangement (usage and placement) regulates MAP biogenesis. We 34 developed an artificial neural network called Codon Arrangement MAP Predictor (CAMAP), 35 predicting MAP presentation solely from mRNA sequences flanking the MAP-coding codons 36 (MCCs), while excluding the MCC per se . CAMAP predictions were significantly more accurate 37 when using original codon sequences than shuffled codon sequences which reflect amino acid 38 usage. Furthermore, predictions were independent of mRNA expression and MAP binding affinity 39 to MHC-I molecules and applied to several cell types and species. Combining MAP ligand scores, 40 transcript expression level and CAMAP scores was particularly useful to increaser MAP prediction 41 accuracy. Using an in vitro assay, we showed that varying the synonymous codons in the regions 42 flanking the MCCs (without changing the amino acid sequence) resulted in significant modulation 43 of MAP presentation at the cell surface. Taken together, our results demonstrate the role of codon 44 arrangement in the regulation of MAP presentation and support integration of both translational 45 and post-translational events in predictive algorithms to ameliorate modeling of the 46 immunopeptidome. 47 48 49 they modulated the levels of SIINFEKL presentation in both constructs, but enhanced translation efficiency could only be detected for OVA-RP. These data show that codon arrangement can modulate MAP presentation strength without any changes in the amino

2020-12-31

(publié)

www.semanticscholar.org

CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation

Tariq Daouda

Maude Dumont-Lagacé

Albert Feghaly

Yahya Benslimane

Rebecca Panes

Mathieu Courcelles

Mohamed Benhammadi

Lea Harrington

Pierre Thibault

François Major

Étienne Gagnon

Sébastien Lemieux

Claude Perreault

MHC-I associated peptides (MAPs) are small fragments of intracellular proteins presented at the surface of cells and used by the immune syst… (voir plus)em to detect and eliminate cancerous or virus-infected cells. While it is theoretically possible to predict which portions of the intracellular proteins will be naturally processed by the cells to ultimately reach the surface, current methodologies have prohibitively high false discovery rates. Here we introduce an artificial neural network called Codon Arrangement MAP Predictor (CAMAP) which integrates information from mRNA-to-protein translation to other factors regulating MAP biogenesis (e.g. MAP ligand score and transcript expression levels) to improve MAP prediction accuracy. While most MAP predictive approaches focus on MAP sequences per se, CAMAP’s novelty is to analyze the MAP-flanking mRNA sequences, thereby providing completely independent information for MAP prediction. We show on several datasets that the integration of CAMAP scores with other known factors involved in MAP presentation (i.e. MAP ligand score and mRNA expression) significantly improves MAP prediction accuracy, and further validate CAMAP learned features using anin-vitroassay. These findings may have major implications for the design of vaccines against cancers and viruses, and in times of pandemics could accelerate the identification of relevant MAPs of viral origins.

2020-12-31

PLoS Computational Biology (publié)

CAMAP: Artificial neural networks unveil the role of 1 codon arrangement in modulating MHC-I peptides 2 presentation discovery of minor histocompatibility with

Tariq Daouda

Maude Dumont-Lagacé

Albert Feghaly

Yahya Benslimane

6. Rébecca

Panes

Mathieu Courcelles

Mohamed Benhammadi

Lea Harrington

Pierre Thibault

François Major

Étienne Gagnon

S. Lemieux

Claude Perreault

2020-12-31

(publié)

www.semanticscholar.org

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

Mingde Zhao

We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state during plan… (voir plus)ning. The agent uses a bottleneck mechanism over a set-based representation to force the number of entities to which the agent attends at each planning step to be small. In experiments, we investigate the bottleneck mechanism with several sets of customized environments featuring different challenges. We consistently observe that the design allows the planning agents to generalize their learned task-solving abilities in compatible unseen environments by attending to the relevant objects, leading to better out-of-distribution generalization performance.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (publié)

Cooperative Semi-Supervised Transfer Learning of Machine Reading Comprehension

Oliver Bender

Franz Josef Och