Yoshua Bengio

Biography

*For media inquiries, please write to medias@mila.quebec.

For more information, contact Cassidy MacNeil, Senior Assistant and Operations Lead, at cassidy.macneil@mila.quebec.

Recognized worldwide as a leading expert in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, the "Nobel Prize of computing," alongside Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, founder and scientific advisor of Mila – Quebec Artificial Intelligence Institute, and co-directs, as a Senior Fellow, the Learning in Machines & Brains program of the Canadian Institute for Advanced Research (CIFAR). He also serves as special advisor and founding scientific director of IVADO.

In 2018, he was the computer scientist who gained the largest number of new citations worldwide. In 2019, he was awarded the prestigious Killam Prize. Since 2022, he has had the highest h-index in computer science worldwide. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, and an Officer of the Order of Canada.

Concerned about the social impact of AI and committed to the goal that AI benefit everyone, he contributed actively to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

Current Students

Jamal Abou Haibeh

Alumni collaborator - McGill

Berkes Anaïs

Research collaborator - Cambridge University

Principal supervisor:

Rim Assouel

PhD - UdeM

Stefan Bauer

Independent visiting researcher

Co-supervisor:

Guillaume Lajoie

Shahana Chatterjee

Research collaborator - N/A

Principal supervisor:

PhD - UdeM

Research collaborator - KAIST

PhD - UdeM

Alumni collaborator - UdeM

Co-supervisor:

Loubna Benabbou

Desmond Elliott

Independent visiting researcher

Principal supervisor:

PhD - UdeM

Co-supervisor:

PhD - UdeM

PhD - UdeM

PhD

PhD - UdeM

Moksh Jain

PhD - UdeM

PhD - UdeM

Principal supervisor:

Alumni collaborator - UdeM

Hyeonah Kim

Postdoctorate - UdeM

Principal supervisor:

Alex Hernandez-Garcia

Tabitha Edith Lee

Postdoctorate - UdeM

Principal supervisor:

Alumni collaborator

Alumni collaborator - UdeM

Cristian Dragos Manta

PhD - UdeM

Co-supervisor:

Dhanya Sridhar

Sarthak Mittal

PhD - UdeM

Principal supervisor:

Independent visiting researcher - UdeM

Padideh Nouri

PhD - UdeM

Principal supervisor:

Ali Parviz

Research collaborator - Ying Wu Coll of Computing

Lena Podina

Research collaborator - University of Waterloo

Principal supervisor:

David Rolnick

Camille Rochefort-Boulanger

Nassim Rahaman

Alumni collaborator - Max-Planck-Institute for Intelligent Systems

Amine RAZIG

Research collaborator - UdeM

Co-supervisor:

PhD - UdeM

Postdoctorate - UdeM

Postdoctorate - UdeM

PhD - UdeM

Principal supervisor:

Julie Hussin

Dragos Secrieru

Alumni collaborator - UdeM

Divya Sharma

Postdoctorate

Co-supervisor:

Alex Hernandez-Garcia

Mélisande Astrid Crystal Teng

Vincent Taboga

Alumni collaborator - Polytechnique

Co-supervisor:

PhD - UdeM

Co-supervisor:

Hugo Larochelle

Ivan Titov

Research collaborator

Principal supervisor:

Siva Reddy

Alex Tong

Alumni collaborator - UdeM

Alumni collaborator - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Research collaborator

Research collaborator - UdeM

PhD - UdeM

PhD - McGill

Principal supervisor:

PhD - UdeM

Principal supervisor:

Aaron Courville

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

Harry Zhao

Alumni collaborator - McGill

Principal supervisor:

Blog Posts

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the service of reasoning & model-based ML

April 4, 2023

by

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

by

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

Generative Flow Networks

March 15, 2022

by

Yoshua Bengio

Publications

»Deep Learning ist keine Religion«

Andreas Sudmann

2018-12-30

Machine-mediated learning (published)

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks

Ghouthi Boukli Hacene

Vincent Gripon

Matthieu Arzel

Nicolas Farrugia

Convolutional Neural Networks (CNNs) are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to a high computational complexity and strongly limits their usability in budget-constrained devices such as embedded devices. In this paper, we propose a combination of a new pruning technique and a quantization scheme that effectively reduce the complexity and memory usage of convolutional layers of CNNs, and replace the complex convolutional operation by a low-cost multiplexer. We perform experiments on the CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy, while drastically reducing the computational and memory footprints. We also propose an efficient hardware architecture to accelerate CNN operations. The proposed hardware architecture is a pipeline and accommodates multiple layers working at the same time to speed up the inference process.

2018-12-24

ArXiv (preprint)

Speaker Recognition from Raw Waveform with SincNet

Mirco Ravanelli

Deep learning is progressively gaining popularity as a viable alternative to i-vectors for speaker recognition. Promising results have been recently obtained with Convolutional Neural Networks (CNNs) when fed by raw speech samples directly. Rather than employing standard hand-crafted features, the latter CNNs learn low-level speech representations from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants. Proper design of the neural network is crucial to achieve this goal. This paper proposes a novel CNN architecture, called SincNet, that encourages the first convolutional layer to discover more meaningful filters. SincNet is based on parametrized sinc functions, which implement band-pass filters. In contrast to standard CNNs, which learn all elements of each filter, only low and high cutoff frequencies are directly learned from data with the proposed method. This offers a very compact and efficient way to derive a customized filter bank specifically tuned for the desired application. Our experiments, conducted on both speaker identification and speaker verification tasks, show that the proposed architecture converges faster and performs better than a standard CNN on raw waveforms.

2018-12-17

2018 IEEE Spoken Language Technology Workshop (SLT) (published)
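
The SincNet abstract above says that each first-layer filter is a band-pass filter defined only by a low and a high cutoff frequency. A minimal sketch of that idea in plain Python follows. It is not the authors' implementation: the function names (`sinc_bandpass`, `gain`), the default filter length of 101 taps, and the choice of a Hamming window are assumptions made for illustration, and the cutoffs are fixed numbers here rather than parameters learned by gradient descent.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def sinc_bandpass(f1, f2, length=101):
    """Windowed band-pass impulse response with cutoffs f1 < f2.

    f1 and f2 are normalized frequencies in cycles per sample (0 to 0.5).
    The filter is the difference of two ideal low-pass filters, tapered
    by a Hamming window (an assumed choice for this sketch).
    """
    if not 0.0 <= f1 < f2 <= 0.5:
        raise ValueError("expected 0 <= f1 < f2 <= 0.5")
    if length < 3 or length % 2 == 0:
        raise ValueError("length must be an odd integer >= 3")
    half = length // 2
    taps = []
    for i in range(length):
        n = i - half  # sample index centered on zero
        ideal = (2.0 * f2 * sinc(2.0 * math.pi * f2 * n)
                 - 2.0 * f1 * sinc(2.0 * math.pi * f1 * n))
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, freq):
    """Magnitude of the filter's frequency response at freq (cycles/sample)."""
    re = sum(t * math.cos(2.0 * math.pi * freq * k) for k, t in enumerate(taps))
    im = sum(t * math.sin(2.0 * math.pi * freq * k) for k, t in enumerate(taps))
    return math.hypot(re, im)


# Two numbers fully describe the filter, whatever its length.
taps = sinc_bandpass(0.1, 0.2)
print(gain(taps, 0.15), gain(taps, 0.30))
```

The gain printed for the in-band frequency should be close to 1 and the out-of-band gain close to 0, which is what makes the two cutoffs directly interpretable.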

Object Detection using Deep Learning

Chamarty Anusha

P. Avadhani

P. S.

Mohannad Elhamod

Martin D. Levine

Ajeet Ram Pathak

Manjusha Pandey

Siddharth S. Rautaray

Christian Szegedy

Alexander T Toshev

Dumitru Erhan

Xiao Ning

Wen Zhu

Shifeng Chen

Zhong-Qiu Zhao

Peng Zheng

Shou-tao Xu

Xindong Wu

Sakshi Indolia

Anil Kumar Goswani

S. P. Mishra

Pooja Asopa

Yann LeCun

Joseph Redmon

Santosh Kumar Divvala

Ross Girshick

Ali Farhadi

M. Kruithof

Henri Bouma

Noelle M. Fischer

Klamer Schutte

Autonomous vehicles, surveillance systems, and face detection systems have driven the development of accurate object detection systems [1]. These systems recognize, classify and localize every object in an image by drawing bounding boxes around the object [2]. These systems use existing classification models as the backbone for object detection. Object detection is the process of finding instances of real-world objects, such as human faces, animals and vehicles, in pictures, images or videos. An object detection algorithm uses extracted features and learning techniques to recognize the objects in an image. In this paper, various object detection techniques have been studied and some of them are implemented. As a part of this paper, three algorithms for object detection in an image were implemented and their results were compared. The algorithms are “Object Detection using Deep Learning Framework by OpenCV”, “Object Detection using Tensorflow” and “Object Detection using Keras models”.

2018-12-16

International Journal of Computer Applications (published)

Speech and Speaker Recognition from Raw Waveform with SincNet

Mirco Ravanelli

Deep neural networks can learn complex and abstract representations, that are progressively obtained by combining simpler ones. A recent trend in speech and speaker recognition consists in discovering these representations starting from raw audio samples directly. Differently from standard hand-crafted features such as MFCCs or FBANK, the raw waveform can potentially help neural networks discover better and more customized representations. The high-dimensional raw inputs, however, can make training significantly more challenging. This paper summarizes our recent efforts to develop a neural architecture that efficiently processes speech from audio waveforms. In particular, we propose SincNet, a novel Convolutional Neural Network (CNN) that encourages the first layer to discover meaningful filters by exploiting parametrized sinc functions. In contrast to standard CNNs, which learn all the elements of each filter, only low and high cutoff frequencies of band-pass filters are directly learned from data. This inductive bias offers a very compact way to derive a customized front-end, that only depends on some parameters with a clear physical meaning. Our experiments, conducted on both speaker and speech recognition, show that the proposed architecture converges faster, performs better, and is more computationally efficient than standard CNNs.

2018-12-12

ArXiv (preprint)

The effects of negative adaptation in Model-Agnostic Meta-Learning

Tristan Deleu

The capacity of meta-learning algorithms to quickly adapt to a variety of tasks, including ones they did not experience during meta-training, has been a key factor in the recent success of these methods on few-shot learning problems. This particular advantage of using meta-learning over standard supervised or reinforcement learning is only well founded under the assumption that the adaptation phase does improve the performance of our model on the task of interest. However, in the classical framework of meta-learning, this constraint is only mildly enforced, if not at all, and we only see an improvement on average over a distribution of tasks. In this paper, we show that the adaptation in an algorithm like MAML can significantly decrease the performance of an agent in a meta-reinforcement learning setting, even on a range of meta-training tasks.

2018-12-04

ArXiv (preprint)
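
The abstract above notes that adaptation is only guaranteed to help on average over tasks, so an individual task can get worse. The toy sketch below illustrates that point with one gradient step on two hand-picked quadratic losses. It is not the paper's meta-reinforcement learning setup: the loss family, the task values, the step size, and all function names are assumptions chosen so the effect is easy to see.

```python
def loss(theta, curvature, target):
    """Quadratic task loss: 0.5 * curvature * (theta - target)^2."""
    return 0.5 * curvature * (theta - target) ** 2


def adapt(theta, curvature, target, step_size):
    """One gradient-descent step on the task loss, as in a one-step inner loop."""
    grad = curvature * (theta - target)
    return theta - step_size * grad


def adaptation_report(theta, tasks, step_size):
    """Return (loss_before, loss_after) for each (curvature, target) task."""
    report = []
    for curvature, target in tasks:
        before = loss(theta, curvature, target)
        adapted = adapt(theta, curvature, target, step_size)
        report.append((before, loss(adapted, curvature, target)))
    return report


# Hypothetical tasks: a flat one far from theta and a sharp one close to it.
tasks = [(1.0, 4.0), (5.0, 0.5)]
report = adaptation_report(theta=0.0, tasks=tasks, step_size=0.5)

mean_before = sum(before for before, _ in report) / len(report)
mean_after = sum(after for _, after in report) / len(report)

for (before, after), task in zip(report, tasks):
    status = "improved" if after < before else "degraded"
    print(task, before, after, status)
print("mean loss:", mean_before, "->", mean_after)
```

With these values the shared step size overshoots on the sharp task, so its loss rises after adaptation even though the mean loss across both tasks falls.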

Keep Drawing It: Iterative language-based image generation and editing

Alaaeldin El-Nouby

Shikhar Sharma

Hannes Schulz

R Devon Hjelm

Layla El Asri

S Ebrahimi Kahou

Graham W. Taylor

Conditional text-to-image generation approaches commonly focus on generating a single image in a single step. One practical extension beyond one-step generation is an interactive system that generates an image iteratively, conditioned on ongoing linguistic input / feedback. This is significantly more challenging as such a system must understand and keep track of the ongoing context and history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step as well as all past instructions for generation. We show that our model is able to generate the background, add new objects, apply simple transformations to existing objects, and correct previous mistakes. We believe our approach is an important step toward interactive generation.

2018-11-23

arXiv.org (preprint)

dblp.uni-trier.de

Interpretable Convolutional Filters with SincNet

Mirco Ravanelli

Deep learning is currently playing a crucial role toward higher levels of artificial intelligence. This paradigm allows neural networks to learn complex and abstract representations, that are progressively obtained by combining simpler ones. Nevertheless, the internal "black-box" representations automatically discovered by current neural architectures often suffer from a lack of interpretability, making of primary interest the study of explainable machine learning techniques. This paper summarizes our recent efforts to develop a more interpretable neural model for directly processing speech from the raw waveform. In particular, we propose SincNet, a novel Convolutional Neural Network (CNN) that encourages the first layer to discover more meaningful filters by exploiting parametrized sinc functions. In contrast to standard CNNs, which learn all the elements of each filter, only low and high cutoff frequencies of band-pass filters are directly learned from data. This inductive bias offers a very compact way to derive a customized filter-bank front-end, that only depends on some parameters with a clear physical meaning. Our experiments, conducted on both speaker and speech recognition, show that the proposed architecture converges faster, performs better, and is more interpretable than standard CNNs.

2018-11-22

ArXiv (preprint)

On Training Recurrent Neural Networks for Lifelong Learning

Shagun Sodhani

A. Chandar

Catastrophic forgetting and capacity saturation are the central challenges of any parametric lifelong learning system. In this work, we study these challenges in the context of sequential supervised learning with emphasis on recurrent neural networks. To evaluate the models in the lifelong learning setting, we propose a curriculum-based, simple, and intuitive benchmark where the models are trained on tasks with increasing levels of difficulty. To measure the impact of catastrophic forgetting, the model is tested on all the previous tasks as it completes any task. As a step towards developing true lifelong learning systems, we unify Gradient Episodic Memory (a catastrophic forgetting alleviation approach) and Net2Net (a capacity expansion approach). Both these models are proposed in the context of feedforward networks and we evaluate the feasibility of using them for recurrent networks. Evaluation on the proposed benchmark shows that the unified model is more suitable than the constituent models for the lifelong learning setting.

2018-11-15

ArXiv (preprint)
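
The evaluation protocol described in the abstract above (train on tasks in order of difficulty, and after each task test on it and on every earlier task) can be sketched as a small loop. This is only an outline of that protocol, not the paper's benchmark code: `evaluate_curriculum`, `forgetting`, and the `ForgetfulLearner` stand-in are hypothetical names, and the stand-in simply memorizes the most recent tasks so that forgetting is visible without any neural network.

```python
def evaluate_curriculum(tasks, train, evaluate):
    """Train on tasks in order; after each one, test on it and all earlier tasks.

    Returns a lower-triangular list where scores[i][j] is the score on task j
    measured right after training on task i (j <= i).
    """
    scores = []
    for i, task in enumerate(tasks):
        train(task)
        scores.append([evaluate(earlier) for earlier in tasks[: i + 1]])
    return scores


def forgetting(scores):
    """Per-task drop from the score right after learning to the final score."""
    final = scores[-1]
    return [scores[j][j] - final[j] for j in range(len(final))]


class ForgetfulLearner:
    """Hypothetical stand-in model that remembers only its last few tasks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []

    def train(self, task):
        self.memory.append(task)
        self.memory = self.memory[-self.capacity:]

    def evaluate(self, task):
        return 1.0 if task in self.memory else 0.0


learner = ForgetfulLearner(capacity=2)
scores = evaluate_curriculum(["easy", "medium", "hard"],
                             learner.train, learner.evaluate)
print(scores)
print(forgetting(scores))
```

Reading down a column of `scores` shows how performance on one task changes as later tasks are learned; a real experiment would replace the stand-in with the model under study and its own training and scoring functions.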

BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Maxime Chevalier-Boisvert

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

2018-10-17

arXiv.org (preprint)

dblp.uni-trier.de

Deep Learning. Das umfassende Handbuch

Ian Goodfellow

Aaron Courville

2018-10-09

(published)

www.semanticscholar.org

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

Zhilin Yang

Peng Qi

Saizheng Zhang

William W. Cohen

Russ Salakhutdinov

Christopher D Manning

Existing question answering (QA) datasets fail to train QA systems to perform complex reasoning and provide explanations for answers. We introduce HotpotQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and explain the predictions; (4) we offer a new type of factoid comparison questions to test QA systems’ ability to extract relevant facts and perform necessary comparison. We show that HotpotQA is challenging for the latest QA systems, and the supporting facts enable models to improve performance and make explainable predictions.

2018-09-30

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (published)