Aaron Courville

alexandre.ganito@mila.quebec

adrien.taiga@mila.quebec

Alexandre Diz Ganito

Master's Research - Université de Montréal

Amr Khalifa

PhD - Université de Montréal

amr-khalifa.alshaykh@mila.quebec

andrei.nicolicioiu@mila.quebec

Andrei Nicolicioiu

PhD - Université de Montréal

ching-lam.choi@mila.quebec

Ankit Vani

PhD - Université de Montréal

Undergraduate - Université de Montréal

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Co-supervisor :

Yoshua Bengio

dinghuai.zhang@mila.quebec

Esra'a Saleh

PhD - Université de Montréal

Principal supervisor :

Glen Berseth

esraa.saleh@mila.quebec

evgenii.nikishin@mila.quebec

Evgenii Nikishin

PhD - Université de Montréal

Principal supervisor :

Pierre-Luc Bacon

Faruk Ahmed

PhD - Université de Montréal

faruk.ahmed@mila.quebec

Johan Samir Obando Ceron

PhD - Université de Montréal

Co-supervisor :

Pablo Samuel Castro

johan.ceron@mila.quebec

Juan Duque

PhD - Université de Montréal

juan.duque@mila.quebec

manoosh.samiei@mila.quebec

Manoosh Samiei

PhD - Université de Montréal

PhD - Université de Montréal

Co-supervisor :

mohammed.muqeeth@mila.quebec

schwarzm@mila.quebec

Hattie Zhou

PhD - Université de Montréal

Principal supervisor :

Hugo Larochelle

mengfei.zhou@mila.quebec

Michael Noukhovitch

PhD - Université de Montréal

Collaborating researcher

morgane.moss@mila.quebec

Muqeeth Mohammed

Collaborating researcher - Université de Montréal

Research Intern - Ghent University

pietro.mazzaglia@mila.quebec

Research Intern - Université de Montréal

razvan.ciuca@mila.quebec

Rishabh Agarwal

PhD - Université de Montréal

Co-supervisor :

samuel.lavoie@mila.quebec

Samuel Lavoie Marchildon

PhD - Université de Montréal

Sarvjeet Ghotra

Master's Research - Université de Montréal

Principal supervisor :

Aishwarya Agrawal

sarvjeet-singh.ghotra@mila.quebec

xiaofeng.zhang@mila.quebec

Arian Hosseini

PhD - Université de Montréal

Shawn Tan

PhD - Université de Montréal

PhD - Université de Montréal

PhD - Université de Montréal

Yusong Wu

PhD - Université de Montréal

Principal supervisor :

Anna (Cheng-Zhi) Huang

wu.yusong@mila.quebec

Zhixuan Lin

PhD - Université de Montréal

zhixuan.lin@mila.quebec

Publications

Invariant representation driven neural classifier for anti-QCD jet tagging

Taoli Cheng

2022-10-24

Journal of High Energy Physics (published)

VIM: Variational Independent Modules for Video Prediction

Rim Assouel

Lluis Castrejon

Nicolas Ballas

Yoshua Bengio

We introduce a variational inference model called VIM, for Variational Independent Modules, for sequential data that learns and infers latent representations as a set of objects and discovers modular causal mechanisms over these objects. These mechanisms - which we call modules - are independently parametrized, define the stochastic transitions of entities and are shared across entities. At each time step, our model infers from a low-level input sequence a high-level sequence of categorical latent variables to select which transition modules to apply to which high-level object. We evaluate this model in video prediction tasks where the goal is to predict multi-modal future events given previous observations. We demonstrate empirically that VIM can model 2D visual sequences in an interpretable way and is able to identify the underlying dynamically instantiated mechanisms of the generation process. We additionally show that the learnt modules can be composed at test time to generalize to out-of-distribution observations.

2022-06-28

Proceedings of the First Conference on Causal Learning and Reasoning (published)

proceedings.mlr.press

openreview.net

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodriguez

Soumye Singhal

David Vazquez

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach exhibits systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class incremental settings such as IIRC and is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe

2022-06-18

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

Unsupervised Dependency Graph Network

Yikang Shen

Shawn Tan

Alessandro Sordoni

Peng Li

Jie Zhou

2022-05-01

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (published)

Fortuitous Forgetting in Connectionist Networks

Hattie Zhou

Ankit Vani

Hugo Larochelle

Forgetting is often seen as an unwanted characteristic in both human and machine learning. However, we propose that forgetting can in fact be favorable to learning. We introduce "forget-and-relearn" as a powerful paradigm for shaping the learning trajectories of artificial neural networks. In this process, the forgetting step selectively removes undesirable information from the model, and the relearning step reinforces features that are consistently useful under different conditions. The forget-and-relearn framework unifies many existing iterative training algorithms in the image classification and language emergence literature, and allows us to understand the success of these algorithms in terms of the disproportionate forgetting of undesirable information. We leverage this understanding to improve upon existing algorithms by designing more targeted forgetting operations. Insights from our analysis provide a coherent view on the dynamics of iterative training in neural networks and offer a clear path towards performance improvements.

2022-01-28

ICLR.cc/2022/Conference (poster)

openreview.net

Invariant representation driven neural classifier for anti-QCD jet tagging

Taoli Cheng

2022-01-18

ArXiv (preprint)

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

Rishabh Agarwal

Max Schwarzer

Pablo Samuel Castro

openreview.net

Unsupervised Dependency Graph Network

Yikang Shen

Shawn Tan

Alessandro Sordoni

Peng Li

Jie Zhou

Recent work has identified properties of pretrained self-attention models that mirror those of dependency parse structures. In particular, some self-attention heads correspond well to individual dependency types. Inspired by these developments, we propose a new competitive mechanism that encourages these attention heads to model different dependency relations. We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task. Experiment results show that UDGN achieves very strong unsupervised dependency parsing performance without gold POS tags and any other external information. The competitive gated heads show a strong correlation with human-annotated dependency types. Furthermore, the UDGN can also achieve competitive performance on masked language modeling and sentence textual similarity tasks.

2022-01-01

Annual Meeting of the Association for Computational Linguistics (published)

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodriguez

Soumye Singhal

David Vazquez

2021-11-23

ArXiv (preprint)

Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle

Yikang Shen

Shawn Tan

Alessandro Sordoni

Siva Reddy

Syntax is fundamental to our thinking about language. Failing to capture the structure of input language could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model (left-to-right). To train the incremental parser and avoid exposure bias, we also propose a novel dynamic oracle, so that SOM is more robust to wrong parsing decisions. Experiments show that SOM can achieve strong results in language modeling, incremental parsing, and syntactic generalization tests while using fewer parameters than other models.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (published)

Understanding by Understanding Not: Modeling Negation in Language Models

Arian Hosseini

Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language models often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top 1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (published)

DATA-EFFICIENT REINFORCEMENT LEARNING

Nitarshan Rajkumar

Michael Noukhovitch

Ankesh Anand

Laurent Charlin

(Rex) Devon Hjelm

Philip Bachman