Aaron Courville

Anirudh Buvanesh

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Laurent Charlin

Razvan Ciuca

Maîtrise recherche - Université de Montréal

Juan Duque

Doctorat - UdeM

Doctorat - UdeM

Doctorat - UdeM

Uday Kapur

Doctorat - UdeM

Amr Khalifa

Doctorat - UdeM

Samuel Lavoie

Doctorat - UdeM

Zhixuan Lin

Doctorat - UdeM

Ahmed Masry

Collaborateur·rice de recherche - N/A

Alan Milligan

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Doctorat - UdeM

Johan Samir Obando Ceron

Doctorat - UdeM

Co-superviseur⋅e :

Collaborateur·rice de recherche - UdeM

Dereck Piché

Maîtrise recherche - UdeM

Khaled Rouissi

Maîtrise recherche - UdeM

Esra'a Saleh

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Glen Berseth

Vedant Shah

Doctorat - UdeM

Doctorat - UdeM

Yusong Wu

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Anna (Cheng-Zhi) Huang

Sujin yun

Doctorat - UdeM

Dinghuai Zhang

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Hattie Zhou

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Hugo Larochelle

Publications

Unsupervised Dependency Graph Network

Yikang Shen

Shawn Tan

Alessandro Sordoni

Peng Li

Jie Zhou

Recent work has identified properties of pretrained self-attention models that mirror those of dependency parse structures. In particular, s… (voir plus)ome self-attention heads correspond well to individual dependency types. Inspired by these developments, we propose a new competitive mechanism that encourages these attention heads to model different dependency relations. We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task. Experiment results show that UDGN achieves very strong unsupervised dependency parsing performance without gold POS tags and any other external information. The competitive gated heads show a strong correlation with human-annotated dependency types. Furthermore, the UDGN can also achieve competitive performance on masked language modeling and sentence textual similarity tasks.

2022-01-01

Annual Meeting of the Association for Computational Linguistics (published)

Generative Adversarial Networks

Ian G Goodfellow

Mehdi Mirza

Generative Adversarial Networks (GANs) are a type of deep learning techniques that have shown remarkable success in generating realistic ima… (voir plus)ges, videos, and other types of data. This paper provides a comprehensive guide to GANs, covering their architecture, loss functions, training methods, applications, evaluation metrics, challenges, and future directions. We begin with an introduction to GANs and their historical development, followed by a review of the background and related work. We then provide a detailed overview of the GAN architecture, including the generator and discriminator networks, and discuss the key design choices and variations. Next, we review the loss functions utilized in GANs, including the original minimax objective, as well as more recent approaches s.a. Wasserstein distance and gradient penalty. We then delve into the training of GANs, discussing common techniques s.a. alternating optimization, minibatch discrimination, and spectral normalization. We also provide a survey of the various applications of GANs across domains. In addition, we review the evaluation metrics utilized to assess the diversity and quality of GAN-produced data. Furthermore, we discuss the challenges and open issues in GANs, including mode collapse, training instability, and ethical considerations. Finally, we provide a glimpse into the future directions of GAN research, including improving scalability, developing new architectures, incorporating domain knowledge, and exploring new applications. Overall, this paper serves as a comprehensive guide to GANs, providing both theoretical and practical insights for researchers and practitioners in the field.

2021-12-15

ArXiv (prépublication)

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodriguez

Soumye Singhal

David Vázquez

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that da… (voir plus)tasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach exhibits systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class incremental settings such as IIRC and it is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe

2021-11-23

ArXiv (preprint)

Gradient Starvation: A Learning Proclivity in Neural Networks

Sékou-Oumar Kaba

We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks… (voir plus). Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task, despite the presence of other predictive features that fail to be discovered. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks. Using tools from Dynamical Systems theory, we identify simple properties of learning dynamics during gradient descent that lead to this imbalance, and prove that such a situation can be expected given certain statistical structure in training data. Based on our proposed formalism, we develop guarantees for a novel regularization method aimed at decoupling feature learning dynamics, improving accuracy and robustness in cases hindered by gradient starvation. We illustrate our findings with simple and real-world out-of-distribution (OOD) generalization experiments.

openreview.net

Generative Compositional Augmentations for Scene Graph Prediction

Boris Knyazev

Harm de Vries

Cătălina Cangea

Graham W. Taylor

Eugene Belilovsky

Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of v… (voir plus)ision and language. We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution. Current scene graph generation models are trained on a tiny fraction of the distribution corresponding to the most frequent compositions, e.g. . However, test images might contain zero- and few-shot compositions of objects and relationships, e.g. . Despite each of the object categories and the predicate (e.g. ‘on’) being frequent in the training data, the models often fail to properly understand such unseen or rare compositions. To improve generalization, it is natural to attempt increasing the diversity of the training distribution. However, in the graph domain this is non-trivial. To that end, we propose a method to synthesize rare yet plausible scene graphs by perturbing real ones. We then propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs and learn from them in a joint fashion. When evaluated on the Visual Genome dataset, our approach yields marginal, but consistent improvements in zero- and few-shot metrics. We analyze the limitations of our approach indicating promising directions for future research.

2021-10-10

2021 IEEE/CVF International Conference on Computer Vision (ICCV) (publié)

Haptics-based Curiosity for Sparse-reward Tasks

Sai Rajeswar

Cyril Ibrahim

Nitin Surya

Florian Golemo

David Vázquez

Pedro O. Pinheiro

Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary for tasks tha… (voir plus)t involve contact-rich motion. In this work, we leverage surprise from mismatches in haptics feedback to guide exploration in hard sparse-reward reinforcement learning tasks. Our approach, Haptics-based Curiosity (\method{}), learns what visible objects interactions are supposed to ``feel" like. We encourage exploration by rewarding interactions where the expectation and the experience do not match. We test our approach on a range of haptics-intensive robot arm tasks (e.g. pushing objects, opening doors), which we also release as part of this work. Across multiple experiments in a simulated setting, we demonstrate that our method is able to learn these difficult tasks through sparse reward and curiosity alone. We compare our cross-modal approach to single-modality (haptics- or vision-only) approaches as well as other curiosity-based methods and find that our method performs better and is more sample-efficient.

2021-09-13

robot-learning.org/CoRL/2021/Conference (poster)

proceedings.mlr.press

openreview.net

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Marc Gendron-Bellemare

Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing their relative performance on a large suite of tasks. M… (voir plus)ost published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs. Beginning with the Arcade Learning Environment (ALE), the shift towards computationally-demanding benchmarks has led to the practice of evaluating only a small number of runs per task, exacerbating the statistical uncertainty in point estimates. In this paper, we argue that reliable evaluation in the few run deep RL regime cannot ignore the uncertainty in results without running the risk of slowing down progress in the field. We illustrate this point using a case study on the Atari 100k benchmark, where we find substantial discrepancies between conclusions drawn from point estimates alone versus a more thorough statistical analysis. With the aim of increasing the field's confidence in reported results with a handful of runs, we advocate for reporting interval estimates of aggregate performance and propose performance profiles to account for the variability in results, as well as present more robust and efficient aggregate metrics, such as interquartile mean scores, to achieve small uncertainty in results. Using such statistical tools, we scrutinize performance evaluations of existing algorithms on other widely used RL benchmarks including the ALE, Procgen, and the DeepMind Control Suite, again revealing discrepancies in prior comparisons. Our findings call for a change in how we evaluate performance in deep RL, for which we present a more rigorous evaluation methodology, accompanied with an open-source library rliable, to prevent unreliable results from stagnating the field.

2021-08-30

ArXiv (prépublication)

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling

Yikang Shen

Yi Tay

Che Zheng

Dara Bahri

Donald Metzler

2021-08-01

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (publié)

openreview.net

Continuous Coordination As a Realistic Scenario for Lifelong Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Sarath Chandar

Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. L… (voir plus)ifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.

2021-07-01

Proceedings of the 38th International Conference on Machine Learning (publié)

proceedings.mlr.press

Hierarchical Video Generation for Complex Data

Lluis Castrejon

Nicolas Ballas

2021-06-04

ArXiv (prépublication)

Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle

Syntax is fundamental to our thinking about language. Failing to capture the structure of input language could lead to generalization proble… (voir plus)ms and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model (left-to-right). To train the incremental parser and avoid exposure bias, we also propose a novel dynamic oracle, so that SOM is more robust to wrong parsing decisions. Experiments show that SOM can achieve strong results in language modeling, incremental parsing, and syntactic generalization tests while using fewer parameters than other models.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (publié)

Understanding by Understanding Not: Modeling Negation in Language Models

Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language mode… (voir plus)ls often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top 1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (publié)