Soumye Singhal

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodríguez

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by the language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach exhibits systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class incremental settings such as IIRC and is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe

2022-06-17

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org
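
The MILe abstract above mentions propagating binary predictions from teacher to student. As a rough illustration only, the sketch below turns per-class teacher logits into a multi-hot pseudo-label by thresholding sigmoid probabilities. The function name `binary_pseudo_labels` and the threshold value of 0.5 are assumptions made for this example; they are not taken from the paper or its code release.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_pseudo_labels(teacher_logits: List[float], threshold: float = 0.5) -> List[int]:
    """Return a multi-hot vector: 1 for each class whose probability exceeds the threshold.

    Hypothetical helper: the threshold is an assumed hyperparameter.
    """
    return [1 if sigmoid(z) > threshold else 0 for z in teacher_logits]


# Toy usage: one image, three classes, two of which the teacher considers present.
pseudo = binary_pseudo_labels([2.0, -1.0, 0.3])
print(pseudo)
```

Because each class is thresholded independently, more than one class can be marked as present for the same image, which is what distinguishes this from a single-label argmax target.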

Countering Language Drift with Seeded Iterated Learning

Yuchen Lu

Soumye Singhal

Florian Strub

Olivier Pietquin

Aaron Courville

Pretraining on a human corpus and then finetuning in a simulator has become a standard pipeline for training a goal-oriented dialogue agent. Nevertheless, as soon as the agents are finetuned to maximize task completion, they suffer from the so-called language drift phenomenon: they slowly lose syntactic and semantic properties of language as they only focus on solving the task. In this paper, we propose a generic approach to counter language drift called Seeded Iterated Learning (SIL). We periodically refine a pretrained student agent by imitating data sampled from a newly generated teacher agent. At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion. SIL requires neither external syntactic constraints nor semantic knowledge, making it a valuable task-agnostic finetuning protocol. We evaluate SIL in a toy-setting Lewis Game, and then scale it up to the translation game with natural language. In both settings, SIL helps counter language drift and improves task completion compared to baselines.

2020-11-20

Proceedings of the 37th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press
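
The teacher/student cycle described in the SIL abstract above can be sketched as a generic loop. This is a minimal sketch under stated assumptions: the function `seeded_iterated_learning` and the three callables it takes (`finetune_on_task`, `sample_data`, `imitate`) are hypothetical placeholders for this example, and the toy agent is a dictionary rather than a dialogue model.

```python
import copy
from typing import Any, Callable, List


def seeded_iterated_learning(
    student: Any,
    finetune_on_task: Callable[[Any], Any],
    sample_data: Callable[[Any], List[Any]],
    imitate: Callable[[Any, List[Any]], Any],
    generations: int,
) -> Any:
    """Run the copy -> finetune -> sample -> imitate cycle for a number of generations."""
    for _ in range(generations):
        # The teacher starts as a copy of the current student.
        teacher = copy.deepcopy(student)
        # The teacher is finetuned to maximize task completion.
        teacher = finetune_on_task(teacher)
        # The student is refined by imitating data sampled from the teacher.
        data = sample_data(teacher)
        student = imitate(student, data)
    return student


# Toy stand-ins (assumptions): an "agent" is a dict holding one number.
def toy_finetune(agent: dict) -> dict:
    agent["skill"] += 1.0
    return agent


def toy_sample(agent: dict) -> List[float]:
    return [agent["skill"]]


def toy_imitate(agent: dict, data: List[float]) -> dict:
    # Move halfway toward the teacher's behaviour.
    return {"skill": (agent["skill"] + data[0]) / 2.0}


seed = {"skill": 0.0}
final = seeded_iterated_learning(seed, toy_finetune, toy_sample, toy_imitate, generations=3)
print(final)
```

The deep copy keeps the teacher's task finetuning from modifying the student directly; the student changes only through imitation.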

Supervised Seeded Iterated Learning for Interactive Language Learning

Yuchen Lu

Soumye Singhal

Florian Strub

Olivier Pietquin

Aaron Courville

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Recall Traces: Backtracking Models for Efficient Reinforcement Learning

Anirudh Goyal

Philemon Brakel

William Fedus

Soumye Singhal

Timothy Lillicrap

Sergey Levine

Hugo Larochelle

Yoshua Bengio

In many environments only a tiny subset of all states yields high reward. In these cases, few of the interactions with the environment provide a relevant learning signal. Hence, we may want to preferentially train on those high-reward states and the probable trajectories leading to them. To this end, we advocate for the use of a backtracking model that predicts the preceding states that terminate at a given high-reward state. We can train a model which, starting from a high value state (or one that is estimated to have high value), predicts and samples which (state, action)-tuples may have led to that high value state. These traces of (state, action) pairs, which we refer to as Recall Traces, sampled from this backtracking model starting from a high value state, are informative as they terminate in good states, and hence we can use these traces to improve a policy. We provide a variational interpretation for this idea and a practical algorithm in which the backtracking model samples from an approximate posterior distribution over trajectories which lead to large rewards. Our method improves the sample efficiency of both on- and off-policy RL algorithms across several environments and tasks.

2018-12-31

International Conference on Learning Representations (poster)

doi.org

openreview.net
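
The idea of sampling a trace backwards from a high-value state, as described in the Recall Traces abstract above, can be illustrated with a short sketch. Everything here is an assumption made for illustration: `sample_recall_trace` and the `backward_step` callable are hypothetical names, and the toy backward model is a deterministic rule on a 1-D chain rather than a learned generative model.

```python
from typing import Any, Callable, List, Tuple


def sample_recall_trace(
    high_value_state: Any,
    backward_step: Callable[[Any], Tuple[Any, Any]],
    length: int,
) -> List[Tuple[Any, Any]]:
    """Walk backwards from a high-value state, collecting (state, action) pairs.

    backward_step(state) is assumed to return (previous_state, action), where
    taking `action` in `previous_state` leads to `state`.
    """
    trace = []
    state = high_value_state
    for _ in range(length):
        prev_state, action = backward_step(state)
        trace.append((prev_state, action))
        state = prev_state
    # Reverse so the trace reads forward in time and ends next to the high-value state.
    trace.reverse()
    return trace


# Toy backward model (assumption): on a 1-D chain, state s is reached from s - 1 by "right".
def toy_backward_step(state: int) -> Tuple[int, str]:
    return state - 1, "right"


trace = sample_recall_trace(5, toy_backward_step, length=3)
print(trace)
```

The resulting (state, action) pairs could then serve as imitation targets when updating a policy, which is the use the abstract describes.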
