Chitwan Saharia

Combating False Negatives in Adversarial Imitation Learning

Léonard Boussioux

David Y. T. Hui

Maxime Chevalier-Boisvert

In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Despite the fact that the task is successfully accomplished in some of the agent's trajectories, the discriminator is trained to output low values for them. We hypothesize that this inconsistent training signal for the discriminator can impede its learning, and consequently lead to worse overall performance of the agent. We show experimental evidence for this hypothesis and that the ‘False Negatives’ (i.e. successful agent episodes) significantly hinder adversarial imitation learning, which is the first contribution of this paper. Then, we propose a method to alleviate the impact of false negatives and test it on the BabyAI environment. This method consistently improves sample efficiency over the baselines by at least an order of magnitude.

2021-07-17

2021 International Joint Conference on Neural Networks (IJCNN) (published)

doi.org

arxiv.org

Combating False Negatives in Adversarial Imitation Learning (Student Abstract)

Konrad Żołna

Chitwan Saharia

Léonard Boussioux

David Y. T. Hui

Maxime Chevalier-Boisvert

Dzmitry Bahdanau

Yoshua Bengio

2020-04-02

AAAI Conference on Artificial Intelligence (published)

doi.org

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

Maxime Chevalier-Boisvert

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

2018-12-31

ICLR.cc/2019/Conference (poster)

openreview.net

BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Maxime Chevalier-Boisvert

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

2018-10-17

arXiv.org (preprint)

dblp.uni-trier.de
