Tristan Deleu

Alumni

Website

Google Scholar

Publications

Learning Powerful Policies by Using Consistent Dynamics Model

Sergey Levine

Model-based Reinforcement Learning approaches have the promise of being sample efficient. Much of the progress in learning dynamics models in RL has been made by learning models via supervised learning. But traditional model-based approaches lead to "compounding errors" when the model is unrolled step by step. Essentially, the state transitions that the learner predicts (by unrolling the model for multiple steps) and the state transitions that the learner experiences (by acting in the environment) may not be consistent. There is enough evidence that humans build a model of the environment, not only by observing the environment but also by interacting with the environment. Interaction with the environment allows humans to carry out experiments: taking actions that help uncover true causal relationships which can be used for building better dynamics models. Analogously, we would expect such interactions to be helpful for a learning agent while learning to model the environment dynamics. In this paper, we build upon this intuition by using an auxiliary cost function to ensure consistency between what the agent observes (by acting in the real world) and what it imagines (by acting in the "learned" world). We consider several tasks (MuJoCo-based control tasks and Atari games) and show that the proposed approach helps to train powerful policies and better dynamics models.

2019-06-11

arXiv (preprint)

arxiv.org

The effects of negative adaptation in Model-Agnostic Meta-Learning

Tristan Deleu

Yoshua Bengio

The capacity of meta-learning algorithms to quickly adapt to a variety of tasks, including ones they did not experience during meta-training, has been a key factor in the recent success of these methods on few-shot learning problems. This particular advantage of using meta-learning over standard supervised or reinforcement learning is only well founded under the assumption that the adaptation phase does improve the performance of our model on the task of interest. However, in the classical framework of meta-learning, this constraint is only mildly enforced, if not at all, and we only see an improvement on average over a distribution of tasks. In this paper, we show that the adaptation in an algorithm like MAML can significantly decrease the performance of an agent in a meta-reinforcement learning setting, even on a range of meta-training tasks.

2018-12-05

arXiv (preprint)

arxiv.org
