Johan Samir Obando Ceron

PhD - UdeM

Principal supervisor

Aaron Courville

Co-supervisor

Pablo Samuel Castro

Research Topics

Reinforcement Learning

Deep Learning

Website

Google Scholar

GitHub

Publications

Bigger, Better, Faster: Human-level Atari with human-level efficiency

Max Schwarzer

Johan Samir Obando Ceron

Aaron Courville

Marc Gendron-Bellemare

Rishabh Agarwal

Pablo Samuel Castro

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the ALE. We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/bigger_better_faster.

2023-01-01

ICML (published)

doi.org

openreview.net

The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning

Johan Samir Obando Ceron

Marc Gendron-Bellemare

Pablo Samuel Castro

2023-01-01

Tiny Papers @ ICLR (published)

openreview.net

Variance Double-Down: The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning

Johan Samir Obando Ceron

Marc Gendron-Bellemare

Pablo Samuel Castro

In deep reinforcement learning, multi-step learning is almost unavoidable to achieve state-of-the-art performance. However, the increased variance that multistep learning brings makes it difficult to increase the update horizon beyond relatively small numbers. In this paper, we report the counterintuitive finding that decreasing the batch size parameter improves the performance of many standard deep RL agents that use multi-step learning. It is well-known that gradient variance decreases with increasing batch sizes, so obtaining improved performance by increasing variance on two fronts is a rather surprising finding. We conduct a broad set of experiments to better understand what we call the variance double-down phenomenon.

2022-12-09

NeurIPS.cc/2022/Workshop/DeepRL (unknown)

openreview.net
