Pablo Samuel Castro

Core Industry Member
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Scientist, Google DeepMind

Biography

Pablo Samuel Castro was born and raised in Quito, Ecuador, and moved to Montréal after high school to study at McGill University. He earned his PhD there, focusing on reinforcement learning, under the supervision of Doina Precup and Prakash Panangaden. He is a research scientist at Google DeepMind in Montréal. His main interest is fundamental research in reinforcement learning, and he regularly advocates for increasing the representation of people of Latin American origin in the research community. He is also an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. Beyond his interest in coding, artificial intelligence, and mathematics, Pablo Samuel is an active musician.

Current Students

Publications

A Geometric Perspective on Optimal Representations for Reinforcement Learning
Will Dabney
Robert Dadashi
Adrien Ali Taiga
Dale Schuurmans
Tor Lattimore
Clare Lyle
We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks. Our formulation considers adapting the representation to minimize the (linear) approximation of the value function of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We demonstrate that using value functions as auxiliary tasks corresponds to an expected-error relaxation of our formulation, with AVFs a natural candidate, and identify a close relationship with proto-value functions (Mahadevan, 2005). We highlight characteristics of AVFs and their usefulness as auxiliary tasks in a series of experiments on the four-room domain.
Dopamine: A Research Framework for Deep Reinforcement Learning
Subhodeep Moitra
Carles Gelada
Saurabh Kumar
Deep reinforcement learning (deep RL) research has grown significantly in recent years. A number of software offerings now exist that provide stable, comprehensive implementations for benchmarking. At the same time, recent deep RL research has become more diverse in its goals. In this paper we introduce Dopamine, a new research framework for deep RL that aims to support some of that diversity. Dopamine is open-source, TensorFlow-based, and provides compact and reliable implementations of some state-of-the-art deep RL agents. We complement this offering with a taxonomy of the different research objectives in deep RL research. While by no means exhaustive, our analysis highlights the heterogeneity of research in the field, and the value of frameworks such as ours.