
Geoff Gordon

Affiliate Member
Professor, Carnegie Mellon University, Machine Learning Department

Biography

Geoffrey Gordon is a professor in the Machine Learning Department at Carnegie Mellon University, where he has also served as interim department head and as associate department head for education.

His research focuses on artificial intelligence systems capable of long-term thinking, in particular reasoning ahead to solve a problem, planning a sequence of actions, or inferring unobserved properties from observations. More specifically, he explores how to combine machine learning with these long-term reasoning tasks.

Geoffrey Gordon received a bachelor's degree in computer science from Cornell University in 1991 and a PhD in computer science from Carnegie Mellon University in 1999. His research interests include artificial intelligence, statistical learning, educational data, game theory, multi-robot systems, and general-sum domains. He was previously a visiting professor in the Computer Science Department at Stanford University and principal scientist at Burning Glass Technologies in San Diego.

Publications

A Reduction from Reinforcement Learning to No-Regret Online Learning
Ching-An Cheng
Remi Tachet des Combes
Byron Boots
We present a reduction from reinforcement learning (RL) to no-regret online learning based on the saddle-point formulation of RL, by which "any" online algorithm with sublinear regret can generate policies with provable performance guarantees. This new perspective decouples the RL problem into two parts: regret minimization and function approximation. The first part admits a standard online-learning analysis, and the second part can be quantified independently of the learning algorithm. Therefore, the proposed reduction can be used as a tool to systematically design new RL algorithms. We demonstrate this idea by devising a simple RL algorithm based on mirror descent and the generative-model oracle. For any …
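The abstract names mirror descent with a generative-model oracle as the building block of the proposed algorithm. As a rough illustration of that building block only, and not of the paper's actual RL algorithm, the sketch below runs entropic mirror descent over a probability simplex; the `grad_fn` oracle and the toy linear loss are assumptions made up for the example.

```python
import numpy as np

def mirror_descent_simplex(grad_fn, dim, steps=100, eta=0.1):
    """Entropic mirror descent over the probability simplex.

    A generic illustration of the mirror-descent update; `grad_fn` is a
    hypothetical oracle returning (an estimate of) the loss gradient at
    the current iterate.
    """
    x = np.full(dim, 1.0 / dim)       # start at the uniform distribution
    for _ in range(steps):
        g = grad_fn(x)                # gradient feedback from the oracle
        x = x * np.exp(-eta * g)      # multiplicative-weights update
        x = x / x.sum()               # re-normalize onto the simplex
    return x

if __name__ == "__main__":
    # toy usage: minimize a linear loss <c, x> over the simplex
    c = np.array([0.3, 0.1, 0.7])
    x_star = mirror_descent_simplex(lambda x: c, dim=3)
    print(x_star)  # mass concentrates on the coordinate with the smallest c
```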
Expressiveness and Learning of Hidden Quantum Markov Models
Sandesh M. Adhikary
Siddarth Srinivasan
Byron Boots
Extending classical probabilistic reasoning using the quantum mechanical view of probability has been of recent interest, particularly in the development of hidden quantum Markov models (HQMMs) to model stochastic processes. However, there has been little progress in characterizing the expressiveness of such models and learning them from data. We tackle these problems by showing that HQMMs are a special subclass of the general class of observable operator models (OOMs) that do not suffer from the negative probability problem by design. We also provide a feasible retraction-based learning algorithm for HQMMs using constrained gradient descent on the Stiefel manifold of model parameters. We demonstrate that this approach is faster and scales to larger models than previous learning algorithms.
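The abstract describes learning HQMM parameters via constrained gradient descent on the Stiefel manifold with a retraction. As a generic sketch of that optimization pattern only, not of the paper's objective or learning algorithm, the code below takes one Riemannian gradient step followed by a QR retraction; the dimensions, learning rate, and random placeholder "gradient" are assumptions for illustration.

```python
import numpy as np

def qr_retraction(W):
    """Retract a matrix back onto the Stiefel manifold (W^T W = I) via QR."""
    Q, R = np.linalg.qr(W)
    # fix the sign ambiguity of QR so the retraction is well defined
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q

def riemannian_step(W, euclid_grad, lr=0.01):
    """One constrained gradient-descent step with a QR retraction."""
    # project the Euclidean gradient onto the tangent space at W
    A = W.T @ euclid_grad
    riem_grad = euclid_grad - W @ (A + A.T) / 2.0
    return qr_retraction(W - lr * riem_grad)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = qr_retraction(rng.standard_normal((6, 3)))   # random Stiefel point
    G = rng.standard_normal((6, 3))                  # placeholder gradient
    W_next = riemannian_step(W, G)
    # the iterate stays on the manifold after the retraction
    print(np.allclose(W_next.T @ W_next, np.eye(3), atol=1e-8))  # True
```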