
Geoff Gordon

Affiliate Member
Professor, Carnegie Mellon University, Department of Machine Learning

Biography

Geoffrey Gordon is an adjunct professor in the Machine Learning Department at Carnegie Mellon University. He previously served as interim department head and associate department head for education in Carnegie Mellon’s Machine Learning Department.

In general, Gordon’s interests lie in AI, statistical machine learning, educational data, game theory, multi-robot systems, and planning in probabilistic, adversarial, and general-sum domains.

More specifically, his research focuses on artificially intelligent systems capable of long-term thinking, such as reasoning ahead to solve a problem, planning a sequence of actions, or inferring unseen properties from observations. In particular, he looks at how to combine machine learning with these long-term thinking tasks.

Gordon received his BSc in computer science from Cornell University in 1991 and his PhD in computer science from Carnegie Mellon University in 1999. His previous appointments include visiting professor at Stanford's Computer Science Department and principal scientist at Burning Glass Technologies in San Diego.

Publications

A Reduction from Reinforcement Learning to No-Regret Online Learning
Ching-An Cheng
Remi Tachet des Combes
Byron Boots
We present a reduction from reinforcement learning (RL) to no-regret online learning based on the saddle-point formulation of RL, by which "any" online algorithm with sublinear regret can generate policies with provable performance guarantees. This new perspective decouples the RL problem into two parts: regret minimization and function approximation. The first part admits a standard online-learning analysis, and the second part can be quantified independently of the learning algorithm. Therefore, the proposed reduction can be used as a tool to systematically design new RL algorithms. We demonstrate this idea by devising a simple RL algorithm based on mirror descent and the generative-model oracle.
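
The reduction plugs a no-regret online learner into the RL loop. As a rough illustration of that building block, the sketch below shows entropic mirror descent (an exponentiated-gradient update) applied to a tabular policy using Q-value estimates of the kind one might draw from a generative-model oracle. This is a hypothetical sketch of mirror descent only, not the paper's algorithm; the array shapes and function names are assumptions.

```python
# Minimal sketch: entropic mirror descent on a tabular policy.
# All names (mirror_descent_policy_update, q_hat, etc.) are illustrative
# assumptions, not the paper's interface.
import numpy as np

def mirror_descent_policy_update(policy, q_values, step_size):
    """One exponentiated-gradient step per state.

    policy:    (S, A) array of per-state action probabilities.
    q_values:  (S, A) array of estimated action values (e.g., Monte Carlo
               estimates from a generative-model oracle).
    step_size: learning rate for the mirror-descent update.
    """
    # Multiplicative-weights update pi'(a) ~ pi(a) * exp(eta * Q(a)),
    # i.e., mirror descent under the negative-entropy regularizer.
    logits = np.log(policy + 1e-12) + step_size * q_values
    new_policy = np.exp(logits - logits.max(axis=1, keepdims=True))
    new_policy /= new_policy.sum(axis=1, keepdims=True)
    return new_policy

# Example: three states, two actions, uniform initial policy.
policy = np.full((3, 2), 0.5)
q_hat = np.array([[1.0, 0.0], [0.2, 0.8], [0.5, 0.5]])
for _ in range(100):
    policy = mirror_descent_policy_update(policy, q_hat, step_size=0.1)
print(policy.round(3))  # probability mass shifts toward higher-Q actions
```

In the reduction's framing, such a no-regret learner supplies the regret-minimization half of the problem, while the accuracy of the Q-value estimates is accounted for separately as function approximation.
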
Expressiveness and Learning of Hidden Quantum Markov Models
Sandesh M. Adhikary
Siddarth Srinivasan
Byron Boots
Extending classical probabilistic reasoning using the quantum mechanical view of probability has been of recent interest, particularly in the development of hidden quantum Markov models (HQMMs) to model stochastic processes. However, there has been little progress in characterizing the expressiveness of such models and learning them from data. We tackle these problems by showing that HQMMs are a special subclass of the general class of observable operator models (OOMs) that do not suffer from the negative probability problem by design. We also provide a feasible retraction-based learning algorithm for HQMMs using constrained gradient descent on the Stiefel manifold of model parameters. We demonstrate that this approach is faster and scales to larger models than previous learning algorithms.
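
The learning algorithm mentioned above relies on retraction-based gradient descent on the Stiefel manifold (matrices with orthonormal columns). The sketch below shows the generic ingredients of that technique, a tangent-space projection and a QR retraction, applied to a stand-in objective; it is not the HQMM likelihood or the paper's implementation, and every name in it is an assumption.

```python
# Minimal sketch: retraction-based gradient descent on the Stiefel manifold.
# The objective trace(X.T @ A @ X) is a stand-in, not the HQMM loss.
import numpy as np

def stiefel_project(X, G):
    """Project a Euclidean gradient G onto the tangent space at X
    (X has orthonormal columns, X.T @ X = I)."""
    XtG = X.T @ G
    sym = 0.5 * (XtG + XtG.T)
    return G - X @ sym

def qr_retract(X, V):
    """Retract a tangent step back onto the manifold via QR decomposition."""
    Q, R = np.linalg.qr(X + V)
    Q *= np.sign(np.sign(np.diag(R)) + 0.5)  # fix column signs for determinism
    return Q

rng = np.random.default_rng(0)
n, p = 6, 3
A = rng.standard_normal((n, n))
A = A + A.T                                   # symmetric stand-in objective
X = np.linalg.qr(rng.standard_normal((n, p)))[0]  # orthonormal initial point

for _ in range(200):
    grad = 2 * A @ X                          # Euclidean gradient of trace(X.T A X)
    rgrad = stiefel_project(X, grad)
    X = qr_retract(X, -0.01 * rgrad)          # descend while staying orthonormal

print(np.allclose(X.T @ X, np.eye(p), atol=1e-8))  # constraint preserved
```

The retraction is what keeps every iterate on the manifold, so the orthonormality constraint never has to be enforced by a separate projection step after the fact.
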