2021-12
Neural Algorithmic Reasoners are Implicit Planners
2021-06
Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation
2021-05
XLVIN: eXecuted Latent Value Iteration Nets
2021-03
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
2021-01
TDprop: Does Adaptive Optimization With Jacobi Preconditioning Help Temporal Difference Learning?
2020-09
Graph neural induction of value iteration
2020-07
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?
2020-04
Options of Interest: Temporal Abstraction with Interest Functions
2020-02
Policy Evaluation Networks
2019-12
Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
2018-11
The Barbados 2018 List of Open Issues in Continual Learning
2018-07
Convergent Tree Backup and Retrace with Function Approximation
2018-03
Constructing Temporal Abstractions Autonomously in Reinforcement Learning
2018-02
Learning with Options that Terminate Off-Policy
When Waiting is not an Option: Learning Options with a Deliberation Cost
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
2018-01
Learning Robust Options
Publications collected and formatted using Paperoni