Randy Lefebvre

Optimal discounting for offline input-driven MDP

Offline reinforcement learning has gained a lot of popularity for its potential to solve industry challenges. However, real-world environmen… (voir plus)ts are often highly stochastic and partially observable, leading long-term planners to overfit to offline data in model-based settings. Input-driven Markov Decision Processes (IDMDPs) offer a way to work with some of the uncertainty by letting designers separate what the agent has control over (states) from what it cannot (inputs) in the environnement. These stochastic external inputs are often difficult to model. Under the assumption that the input model will be imperfect, we investigate the bias-variance tradeoff under shallow planning in IDMDPs. Paving the way to input-driven planning horizons, we also investigate the similarity of optimal planning horizons at different inputs given the structure of the input space.

2025-05-09

rl-conference.cc/RLC/2025/Conference (accepté)

openreview.net

Optimal discounting for offline input-driven MDP

Randy Lefebvre

Audrey Durand

Offline reinforcement learning has gained a lot of popularity for its potential to solve industry challenges. However, real-world environmen… (voir plus)ts are often highly stochastic and partially observable, leading long-term planners to overfit to offline data in model-based settings. Input-driven Markov Decision Processes (IDMDPs) offer a way to work with some of the uncertainty by letting designers separate what the agent has control over (states) from what it cannot (inputs) in the environnement. These stochastic external inputs are often difficult to model. Under the assumption that the input model will be imperfect, we investigate the bias-variance tradeoff under shallow planning in IDMDPs. Paving the way to input-driven planning horizons, we also investigate the similarity of optimal planning horizons at different inputs given the structure of the input space.

2025-05-09

rl-conference.cc/RLC/2025/Conference (publié)

openreview.net