Akash Karthikeyan

PhD - Université de Montréal

Supervisor

Pierre-Luc Bacon

Research Topics

Generative Models

Reinforcement Learning

Website

Google Scholar

GitHub

Publications

From Static Policies to Adaptive Priors in Offline Reinforcement Learning

Offline reinforcement learning (RL) has traditionally focused on learning policies for direct deployment under conservative objectives, where uncertainty outside the offline dataset is treated pessimistically to ensure robustness. We argue that this formulation becomes incomplete when an offline-trained policy is subsequently updated through online interaction, as increasingly occurs in modern intelligent systems through test-time adaptation and online fine-tuning. This position paper argues that, in such settings, the objective of offline RL should extend beyond immediate deployment and instead prioritize learning *adaptive policy priors*: policies that preserve the capacity to improve during subsequent interaction through memory, exploration, and self-correction. We formalize this perspective as *adaptive offline reinforcement learning* (AORL), distinguish it from offline-to-online RL, and explain why adaptability becomes important under distributional shift, limited dataset coverage, and changing test-time conditions. We further discuss Bayesian offline RL as one principled direction for constructing adaptive policy priors by preserving epistemic uncertainty over plausible environments. Finally, we outline connections, open challenges, and research directions for treating offline RL as preparation for future experience rather than as a static deployment problem.

2026-05-24

DEMO @ International Conference on Machine Learning (poster)

openreview.net

