Sumana Basu

PhD - McGill

Principal supervisor

Doina Precup

Co-supervisor

Adriana Romero Soriano

Research topics

Reinforcement Learning

Website

Google Scholar

GitHub

Publications

Reward the Reward Designer: Making Reinforcement Learning Useful for Clinical Decision Making

Sumana Basu

Adriana Romero Soriano

Doina Precup

2025-09-22

NeurIPS.cc/2025/Workshop/WiML (published)

openreview.net

On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects

Sumana Basu

M. Legault

Adriana Romero Soriano

Doina Precup

Drug dosing is an important application of AI, which can be formulated as a Reinforcement Learning (RL) problem. In this paper, we identify two major challenges of using RL for drug dosing: delayed and prolonged effects of administering medications, which break the Markov assumption of the RL framework. We focus on prolongedness and define PAE-POMDP (Prolonged Action Effect-Partially Observable Markov Decision Process), a subclass of POMDPs in which the Markov assumption does not hold specifically due to prolonged effects of actions. Motivated by the pharmacology literature, we propose a simple and effective approach to converting drug dosing PAE-POMDPs into MDPs, enabling the use of existing RL algorithms to solve such problems. We validate the proposed approach on a toy task and a challenging glucose control task, for which we devise a clinically-inspired reward function. Our results demonstrate that: (1) the proposed method to restore the Markov assumption leads to significant improvements over a vanilla baseline; (2) the approach is competitive with recurrent policies, which may inherently capture the prolonged effect of actions; (3) it is remarkably more time and memory efficient than the recurrent baseline and hence more suitable for real-time dosing control systems; and (4) it exhibits favourable qualitative behavior in our policy analysis.

2023-01-02

arXiv (preprint)

doi.org

arxiv.org

Towards Safe Mechanical Ventilation Treatment Using Deep Offline Reinforcement Learning

Flemming Kondrup

Thomas Jiralerspong

Elaine Lau

Nathan de Lara

Jacob A. Shkrob

My Duc Tran

Doina Precup