Yudong Luo

Postdoctorate - HEC Montréal

Supervisor

Erick Delage

Research Topics

Reinforcement Learning

Publications

Actor-Critic Algorithm for Dynamic Expectile and CVaR

Yudong Luo

Erick Delage

Optimizing dynamic risk with stochastic policies is challenging in both policy updates and value learning. The former typically requires transition perturbation, while the latter may rely on model-based approaches. To address these challenges, we propose a surrogate policy gradient without transition perturbation under softmax policy parameterization. We further develop model-free value learning methods for dynamic expectile and conditional value-at-risk by leveraging elicitability. Finally, inspired by Expected SARSA and Expected Policy Gradient, a model-free off-policy actor-critic algorithm is constructed. Empirical results in domains with verifiable risk-averse behavior show that our algorithm can learn risk-averse policy and consistently outperforms other existing methods.
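The elicitability mentioned in the abstract can be illustrated on a static sample: the tau-expectile is the minimizer of an asymmetric squared loss, so it can be estimated from data alone, without a model of the distribution. The sketch below is only an illustration of that property, not the paper's dynamic value-learning method; the function name `expectile`, the iteration scheme, and the toy data are assumptions made for this example.

```python
import random


def expectile(xs, tau, iters=100, tol=1e-12):
    """Sample tau-expectile, 0 < tau < 1 (hypothetical helper).

    Minimizes sum_i |tau - 1{x_i <= e}| * (x_i - e)^2 over e by repeatedly
    recomputing a weighted mean with weights tau above e and 1 - tau
    at or below e.
    """
    e = sum(xs) / len(xs)  # start from the plain mean
    for _ in range(iters):
        weights = [tau if x > e else 1.0 - tau for x in xs]
        e_new = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e


if __name__ == "__main__":
    rng = random.Random(0)
    returns = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # toy data
    # tau < 0.5 weights low returns more heavily, giving a pessimistic value.
    print(expectile(returns, 0.1), expectile(returns, 0.5))
```

With tau = 0.5 the loss is symmetric and the estimate reduces to the sample mean; smaller tau pulls the estimate toward the lower tail of the returns.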

2026-05-07

arXiv (preprint)


Boosting CVaR Policy Optimization with Quantile Gradients

Yudong Luo

Erick Delage

Optimizing Conditional Value-at-risk (CVaR) using policy gradient (a.k.a CVaR-PG) faces significant challenges of sample inefficiency. This inefficiency stems from the fact that it focuses on tail-end performance and overlooks many sampled trajectories. We address this problem by augmenting CVaR with an expected quantile term. Quantile optimization admits a dynamic programming formulation that leverages all sampled data, thus improves sample efficiency. This does not alter the CVaR objective since CVaR corresponds to the expectation of quantile over the tail. Empirical results in domains with verifiable risk-averse behavior show that our algorithm within the Markovian policy class substantially improves upon CVaR-PG and consistently outperforms other existing methods.
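The statement that CVaR corresponds to the expectation of the quantile over the tail can be checked numerically on a sample of returns. The sketch below compares the mean of the worst alpha-fraction of samples with a midpoint-rule average of empirical quantiles over levels in (0, alpha). It is a static illustration only, not the paper's algorithm; all function names, the lower-tail convention, and the toy data are assumptions made for this example.

```python
import math
import random


def empirical_quantile(sorted_xs, u):
    """Empirical u-quantile, 0 < u <= 1: smallest sample x with F_n(x) >= u."""
    n = len(sorted_xs)
    idx = max(0, math.ceil(u * n) - 1)
    return sorted_xs[idx]


def cvar_tail_mean(xs, alpha):
    """Lower-tail CVaR as the mean of the worst alpha-fraction of returns.

    Assumes alpha * len(xs) is (close to) a positive integer.
    """
    k = round(alpha * len(xs))
    worst = sorted(xs)[:k]
    return sum(worst) / k


def cvar_from_quantiles(xs, alpha, grid=10000):
    """Approximates (1 / alpha) * integral_0^alpha q_u du with a midpoint rule."""
    sorted_xs = sorted(xs)
    total = 0.0
    for i in range(grid):
        u = alpha * (i + 0.5) / grid  # midpoint of the i-th sub-interval
        total += empirical_quantile(sorted_xs, u)
    return total / grid


if __name__ == "__main__":
    rng = random.Random(0)
    returns = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # toy data
    print(cvar_tail_mean(returns, 0.1), cvar_from_quantiles(returns, 0.1))
```

The tail mean uses only the worst samples, whereas each quantile is defined with respect to the whole sample, which is one way to read the abstract's point that a quantile-based formulation can make use of all sampled data.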

2026-01-28

arXiv (preprint)

