Yudong Luo

Postdoctorate - HEC

Principal supervisor

Erick Delage

Research topics

Reinforcement learning

Publications

Actor-Critic Algorithm for Dynamic Expectile and CVaR

Yudong Luo

Erick Delage

Optimizing dynamic risk with stochastic policies is challenging in both policy updates and value learning. The former typically requires transition perturbation, while the latter may rely on model-based approaches. To address these challenges, we propose a surrogate policy gradient without transition perturbation under softmax policy parameterization. We further develop model-free value learning methods for dynamic expectile and conditional value-at-risk by leveraging elicitability. Finally, inspired by Expected SARSA and Expected Policy Gradient, a model-free off-policy actor-critic algorithm is constructed. Empirical results in domains with verifiable risk-averse behavior show that our algorithm can learn risk-averse policy and consistently outperforms other existing methods.
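The abstract's appeal to elicitability can be illustrated with a static, single-distribution example: an expectile is the minimizer of an asymmetric squared loss, so it can be estimated from samples alone. The sketch below is only an illustration under that assumption, not the paper's dynamic algorithm; the function names `expectile` and `expectile_loss` and the iteration scheme are hypothetical choices made here.

```python
import numpy as np

def expectile_loss(samples, e, tau):
    """Asymmetric squared loss whose minimizer over e is the tau-expectile."""
    x = np.asarray(samples, dtype=float)
    # Weight tau on samples above e, (1 - tau) on samples at or below e.
    w = np.where(x > e, tau, 1.0 - tau)
    return float(np.mean(w * (x - e) ** 2))

def expectile(samples, tau, iters=100, tol=1e-12):
    """Estimate the tau-expectile by iterating the first-order condition."""
    x = np.asarray(samples, dtype=float)
    e = float(x.mean())  # tau = 0.5 recovers the mean, so start there
    for _ in range(iters):
        w = np.where(x > e, tau, 1.0 - tau)
        e_new = float(np.sum(w * x) / np.sum(w))  # weighted-mean fixed point
        converged = abs(e_new - e) < tol
        e = e_new
        if converged:
            break
    return e

returns = [0.0, 1.0, 2.0, 3.0, 4.0]
mean_like = expectile(returns, 0.5)    # coincides with the sample mean
pessimistic = expectile(returns, 0.1)  # pulled toward the low outcomes
```

With returns interpreted as rewards, a level `tau` below 0.5 weights shortfalls more heavily, which is one common way to express risk aversion.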

2026-05-07

arXiv (preprint)

doi.org

arxiv.org

Boosting CVaR Policy Optimization with Quantile Gradients

Yudong Luo

Erick Delage

Optimizing Conditional Value-at-Risk (CVaR) using policy gradient (a.k.a. CVaR-PG) faces significant challenges of sample inefficiency. This inefficiency stems from the fact that it focuses on tail-end performance and overlooks many sampled trajectories. We address this problem by augmenting CVaR with an expected quantile term. Quantile optimization admits a dynamic programming formulation that leverages all sampled data, thus improving sample efficiency. This does not alter the CVaR objective since CVaR corresponds to the expectation of quantile over the tail. Empirical results in domains with verifiable risk-averse behavior show that our algorithm within the Markovian policy class substantially improves upon CVaR-PG and consistently outperforms other existing methods.
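The statement that CVaR corresponds to the expectation of the quantile over the tail can be checked numerically on a small sample. The sketch below assumes a lower-tail CVaR of returns at level `alpha` and an inverse-CDF definition of the empirical quantile; the function names and the sample values are hypothetical, and this is not the paper's algorithm.

```python
import numpy as np

def empirical_quantile(sorted_x, u):
    """Inverse-CDF empirical quantile: smallest x with CDF(x) >= u."""
    n = len(sorted_x)
    # Small guard against floating-point noise in u * n.
    idx = int(np.ceil(u * n - 1e-9)) - 1
    return sorted_x[min(max(idx, 0), n - 1)]

def cvar_tail_mean(samples, alpha):
    """Lower-tail CVaR as the mean of the worst alpha-fraction of samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * len(x) - 1e-9)))
    return float(x[:k].mean())

def cvar_from_quantiles(samples, alpha, grid=1000):
    """Lower-tail CVaR as the average of quantiles q_u for u in (0, alpha)."""
    x = np.sort(np.asarray(samples, dtype=float))
    # Midpoint grid over (0, alpha) approximates (1/alpha) * integral of q_u du.
    us = alpha * (np.arange(grid) + 0.5) / grid
    return float(np.mean([empirical_quantile(x, u) for u in us]))

returns = [5, -3, 2, 8, -7, 1, 4, 0, 6, 3]
tail_mean = cvar_tail_mean(returns, 0.2)          # mean of the two worst returns
quantile_avg = cvar_from_quantiles(returns, 0.2)  # agrees with tail_mean
```

Only the two worst of the ten samples enter the tail mean at `alpha = 0.2`, which illustrates the sample inefficiency the abstract describes, whereas quantile estimates draw on every sample.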

2026-01-28

arXiv (preprint)

doi.org

arxiv.org
