Hugo Delhaye

Human-AI Alignment of Learning Trajectories in Video Games: a continual RL benchmark proposal

Yann Harel

François Paugam

We propose a design for a continual reinforcement learning (CRL) benchmark called GHAIA, centered on human-AI alignment of learning trajectories in structured video game environments. Using \textit{Super Mario Bros.} as a case study, gameplay is decomposed into short, annotated scenes organized into diverse task sequences based on gameplay patterns and difficulty. Evaluation protocols measure both plasticity and stability, with flexible revisit and pacing schedules. A key innovation is the inclusion of high-resolution human gameplay data collected under controlled conditions, enabling direct comparison of human and agent learning. In addition to adapting classical CRL metrics like forgetting and backward transfer, we introduce semantic transfer metrics capturing learning over groups of scenes sharing similar game patterns. We demonstrate the feasibility of our approach on human and agent data, and discuss key aspects of the first release for community input.

2025-06-20

rl-conference.cc/RLC/2025/Workshop/RLVG (published)

openreview.net

Human-AI Alignment of Learning Trajectories in Video Games: a continual RL benchmark proposal

Yann Harel

Lune Bellec

François Paugam

Hugo Delhaye

Audrey Durand

We propose a design for a continual reinforcement learning (CRL) benchmark called GHAIA, centered on human-AI alignment of learning trajectories in structured video game environments. Using \textit{Super Mario Bros.} as a case study, gameplay is decomposed into short, annotated scenes organized into diverse task sequences based on gameplay patterns and difficulty. Evaluation protocols measure both plasticity and stability, with flexible revisit and pacing schedules. A key innovation is the inclusion of high-resolution human gameplay data collected under controlled conditions, enabling direct comparison of human and agent learning. In addition to adapting classical CRL metrics like forgetting and backward transfer, we introduce semantic transfer metrics capturing learning over groups of scenes sharing similar game patterns. We demonstrate the feasibility of our approach on human and agent data, and discuss key aspects of the first release for community input.

2025-06-20

rl-conference.cc/RLC/2025/Workshop/RLVG (accepted)

openreview.net