Tom Stanic

Stagiaire de recherche - UdeM

Superviseur⋅e principal⋅e

Gauthier Gidel

Sujets de recherche

Apprentissage par renforcement

Site web

GitHub

Publications

Beyond Reward Maximization: Evaluating the Diversity of Trajectories in Reinforcement Learning with Temporal Vendi Score

In domains such as scientific discovery and automated design using reinforcement learning (RL), the final task of an agent should extend bey… (voir plus)ond maximising a single scalar reward; it requires identifying diverse sets of high-quality trajectories to uncover distinct solutions that can provide novel insights on how to solve the problems of interest and transfer robustly from simulation to the real world. However, the RL literature currently lacks a holistic, domain-agnostic standard for measuring trajectory diversity. Existing metrics have been developed to improve exploration at training time but not to evaluate and compare diversity induced by different agents, rendering cross-method comparisons inconsistent and challenging. To address this, we introduce the Temporal Vendi Score (TVS), a novel metric designed to evaluate the diversity of an RL agent by computing the entropy of the eigenvalues' similarity matrix of sampled trajectories. Unlike previous approaches, our metric captures the behavioural diversity of trajectories by accounting for both the sequential nature of state visitations and the temporal structure of the underlying MDP, rather than relying on order-agnostic state comparisons. We validate the TVS on simple environments where we can control the number of different ways a problem can be solved, demonstrating that it provides a more robust, semantically meaningful ranking of diversity than standard baselines. We then show that our metric can scale to a high-dimensional, continuous environment.

2026-03-01

LLA @ International Conference on Learning Representations (poster)

openreview.net

Publications du Fellowship en politiques de l'IA

La plateforme Mila Ventures

Boussole des politiques en IA

Tom Stanic

Publications

Publications du Fellowship en politiques de l'IA

La plateforme Mila Ventures

Boussole des politiques en IA

Mots-clés populaires:

Tom Stanic

Publications