Nicolas Thome

Research Collaborator - Sorbonne Université

Principal Supervisor

Pablo Piantanida

Research Topics

Medical Machine Learning

Machine Learning for Physical Sciences

Multimodal Learning

Deep Learning

Robotics

AI Safety

Computer Vision

Website

Google Scholar

Publications

Revisiting the Learning Objectives of Vision-Language Reward Models

Simon Roy

Samuel Barbeau

Giovanni Beltrame

Christian Desrosiers

Nicolas Thome

Learning generalizable reward functions is a core challenge in embodied intelligence. Recent work leverages contrastive vision-language models (VLMs) to obtain dense, domain-agnostic rewards without human supervision. These methods adapt VLMs into reward models through increasingly complex learning objectives, yet meaningful comparison remains difficult due to differences in training data, architectures, and evaluation settings. In this work, we isolate the impact of the learning objective by evaluating recent VLM-based reward models under a unified framework with identical backbones, finetuning data, and evaluation environments. Using Meta-World tasks, we assess modeling accuracy by measuring consistency with the ground-truth reward and correlation with expert progress. Remarkably, we show that a simple triplet loss outperforms state-of-the-art methods, suggesting that much of the improvement in recent approaches could be attributed to differences in data and architectures.

2025-12-19

arXiv (preprint)

doi.org

arxiv.org
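
The abstract above credits a "simple triplet loss" without giving its formulation. As a rough illustration only, the generic triplet margin loss can be sketched as below. This is the textbook form, not necessarily the authors' objective: the Euclidean distance, the margin value, the function names, and the toy 2-D embeddings are all assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical embeddings (illustrative values, not from the paper):
# `goal` stands in for a task embedding, `near_goal` for a frame close
# to task completion, and `far` for a frame early in the episode.
goal = [1.0, 0.0]
near_goal = [0.9, 0.1]
far = [0.0, 1.0]

# Correct ordering: the near-goal frame is the positive.
loss_ok = triplet_loss(goal, near_goal, far, margin=0.5)

# Swapped ordering: the far frame is wrongly treated as the positive.
loss_bad = triplet_loss(goal, far, near_goal, margin=0.5)

print(loss_ok, loss_bad)
```

In this toy setup the correctly ordered triplet incurs no loss while the swapped one is penalized, which is the ranking behavior a reward model trained this way would rely on.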

