Santhoshi Ravichandran

Alumni

Website

Publications

How to Train Your LLM Web Agent: A Statistical Diagnosis

Dheeraj Vattikonda

Santhoshi Ravichandran

Emiliano Penaloza

Hadi Nekoei

Megh Thakkar

Thibault Le Sellier De Chezelles

Nicolas Gontier

Miguel Muñoz-Mármol

Sahar Omidi Shayegan

Stefania Raimondo

Xue Liu

Alexandre Drouin

Laurent Charlin

Alexandre Piché

Alexandre Lacoste

Massimo Caccia

LLM-based web agents have recently made significant progress, but much of it has occurred in closed-source systems, widening the gap with open-source alternatives. Progress has been held back by two key challenges: first, a narrow focus on single-step tasks that overlooks the complexity of multi-step web interactions; and second, the high compute costs required to post-train LLM-based web agents. To address this, we present the first statistically grounded study on compute allocation for LLM web-agent post-training. Our approach uses a two-stage pipeline, training a Llama 3.1 8B student to imitate a Llama 3.3 70B teacher via supervised fine-tuning (SFT), followed by on-policy reinforcement learning. We find this process highly sensitive to hyperparameter choices, making exhaustive sweeps impractical. To spare others from expensive trial-and-error, we sample 1,370 configurations and use bootstrapping to estimate effective hyperparameters. Our results show that combining SFT with on-policy RL consistently outperforms either approach alone on both WorkArena and MiniWob++. Further, this strategy requires only 55% of the compute to match the peak performance of pure SFT on MiniWob++, effectively pushing the compute-performance Pareto frontier, and is the only strategy that can close the gap with closed-source models.
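The abstract does not spell out the bootstrap procedure, so the following is only a minimal sketch of one way bootstrapping over randomly sampled configurations could be used to rank hyperparameter values. The function name `bootstrap_win_rate`, the `runs` record layout, and the `"score"` key are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def bootstrap_win_rate(runs, hparam, n_boot=1000, seed=0):
    """Estimate how often each value of `hparam` produces the best run.

    `runs` is a list of dicts (hypothetical layout): each dict holds the
    hyperparameter values of one sampled configuration plus a "score" key.
    """
    rng = random.Random(seed)
    wins = defaultdict(int)
    for _ in range(n_boot):
        # Resample the pool of runs with replacement.
        sample = [rng.choice(runs) for _ in runs]
        # Track the best score reached by each value of the hyperparameter.
        best = {}
        for run in sample:
            value = run[hparam]
            best[value] = max(best.get(value, float("-inf")), run["score"])
        # The value with the highest best score wins this resample.
        wins[max(best, key=best.get)] += 1
    return {value: count / n_boot for value, count in wins.items()}


# Toy data, invented purely for illustration.
toy_runs = [
    {"lr": 1e-4, "score": 0.8},
    {"lr": 1e-4, "score": 0.9},
    {"lr": 1e-3, "score": 0.1},
    {"lr": 1e-3, "score": 0.2},
]
rates = bootstrap_win_rate(toy_runs, "lr")
```

A value whose win rate stays high across resamples is a more trustworthy choice than one that won a single lucky run, which is the general motivation for using resampling when an exhaustive sweep is too expensive.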

2025-09-17

NeurIPS.cc/2025/Conference (poster)

doi.org

openreview.net

