Accueil

Inspirer le développement de l'intelligence artificielle au bénéfice de tous·tes

Un professeur s'entretient avec ses étudiants dans un café/lounge.

Situé au cœur de l’écosystème québécois en intelligence artificielle (IA), Mila rassemble une communauté de plus de 1200 personnes spécialisées en apprentissage automatique et dédiées à l’excellence scientifique et l’innovation.

À propos

À la une
À la une
À la une

Corps professoral

Fondé en 1993 par le professeur Yoshua Bengio, Mila regroupe aujourd'hui plus de 140 professeur·e·s affilié·e·s à l'Université de Montréal, l'Université McGill, Polytechnique Montréal et HEC Montréal. L'institut accueille également des professeur·e·s de l'Université Laval, de l'Université de Sherbrooke, de l'École de technologie supérieure (ÉTS) et de l'Université Concordia.

Consultez l'annuaire en ligne

Photo de Yoshua Bengio

Publications récentes

Sparse Decomposition of Graph Neural Networks
Yaochen Hu
Mai Zeng
Ge Zhang
Pavel Rumiantsev
Liheng Ma
Yingxue Zhang
The BrowserGym Ecosystem for Web Agent Research
Thibault Le Sellier de Chezelles
Alexandre Lacoste
Massimo Caccia
Léo Boisvert
Megh Thakkar
Tom Marty
Rim Assouel
Sahar Omidi Shayegan
Lawrence Keunho Jang
Xing Han Lu
Ori Yoran
Dehan Kong
Frank F. Xu
Graham Neubig
Russ Salakhutdinov
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those leveraging a… (voir plus)utomation and Large Language Models (LLMs) for web interaction tasks. Many existing benchmarks suffer from fragmentation and inconsistent evaluation methodologies, making it challenging to achieve reliable comparisons and reproducible results. BrowserGym aims to solve this by providing a unified, gym-like environment with well-defined observation and action spaces, facilitating standardized evaluation across diverse benchmarks. Combined with AgentLab, a complementary framework that aids in agent creation, testing, and analysis, BrowserGym offers flexibility for integrating new benchmarks while ensuring consistent evaluation and comprehensive experiment management. This standardized approach seeks to reduce the time and complexity of developing web agents, supporting more reliable comparisons and facilitating in-depth analysis of agent behaviors, and could result in more adaptable, capable agents, ultimately accelerating innovation in LLM-driven automation. As a supporting evidence, we conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across all benchmarks currently available in BrowserGym. Among other findings, our results highlight a large discrepancy between OpenAI and Anthropic's latests models, with Claude-3.5-Sonnet leading the way on almost all benchmarks, except on vision-related tasks where GPT-4o is superior. Despite these advancements, our results emphasize that building robust and efficient web agents remains a significant challenge, due to the inherent complexity of real-world web environments and the limitations of current models.
NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild
Shikhar Murty
Hao Zhu
Christopher D Manning
We introduce NNetNav, a method for unsupervised interaction with websites that generates synthetic demonstrations for training browser agent… (voir plus)s. Given any website, NNetNav produces these demonstrations by retroactively labeling action sequences from an exploration policy. Most work on training browser agents has relied on expensive human supervision, and the limited prior work on such interaction-based techniques has failed to provide effective search through the exponentially large space of exploration. In contrast, NNetNav exploits the hierarchical structure of language instructions to make this search more tractable: Complex instructions are typically decomposable into simpler sub-tasks, allowing NNetNav to automatically prune interaction episodes when an intermediate trajectory cannot be annotated with a meaningful sub-task. \texttt{LLama-3.1-8b} finetuned on 10k NNetNav self-generated demonstrations obtains over 16\% success rate on WebArena, and 35\% on WebVoyager, an improvement of 15pts and 31pts respectively over zero-shot \texttt{LLama-3.1-8b}, outperforming zero-shot GPT-4 and reaching the state-of-the-art among unsupervised methods, for both benchmarks.
NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild
Shikhar Murty
Hao Zhu
Christopher D Manning

IA pour l'humanité

Le développement socialement responsable et bénéfique de l'IA est une dimension fondamentale de la mission de Mila. En tant que chef de file, nous souhaitons contribuer au dialogue social et au développement d'applications qui seront bénéfiques pour la société.

En savoir plus

Une personne regarde un ciel étoilé.