Glen Berseth

Biography

Glen Berseth is an associate professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal, a core academic member of Mila – Quebec Artificial Intelligence Institute, a Canada CIFAR AI Chair, and co-director of the Montréal Robotics and Embodied AI Lab (REAL). He was a postdoctoral researcher at Berkeley Artificial Intelligence Research (BAIR), where he worked with Sergey Levine. His research focuses on solving sequential decision-making problems (planning) for real-world autonomous learning systems (robots). It has spanned human-robot collaboration and reinforcement learning, as well as continual, multi-agent, and hierarchical learning and meta-learning. Glen Berseth has published in top venues in robotics, machine learning, and computer animation. He also teaches a course on robot learning at Université de Montréal and Mila, covering the most recent research on machine learning techniques for building generalist robots.

Current Students

PhD - UdeM

Research Master's - UdeM

Florence Cloutier

Research Master's - UdeM

Charlotte Cloutier

Collaborating researcher - Waterloo

ccloutie@uwaterloo.ca

Roger Creus-Castanyer

PhD - UdeM

Co-supervisor:

PhD - McGill

Principal supervisor:

Hsiu-Chin Lin

Léa Demeule

PhD - UdeM

Co-supervisor:

PhD - UdeM

Principal supervisor:

Liam Paull

Homayoun Honari

Collaborating researcher - UdeM

Adriana Knatchbull-Hugessen

PhD - UdeM

Artur Kuramshin

Research Master's - UdeM

Daniel Lawson

PhD - UdeM

Co-supervisor:

Postdoctorate - UdeM

Co-supervisor:

Research Master's - UdeM

Postdoctorate - UdeM

Co-supervisor:

Professional Master's - UdeM

Real-time reinforcement learning

Michael Przystupa

Research intern - UdeM

Esra'a Saleh

PhD - UdeM

Co-supervisor:

PhD - UdeM

Co-supervisor:

Blog posts

Image: two robots in a kitchen preparing dinner. One chops vegetables and the other makes an omelette.

June 20, 2025

by

Ivan Anokhin

Matthew Riemer

Rishav Rishav

Gopeshh Subbaraj

Glen Berseth

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

February 15, 2023

by

Jędrzej Orbik

Charles Sun

Coline Devin

Glen Berseth

Publications

Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View.

Raj Ghugare

Matthieu Geist

Benjamin Eysenbach

Some reinforcement learning (RL) algorithms have the capability of recombining together pieces of previously seen experience to solve a task never seen before during training. This oft-sought property is one of the few ways in which dynamic programming based RL algorithms are considered different from supervised learning (SL) based RL algorithms. Yet, recent RL methods based on off-the-shelf SL algorithms achieve excellent results without an explicit mechanism for stitching; it remains unclear whether those methods forgo this important stitching property. This paper studies this question in the setting of goal-reaching problems. We show that the desirable stitching property corresponds to a form of generalization: after training on a distribution of (state, goal) pairs, one would like to evaluate on (state, goal) pairs not seen together in the training data. Our analysis shows that this sort of generalization is different from i.i.d. generalization. This connection between stitching and generalization reveals why we should not expect existing RL methods based on SL to perform stitching, even in the limit of large datasets and models. We experimentally validate this result on carefully constructed datasets. This connection suggests a simple remedy, the same remedy for improving generalization in supervised learning: data augmentation. We propose a naive temporal data augmentation approach and demonstrate that adding it to RL methods based on SL enables them to stitch together experience so that they succeed in navigating between states and goals unseen together during training.

2024-01-16

ICLR.cc/2024/Conference (poster)

Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View

Raj Ghugare

Matthieu Geist

Benjamin Eysenbach

Some reinforcement learning (RL) algorithms can stitch pieces of experience to solve a task never seen before during training. This oft-sought property is one of the few ways in which RL methods based on dynamic programming differ from RL methods based on supervised learning (SL). Yet, certain RL methods based on off-the-shelf SL algorithms achieve excellent results without an explicit mechanism for stitching; it remains unclear whether those methods forgo this important stitching property. This paper studies this question for the problems of achieving a target goal state and achieving a target return value. Our main result is to show that the stitching property corresponds to a form of combinatorial generalization: after training on a distribution of (state, goal) pairs, one would like to evaluate on (state, goal) pairs not seen together in the training data. Our analysis shows that this sort of generalization is different from i.i.d. generalization. This connection between stitching and generalization reveals why we should not expect SL-based RL methods to perform stitching, even in the limit of large datasets and models. Based on this analysis, we construct new datasets to explicitly test for this property, revealing that SL-based methods lack this stitching property and hence fail to perform combinatorial generalization. Nonetheless, the connection between stitching and combinatorial generalization also suggests a simple remedy for improving generalization in SL: data augmentation. We propose a temporal data augmentation and demonstrate that adding it to SL-based methods enables them to successfully complete tasks not seen together during training. On a high level, this connection illustrates the importance of combinatorial generalization for data efficiency in time-series data for tasks beyond RL, like audio, video, or text.

2024-01-16

ICLR.cc/2024/Conference (poster)

Improving Intrinsic Exploration by Creating Stationary Objectives

Roger Creus Castanyer

Joshua Romoff

2024-01-16

ICLR.cc/2024/Conference (poster)

Intelligent Switching for Reset-Free RL

Darshan Patil

Janarthanan Rajendran

Sarath Chandar

In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable. The resetting assumption limits the potential of reinforcement learning in the real world, as providing resets to an agent usually requires the creation of additional handcrafted mechanisms or human interventions. Recent work aims to train agents (forward) with learned resets by constructing a second (backward) agent that returns the forward agent to the initial state. We find that the termination and timing of the transitions between these two agents are crucial for algorithm success. With this in mind, we create a new algorithm, Reset Free RL with Intelligently Switching Controller (RISC), which intelligently switches between the two agents based on the agent's confidence in achieving its current goal. Our new method achieves state-of-the-art performance on several challenging environments for reset-free RL.

2024-01-16

ICLR.cc/2024/Conference (poster)

Reasoning with Latent Diffusion in Offline Reinforcement Learning

Siddarth Venkatraman

Shivesh Khaitan

Ravi Tej Akella

John Dolan

Jeff Schneider

2024-01-16

ICLR.cc/2024/Conference (poster)

Searching for High-Value Molecules Using Reinforcement Learning and Transformers

Raj Ghugare

Santiago Miret

Adriana Hugessen

Mariano Phielipp

2024-01-16

ICLR.cc/2024/Conference (poster)

Adaptive Resolution Residual Networks

Léa Demeule

Mahtab Sandhu

We introduce Adaptive Resolution Residual Networks (ARRNs), a form of neural operator that enables the creation of networks for signal-based tasks that can be rediscretized to suit any signal resolution. ARRNs are composed of a chain of Laplacian residuals that each contain ordinary layers, which do not need to be rediscretizable for the whole network to be rediscretizable. ARRNs have the property of requiring a lower number of Laplacian residuals for exact evaluation on lower-resolution signals, which greatly reduces computational cost. ARRNs also implement Laplacian dropout, which encourages networks to become robust to low-bandwidth signals. ARRNs can thus be trained once at high resolution and then be rediscretized on the fly at a suitable resolution with great robustness.

2023-10-31

NeurIPS.cc/2023/Workshop/DLDE (published)

Improving Intrinsic Exploration by Creating Stationary Objectives

Roger Creus Castanyer

Joshua Romoff

2023-10-27

arXiv (preprint)

arxiv.org

Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning

Adriana Hugessen

Roger Creus Castanyer

2023-10-20

NeurIPS.cc/2023/Workshop/IMOL (oral)

Searching for High-Value Molecules Using Reinforcement Learning and Transformers

Raj Ghugare

Santiago Miret

Adriana Hugessen

Mariano Phielipp

Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However, RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. From this analysis, we discover unique insights in this problem space and show that ChemRLformer achieves state-of-the-art performance while being more straightforward than prior work by demystifying which design choices are actually helpful for text-based molecule design.

2023-10-04

arXiv (preprint)