
Glen Berseth

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Deep Learning
Reinforcement Learning
Robotics

Biography

Glen Berseth is an assistant professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal and a core academic member of Mila – Quebec Artificial Intelligence Institute.

He is a Canada CIFAR AI Chair and co-directs the Robotics and Embodied AI Lab (REAL). He was formerly a postdoctoral researcher at Berkeley Artificial Intelligence Research (BAIR), working with Sergey Levine.

Berseth’s previous and current research focuses on solving sequential decision-making problems (planning) for real-world autonomous learning systems (robots). More specifically, he works on human-robot collaboration and on reinforcement, continual, meta, multi-agent, and hierarchical learning.

He has published in top venues in robotics, machine learning, and computer animation. At Université de Montréal and Mila, he teaches a course on robot learning that covers the most recent research on machine learning techniques for creating generalist robots.

Current Students

PhD - Université de Montréal
Master's Research - Université de Montréal
Collaborating researcher - University of Waterloo
Master's Research - Université de Montréal
PhD - Université de Montréal
PhD - McGill University
PhD - Université de Montréal
PhD - Université de Montréal
Collaborating researcher - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
PhD - Université de Montréal
Postdoctorate - Université de Montréal
Master's Research - Université de Montréal
Postdoctorate - Université de Montréal
Professional Master's - Université de Montréal
Research Intern - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal

Publications

Adaptive Resolution Residual Networks
We introduce Adaptive Resolution Residual Networks (ARRNs), a form of neural operator for signal-based tasks that can be rediscretized to suit any signal resolution. ARRNs are composed of a chain of Laplacian residuals, each containing ordinary layers that do not themselves need to be rediscretizable for the whole network to be rediscretizable. ARRNs require fewer Laplacian residuals for exact evaluation on lower-resolution signals, which greatly reduces computational cost. ARRNs also implement Laplacian dropout, which encourages networks to become robust to low-bandwidth signals. ARRNs can thus be trained once at high resolution and then rediscretized on the fly at a suitable resolution with great robustness.
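The mechanics of rediscretization are easier to see in code. Below is a small, hypothetical PyTorch sketch of the idea, assuming a 1D signal, a simple resolution-based skip rule, and ordinary dropout standing in for Laplacian dropout; none of the names or layer choices come from the paper.

import math

import torch
import torch.nn as nn

class LaplacianResidual(nn.Module):
    # One residual: add detail predicted by a stack of ordinary layers.
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=1),
        )
        self.drop = nn.Dropout(0.1)  # stand-in for Laplacian dropout

    def forward(self, x):
        return x + self.drop(self.body(x))

class ARRNSketch(nn.Module):
    # Chain of residuals; fewer of them run at lower input resolution.
    def __init__(self, channels=8, n_residuals=4, base_len=64):
        super().__init__()
        self.residuals = nn.ModuleList(
            LaplacianResidual(channels) for _ in range(n_residuals))
        self.base_len = base_len  # resolution at which the full chain runs

    def forward(self, x):
        # Assumed skip rule: each halving of the input resolution drops
        # one residual, mimicking cheaper exact low-resolution evaluation.
        n_skip = max(0, int(math.log2(self.base_len / x.shape[-1])))
        for res in self.residuals[n_skip:]:
            x = res(x)
        return x

net = ARRNSketch()
print(net(torch.randn(1, 8, 64)).shape)  # full resolution: 4 residuals
print(net(torch.randn(1, 8, 16)).shape)  # quarter resolution: 2 residuals

The property the sketch mimics is that reducing the input resolution removes residuals from the chain, so low-resolution evaluation is cheaper without retraining.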
Improving Intrinsic Exploration by Creating Stationary Objectives
Roger Creus Castanyer
Joshua Romoff
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning
Adriana Hugessen
Roger Creus Castanyer
Searching for High-Value Molecules Using Reinforcement Learning and Transformers
Raj Ghugare
Santiago Miret
Adriana Hugessen
Mariano Phielipp
Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However, RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. From this analysis, we discover unique insights into this problem space and show that ChemRLformer achieves state-of-the-art performance while being more straightforward than prior work, by demystifying which design choices are actually helpful for text-based molecule design.
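As a rough illustration of "RL over text representations", here is a toy REINFORCE loop in PyTorch where a tiny causal transformer emits SMILES-like tokens; the vocabulary, reward function, and architecture are invented stand-ins, not ChemRLformer itself.

import torch
import torch.nn as nn

vocab = ["<bos>", "C", "O", "N", "(", ")", "=", "<eos>"]  # toy alphabet
V, T, D = len(vocab), 12, 32

class TinyPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, D)
        self.pos = nn.Embedding(T, D)
        layer = nn.TransformerEncoderLayer(
            D, nhead=4, dim_feedforward=64, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D, V)

    def forward(self, tokens):                        # tokens: (B, t)
        t = tokens.shape[1]
        h = self.emb(tokens) + self.pos(torch.arange(t))
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        return self.head(self.trunk(h, mask=mask))    # logits: (B, t, V)

def toy_reward(ids):
    # Placeholder property oracle (e.g. a docking score in the paper).
    return (ids == vocab.index("C")).float().mean(dim=1)

policy = TinyPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for step in range(50):
    tokens = torch.zeros(16, 1, dtype=torch.long)     # batch starts at <bos>
    logps = []
    for _ in range(T - 1):                            # autoregressive rollout
        dist = torch.distributions.Categorical(logits=policy(tokens)[:, -1])
        a = dist.sample()
        logps.append(dist.log_prob(a))
        tokens = torch.cat([tokens, a[:, None]], dim=1)
    R = toy_reward(tokens)                            # score whole strings
    loss = -(torch.stack(logps, 1).sum(1) * (R - R.mean())).mean()
    opt.zero_grad(); loss.backward(); opt.step()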
Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning
Jensen Gao
Siddharth Reddy
Anca Dragan
Sergey Levine
Adaptive interfaces can help users perform sequential decision-making tasks like robotic teleoperation given noisy, high-dimensional command signals (e.g., from a brain-computer interface). Recent advances in human-in-the-loop machine learning enable such systems to improve by interacting with users, but they tend to be limited by the amount of data they can collect from individual users in practice. In this paper, we propose a reinforcement learning algorithm that addresses this by training an interface to map raw command signals to actions using a combination of offline pre-training and online fine-tuning. To address the challenges posed by noisy command signals and sparse rewards, we develop a novel method for representing and inferring the user's long-term intent for a given trajectory. We primarily evaluate our method's ability to assist users who can only communicate through noisy, high-dimensional input channels through a user study in which 12 participants performed a simulated navigation task by using their eye gaze to modulate a 128-dimensional command signal from their webcam. The results show that our method enables successful goal navigation more often than a baseline directional interface by learning to denoise user command signals and provide shared autonomy assistance. We further evaluate on a simulated Sawyer pushing task with eye gaze control and on the Lunar Lander game with simulated user commands, and find that our method improves over baseline interfaces in these domains as well. Extensive ablation experiments with simulated user commands empirically motivate each component of our method.
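A minimal sketch of the offline-pretraining-plus-online-fine-tuning recipe, assuming a simple classifier interface and a REINFORCE-style update; it omits the paper's intent-inference machinery, and all data, dimensions, and names are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

CMD_DIM, N_ACTIONS = 128, 4     # e.g. a 128-d webcam gaze feature (assumed)
interface = nn.Sequential(nn.Linear(CMD_DIM, 64), nn.ReLU(),
                          nn.Linear(64, N_ACTIONS))
opt = torch.optim.Adam(interface.parameters(), lr=1e-3)

# Phase 1: offline pre-training on logged (command, action) pairs.
logged_cmds = torch.randn(512, CMD_DIM)              # stand-in for real logs
logged_acts = torch.randint(0, N_ACTIONS, (512,))
for _ in range(100):
    loss = F.cross_entropy(interface(logged_cmds), logged_acts)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: online fine-tuning from sparse task reward (a REINFORCE-style
# stand-in for the paper's method).
def env_step(action):
    return 1.0 if action == 2 else 0.0               # toy sparse reward

for _ in range(200):
    cmd = torch.randn(1, CMD_DIM)                    # noisy user command
    dist = torch.distributions.Categorical(logits=interface(cmd)[0])
    act = dist.sample()
    loss = -dist.log_prob(act) * env_step(act.item())
    opt.zero_grad(); loss.backward(); opt.step()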
Torque-Based Deep Reinforcement Learning for Task-and-Robot Agnostic Learning on Bipedal Robots Using Sim-to-Real Transfer
Donghyeon Kim
Mathew Schwartz
Jaeheung Park
In this letter, we review the question of which action space is best suited for controlling a real biped robot in combination with sim-to-real training. Position control has been popular, as it has been shown to be more sample efficient and intuitive to combine with other planning algorithms. However, position control requires gain tuning to achieve the best possible policy performance. We show that, instead, using a torque-based action space enables task-and-robot-agnostic learning with less parameter tuning and mitigates the sim-to-reality gap by taking advantage of torque control's inherent compliance. We also accelerate the training of the torque-based policy by pre-training it to remain upright by compensating for gravity. The letter showcases the first successful sim-to-real transfer of a torque-based deep reinforcement learning policy on a real human-sized biped robot.
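The gravity-compensation pre-training idea can be sketched in a few lines. The following NumPy toy regresses a linear torque policy toward a placeholder gravity vector g(q); the dynamics term and the policy form are assumptions, not the paper's setup.

import numpy as np

N_JOINTS = 12
rng = np.random.default_rng(0)

def gravity_vector(q):
    # Placeholder for the rigid-body gravity term g(q); a simulator or a
    # dynamics library would provide the real values.
    return 4.0 * np.sin(q)

# Torque action space: the policy output *is* the joint torque, so there
# are no PD gains to tune and compliance comes from torque control itself.
W = np.zeros((N_JOINTS, N_JOINTS))   # linear policy tau = W q (assumed)

# Pre-training: regress the policy toward gravity compensation so the robot
# can hold itself upright before any task reward is introduced.
for _ in range(2000):
    q = rng.normal(0.0, 0.2, N_JOINTS)               # sampled joint angles
    err = W @ q - gravity_vector(q)
    W -= 1e-2 * np.outer(err, q)                     # SGD on 0.5 * ||err||^2

q = rng.normal(0.0, 0.2, N_JOINTS)
print(np.abs(W @ q - gravity_vector(q)).max())       # small residual torque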
Maximum State Entropy Exploration using Predecessor and Successor Representations
Animals have a developed ability to explore that aids them in important tasks such as locating food, searching for shelter, and finding misplaced items. These exploration skills necessarily track where they have been, so that they can plan to find items with relative efficiency. Contemporary exploration algorithms often learn a less efficient exploration strategy because they either condition only on the current state or simply rely on making random open-loop exploratory moves. In this work, we propose …
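Since the abstract is cut off before the method is named, here is only a generic sketch of the stated goal: a count-based intrinsic bonus that drives the empirical state-visitation distribution toward uniform, i.e. toward maximum state entropy. It is not the paper's predecessor/successor-representation method.

import numpy as np

N_STATES = 10
counts = np.ones(N_STATES)           # visitation counts (tracks history)
rng = np.random.default_rng(0)

def intrinsic_reward(s):
    # r(s) = -log p_hat(s) rewards rarely visited states; maximizing it
    # pushes the visitation distribution toward uniform (max entropy).
    p_hat = counts / counts.sum()
    return -np.log(p_hat[s])

state = 0
for t in range(1000):
    move = 1 if rng.random() < 0.5 else -1   # stand-in for a learned policy
    state = (state + move) % N_STATES
    counts[state] += 1
    bonus = intrinsic_reward(state)  # would augment the task reward in RL

p = counts / counts.sum()
print(f"visitation entropy: {-(p * np.log(p)).sum():.3f}",
      f"(max {np.log(N_STATES):.3f})")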
Robust and Versatile Bipedal Jumping Control through Reinforcement Learning
Zhongyu Li
Xue Bin Peng
Pieter Abbeel
Sergey Levine
Koushil Sreenath
Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning
Zhongyu Li
Xue Bin Peng
Pieter Abbeel
Sergey Levine
Koushil Sreenath
Towards Learning to Imitate from a Single Video Demonstration
Agents that can learn to imitate from video observation alone, without direct access to state or action information, are more applicable to learning in the natural world. However, formulating a reinforcement learning (RL) agent that facilitates this goal remains a significant challenge. We approach this challenge using contrastive training to learn a reward function that compares an agent's behaviour with a single demonstration. We use a Siamese recurrent neural network architecture to learn rewards in space and time between motion clips while training an RL policy to minimize this distance. Through experimentation, we also find that including multi-task data and additional image encoding losses improves the temporal consistency of the learned rewards and, as a result, significantly improves policy learning. We demonstrate our approach on simulated humanoid, dog, and raptor agents in 2D and on a quadruped and a humanoid in 3D. We show that our method outperforms current state-of-the-art techniques in these environments and can learn to imitate from a single video demonstration.
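A hedged sketch of the learned-reward idea: a shared (Siamese) recurrent encoder embeds the agent's clip and the demonstration clip, and the RL reward is the negative embedding distance. Sizes and names are assumptions, and the contrastive training of the encoder itself is omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ClipEncoder(nn.Module):
    # Shared (Siamese) encoder: per-frame features -> GRU -> embedding.
    def __init__(self, frame_dim=64, hidden=128, emb=32):
        super().__init__()
        self.frame = nn.Sequential(nn.Linear(frame_dim, hidden), nn.ReLU())
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, emb)

    def forward(self, clip):                   # clip: (B, T, frame_dim)
        h, _ = self.rnn(self.frame(clip))
        return F.normalize(self.out(h[:, -1]), dim=-1)

encoder = ClipEncoder()                        # one network for both clips

def imitation_reward(agent_clip, demo_clip):
    # The policy maximizes this reward, i.e. it minimizes the learned
    # spatio-temporal distance to the demonstration clip.
    za, zd = encoder(agent_clip), encoder(demo_clip)
    return -(za - zd).pow(2).sum(dim=-1)

agent = torch.randn(1, 16, 64)   # 16 frames of agent observations (assumed)
demo = torch.randn(1, 16, 64)    # the single video demonstration
print(imitation_reward(agent, demo))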
Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot
Yandong Ji
Zhongyu Li
Yinan Sun
Xue Bin Peng
Sergey Levine
Koushil Sreenath
We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to consider the dynamics limitations and motion stability during the control of a dynamic legged robot. Moreover, we need to plan the shooting of a hard-to-model deformable ball, rolling on the ground with uncertain friction, to a desired location. In this paper, we propose a hierarchical framework that leverages deep reinforcement learning to train (a) a robust motion control policy that can track arbitrary motions and (b) a planning policy that decides the desired kicking motion for shooting a soccer ball to a target. We deploy the proposed framework on an A1 quadrupedal robot and enable it to accurately shoot the ball to random targets in the real world.
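The two-level structure can be sketched as follows: a high-level planning network maps the target location to a desired kicking motion once per kick, while a low-level control network tracks that motion at every step. Both networks and all dimensions here are illustrative assumptions, not the paper's trained policies.

import torch
import torch.nn as nn

MOTION_DIM, OBS_DIM, ACT_DIM = 8, 32, 12      # all dimensions are assumed

planner = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                        nn.Linear(64, MOTION_DIM))     # target -> kick motion
controller = nn.Sequential(nn.Linear(OBS_DIM + MOTION_DIM, 128), nn.ReLU(),
                           nn.Linear(128, ACT_DIM))    # tracks the motion

target_xy = torch.tensor([[1.5, -0.5]])   # where the ball should end up
motion = planner(target_xy)               # high level runs once per kick

obs = torch.randn(1, OBS_DIM)             # robot state at each control step
for step in range(3):
    action = controller(torch.cat([obs, motion], dim=-1))  # joint commands
    obs = torch.randn(1, OBS_DIM)         # placeholder for a simulator step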