Liam Paull

Biography

Liam Paull is an associate professor at Université de Montréal and co-leads the Montréal Robotics and Embodied AI Lab (REAL). His lab focuses on a variety of robotics problems, including building representations of the world for such applications as simultaneous localization and mapping, modelling uncertainty, and building better workflows to teach robotic agents new tasks through, for example, simulation or demonstration.

Previously, Paull was a research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT), where he led the autonomous car project funded by the Toyota Research Institute (TRI). He completed a postdoc with the Marine Robotics Group at MIT, where he worked on Simultaneous Localization and Mapping (SLAM) for underwater robots.

His PhD from the University of New Brunswick in 2013 focused on robust and adaptive planning for underwater vehicles. He is also the co-founder and director of the Duckietown Foundation, which is dedicated to making engaging robotics learning experiences accessible to everyone.

Current Students

Francesco Argenziano

Independent visiting researcher - Sapienza

Ria Arora

Master's Research - Université de Montréal

Principal supervisor:

Guy Wolf

Adam Burhan

Master's Research - Université de Montréal

Rodrigue De Schaetzen

PhD - Université de Montréal

PhD - Université de Montréal

PhD - Université de Montréal

PhD - Université de Montréal

Co-supervisor:

Glen Berseth

Anshul Gupta

Collaborating researcher - Université de Montréal

Co-supervisor:

Sarath Chandar

Zhen Liu

Collaborating Alumni - Université de Montréal

Co-supervisor:

Yoshua Bengio

Kaustubh Mani

PhD - Université de Montréal

Sacha Morin

PhD - Université de Montréal

Principal supervisor:

Postdoctorate - Université de Montréal

Yann Pequignot

Collaborating researcher - Université Laval

azalee.robitaille@hotmail.com

Azalee Robitaille

Master's Research - Université de Montréal

Luke Rowe

PhD - Université de Montréal

Co-supervisor:

Miguel Angel Saavedra Ruiz

PhD - Université de Montréal

miguel-angel.saavedra-ruiz@mila.quebec

Master's Research - Université de Montréal

Blog Posts

May 15, 2024

How to Effectively and Efficiently Represent Non-Watertight Meshes for Your T-Shirts

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Schölkopf

Read the article

May 9, 2022

Sample Efficient Deep Reinforcement Learning Via Uncertainty Estimation

Vincent Mai

Kaustubh Mani

Liam Paull

Read the article

November 19, 2020

La-MAML: Look-ahead Meta-Learning for Continual Learning

Gunshi Gupta

Liam Paull

Read the article

Publications

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

Luke Rowe

Rodrigue de Schaetzen

Roger Girgis

We present Poutine, a 3B-parameter vision-language model (VLM) tailored for end-to-end autonomous driving in long-tail driving scenarios. Poutine is trained in two stages. To obtain strong base driving capabilities, we train Poutine-Base in a self-supervised vision-language-trajectory (VLT) next-token prediction fashion on 83 hours of CoVLA nominal driving and 11 hours of Waymo long-tail driving. Accompanying language annotations are auto-generated with a 72B-parameter VLM. Poutine is obtained by fine-tuning Poutine-Base with Group Relative Policy Optimization (GRPO) using less than 500 preference-labeled frames from the Waymo validation set. We show that both VLT pretraining and RL fine-tuning are critical to attain strong driving performance in the long-tail. Poutine-Base achieves a rater-feedback score (RFS) of 8.12 on the validation set, nearly matching Waymo's expert ground-truth RFS. The final Poutine model achieves an RFS of 7.99 on the official Waymo test set, placing 1st in the 2025 Waymo Vision-Based End-to-End Driving Challenge by a significant margin. These results highlight the promise of scalable VLT pre-training and lightweight RL fine-tuning to enable robust and generalizable autonomy.

2025-06-12

ArXiv (preprint)

Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

Anthony Gosselin

Ge Ya Luo

Luis Lara

Florian Golemo

Derek Nowrouzezahrai

Alexia Jolicoeur-Martineau

Video diffusion techniques have advanced significantly in recent years; however, they struggle to generate realistic imagery of car crashes due to the scarcity of accident events in most driving datasets. Improving traffic safety requires realistic and controllable accident simulations. To tackle the problem, we propose Ctrl-Crash, a controllable car crash video generation model that conditions on signals such as bounding boxes, crash types, and an initial image frame. Our approach enables counterfactual scenario generation where minor variations in input can lead to dramatically different crash outcomes. To support fine-grained control at inference time, we leverage classifier-free guidance with independently tunable scales for each conditioning signal. Ctrl-Crash achieves state-of-the-art performance across quantitative video quality metrics (e.g., FVD and JEDi) and qualitative measurements based on a human-evaluation of physical realism and video quality compared to prior diffusion-based methods.

2025-05-30

ArXiv (preprint)
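The Ctrl-Crash abstract above mentions classifier-free guidance with an independently tunable scale per conditioning signal. As a rough illustration only, the sketch below uses one common way of combining several conditions: each condition contributes its own scaled offset from the unconditional noise prediction. The function name `guided_noise`, the condition keys, and this particular combination rule are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np


def guided_noise(eps_uncond, eps_cond, scales):
    """Combine noise predictions using one guidance scale per condition.

    eps_uncond: noise predicted with every condition dropped.
    eps_cond:   dict mapping a condition name to the noise predicted
                with only that condition active (hypothetical layout).
    scales:     dict mapping the same names to guidance scales.
    """
    base = np.asarray(eps_uncond, dtype=float)
    guided = base.copy()
    for name, scale in scales.items():
        # Each condition pushes the prediction away from the
        # unconditional one, weighted by its own scale.
        guided += scale * (np.asarray(eps_cond[name], dtype=float) - base)
    return guided


# Toy usage with made-up predictions for two conditioning signals.
eps_uncond = np.zeros(3)
eps_cond = {"bbox": np.ones(3), "crash_type": 2.0 * np.ones(3)}
combined = guided_noise(eps_uncond, eps_cond, {"bbox": 2.0, "crash_type": 0.5})
```

Setting a scale to zero removes that condition's influence entirely, which is what makes the scales independently tunable at inference time in this formulation.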

OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations

Christina Kassab

Sacha Morin

Martin Büchner

Matias Mattamala

Kumaraditya Gupta

Abhinav Valada

Maurice Fallon

2025-05-12

IEEE.org/ICRA/2025/Workshop/Safe-VLM (spotlight)

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Luke Rowe

Roger Girgis

Anthony Gosselin

Felix Heide

2025-03-28

ArXiv (preprint)

The Harmonic Exponential Filter for Nonparametric Estimation on Motion Groups

Miguel Saavedra-Ruiz

Steven A. Parkison

Ria Arora

James Richard Forbes

Bayesian estimation is a vital tool in robotics as it allows systems to update the robot state belief using incomplete information from noisy sensors. To render the state estimation problem tractable, many systems assume that the motion and measurement noise, as well as the state distribution, are unimodal and Gaussian. However, there are numerous scenarios and systems that do not comply with these assumptions. Existing nonparametric filters that are used to model multimodal distributions have drawbacks that limit their ability to represent a diverse set of distributions. This paper introduces a novel approach to nonparametric Bayesian filtering on motion groups, designed to handle multimodal distributions using harmonic exponential distributions. This approach leverages two key insights of harmonic exponential distributions: a) the product of two distributions can be expressed as the element-wise addition of their log-likelihood Fourier coefficients, and b) the convolution of two distributions can be efficiently computed as the tensor product of their Fourier coefficients. These observations enable the development of an efficient and asymptotically exact solution to the Bayes filter up to the band limit of a Fourier transform. We demonstrate our filter's performance compared with established nonparametric filtering methods across simulated and real-world localization tasks.

2025-02-01

IEEE Robotics and Automation Letters (published)
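The two insights named in the abstract above can be illustrated on the simplest possible case, the circle, where the Fourier transform is an ordinary FFT. The sketch below is not the authors' implementation: it assumes a grid-sampled belief on the circle, uses von Mises densities as stand-in noise models, and relies on the fact that in this commutative case the product of Fourier coefficients is a plain element-wise product. All function names are hypothetical.

```python
import numpy as np

N = 64  # grid samples on the circle; this sets the FFT band limit
theta = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dtheta = 2.0 * np.pi / N


def normalize(p):
    """Scale a non-negative grid function so it integrates to one."""
    return p / (p.sum() * dtheta)


def von_mises(mu, kappa):
    """Unimodal density on the circle, used here as a stand-in noise model."""
    return normalize(np.exp(kappa * np.cos(theta - mu)))


def predict(belief, motion_noise):
    """Prediction: circular convolution as a product of Fourier coefficients."""
    coeffs = np.fft.fft(belief) * np.fft.fft(motion_noise)
    conv = np.fft.ifft(coeffs).real * dtheta
    # Clip tiny negative round-off so a later log() stays defined.
    return normalize(np.clip(conv, 1e-300, None))


def update(belief, likelihood):
    """Update: add the Fourier coefficients of the two log-densities."""
    log_coeffs = np.fft.fft(np.log(belief)) + np.fft.fft(np.log(likelihood))
    log_post = np.fft.ifft(log_coeffs).real
    # Subtract the maximum before exponentiating for numerical stability.
    return normalize(np.exp(log_post - log_post.max()))


# Toy usage: a bimodal prior is disambiguated by one measurement,
# then propagated through a noisy motion of 0.5 rad.
prior = normalize(von_mises(np.pi / 2, 8.0) + von_mises(3 * np.pi / 2, 8.0))
posterior = update(prior, von_mises(np.pi / 2, 4.0))
predicted = predict(posterior, von_mises(0.5, 20.0))
```

Because the belief is stored on a grid rather than as a Gaussian, the bimodal prior is represented without any unimodality assumption, at the cost of being exact only up to the band limit implied by `N`.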

Safety Representations for Safer Policy Learning

Kaustubh Mani

Vincent Mai

Charlie Gauthier

Annie S Chen

Samer B. Nashed

Reinforcement learning algorithms typically necessitate extensive exploration of the state space to find optimal policies. However, in safety-critical applications, the risks associated with such exploration can lead to catastrophic consequences. Existing safe exploration methods attempt to mitigate this by imposing constraints, which often result in overly conservative behaviours and inefficient learning. Heavy penalties for early constraint violations can trap agents in local optima, deterring exploration of risky yet high-reward regions of the state space. To address this, we introduce a method that explicitly learns state-conditioned safety representations. By augmenting the state features with these safety representations, our approach naturally encourages safer exploration without being excessively cautious, resulting in more efficient and safer policy learning in safety-critical scenarios. Empirical evaluations across diverse environments show that our method significantly improves task performance while reducing constraint violations during training, underscoring its effectiveness in balancing exploration with safety.

2025-01-22

ICLR.cc/2025/Conference (poster)

Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience

Manfred Diaz

Andrea Tacchetti

Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, there exists an equivalent cooperative game, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.

2024-09-17

TMLR (accepted)

CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning

Luke Rowe

Roger Girgis

Anthony Gosselin

Bruno Carrez

Florian Golemo

Felix Heide

Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data do not react to the actions of the AV, and their behaviour cannot be easily controlled to simulate counterfactual scenarios. Existing approaches have attempted to address these shortcomings by proposing methods that rely on heuristics or learned generative models of real-world data but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning within a physics-enhanced Nocturne simulator to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through the Nocturne simulator to generate a diverse offline reinforcement learning dataset, annotated with various reward terms. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including those representing adversarial behaviours. We demonstrate that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing fine-grained control over agent behaviours. Further, we show that fine-tuning our model on simulated safety-critical scenarios generated by our model enhances this controllability.

2024-09-05

robot-learning.org/CoRL/2024/Conference (accepted)

The Harmonic Exponential Filter for Nonparametric Estimation on Motion Groups

Miguel Saavedra-Ruiz

Steven A. Parkison

Ria Arora

James Richard Forbes

Bayesian estimation is a vital tool in robotics as it allows systems to update the robot state belief using incomplete information from noisy sensors. To render the state estimation problem tractable, many systems assume that the motion and measurement noise, as well as the state distribution, are all unimodal and Gaussian. However, there are numerous scenarios and systems that do not comply with these assumptions. Existing nonparametric filters that are used to model multimodal distributions have drawbacks that limit their ability to represent a diverse set of distributions. This letter introduces a novel approach to nonparametric Bayesian filtering on motion groups, designed to handle multimodal distributions using harmonic exponential distributions. This approach leverages two key insights of harmonic exponential distributions: a) the product of two distributions can be expressed as the element-wise addition of their log-likelihood Fourier coefficients, and b) the convolution of two distributions can be efficiently computed as the tensor product of their Fourier coefficients. These observations enable the development of an efficient and asymptotically exact solution to the Bayes filter up to the band limit of a Fourier transform. We demonstrate our filter's superior performance compared with established nonparametric filtering methods across a range of simulated and real-world localization tasks.

2024-08-01

ArXiv (preprint)