Liam Paull

How can non-watertight T-shirt meshes be represented efficiently?

PhD - UdeM

Principal supervisor:

Research collaborator - Université Laval

Google Scholar

Azalee Robitaille

Research Master's - UdeM

Luke Rowe

PhD - UdeM

Co-supervisor:

Miguel Angel Saavedra Ruiz

PhD - UdeM

Research collaborator - Université de Montréal

Website

Github

Google Scholar

Blog posts

Image for the article on representing non-watertight T-shirt meshes

May 15, 2024

by

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Scholkopf

Read the article

Sample Efficient Deep Reinforcement Learning Via Uncertainty Estimation

May 9, 2022

Uncertainty estimation for more efficient deep reinforcement learning

by

Vincent Mai

Kaustubh Mani

Liam Paull

Read the article

La-MAML: Look-ahead Meta-Learning for Continual Learning

November 19, 2021

Look-ahead meta-learning for continual learning (La-MAML)

by

Gunshi Gupta

Liam Paull

Read the article

Publications

PerceptTwin: Semantic Scene Reconstruction for Iterative LLM Planning and Verification

Charlie Gauthier

Simulation environments are useful for both robot policy learning and planning verification and validation. Traditionally, the process of creating a simulation was onerous. Creating a bespoke simulation environment for each individual environment that a robot would operate in was simply infeasible. In this work, we introduce PerceptTwin, a fully automatic pipeline that constructs interactive simulations directly from semantic scene representations produced by a robot's perception stack. PerceptTwin combines open-vocabulary object maps with 3D asset generation, affordance prediction, and commonsense condition checking. These interactive simulations can be used to validate and refine plans before they are executed on the robot hardware. Borrowing from the AI alignment literature, we also introduce an LLM judge that verifies plan correctness and alignment with human preferences. Experiments show that PerceptTwin feedback allows LLM planners to refine plans, enhance safety, and resist harmful black-box prompting attacks. In our suite of tasks, PerceptTwin improves plan success by an average of approximately 39% for GPT5, GPT5Mini, and GPT5Nano planners. Additionally, PerceptTwin improves human plan verification by up to 18% on average for plans that fail due to unfilled skill preconditions. Our results demonstrate the potential of open-vocabulary scene simulation from robot perception as a foundation for safer, more reliable robot planning.

2026-06-01

arXiv (preprint)

Predictive Spatio-Temporal Scene Graphs for Semi-Static Scenes

Miguel Saavedra-Ruiz

Steven Parkison

We have seen tremendous recent progress in our ability to build "spatio-semantic" representations that enable robots to perform complex reasoning across geometry and semantics. However, the vast majority of these methods lack any ability to perform reasoning across time. This is a desirable property in situations where a robot repeatedly observes an environment where instances may change in between observations, but in a structured way. Consider as an example a home environment where the location of a mug typically moves from the cupboard to a countertop to the sink and then back to the cupboard on a daily basis. We should be able to learn this cyclic behavior and use it to predict the state of the mug in the future. In this work, we propose a method that is able to perform this type of tempo-spatio-semantic reasoning. Underpinning the method is a filter, Perpetua

2026-04-29

arXiv (preprint)

Constrained Group Relative Policy Optimization

Roger Girgis

Rodrigue De Schaetzen

Luke Rowe

Azalee Robitaille

Christopher Pal

2026-02-04

arXiv (preprint)

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

Kaustubh Mani

Yann Pequignot

Vincent Mai

Safe exploration is a prerequisite for deploying reinforcement learning (RL) agents in safety-critical domains. In this paper, we approach safe exploration through the lens of epistemic uncertainty, where the actor’s sensitivity to parameter perturbations serves as a practical proxy for regions of high uncertainty. We propose Sharpness-Aware Policy Optimization (SHAPO), a sharpness-aware policy update rule that evaluates gradients at perturbed parameters, making policy updates pessimistic with respect to the actor’s epistemic uncertainty. Analytically, we show that this adjustment implicitly reweights policy gradients, amplifying the influence of rare unsafe actions while tempering contributions from already safe ones, thereby biasing learning toward conservative behavior in under-explored regions. Across several continuous-control tasks, our method consistently improves both safety and task performance over existing baselines, significantly expanding their Pareto frontiers.

2025-12-31

International Conference on Learning Representations (Accept (Poster))
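The phrase "evaluates gradients at perturbed parameters" in the SHAPO abstract above can be illustrated with a generic sharpness-aware (SAM-style) update. This is a minimal sketch under stated assumptions, not the SHAPO update rule itself: the function names, the toy quadratic objective, and the values of `lr` and `rho` are hypothetical, and the perturbation used here is the standard normalized-gradient step from sharpness-aware minimization.

```python
import math


def sharpness_aware_step(w, grad_fn, lr=0.05, rho=0.1):
    """One generic sharpness-aware update on a parameter list `w`.

    `grad_fn`, `lr`, and `rho` are illustrative names, not taken from the paper.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction is defined, so leave w unchanged.
        return list(w)
    # Move a distance rho along the normalized gradient (the locally worst direction).
    w_perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Evaluate the gradient at the perturbed point and apply it at the original point.
    g_perturbed = grad_fn(w_perturbed)
    return [wi - lr * gi for wi, gi in zip(w, g_perturbed)]


def quadratic_grad(w):
    """Gradient of the toy loss 0.5 * ||w||^2, standing in for a policy loss."""
    return list(w)


w = [3.0, 4.0]
for _ in range(3):
    w = sharpness_aware_step(w, quadratic_grad, lr=0.1, rho=0.5)
```

Because the descent direction is computed at the perturbed parameters, each step is taken with respect to a locally pessimistic view of the loss surface, which is the intuition the abstract connects to epistemic uncertainty.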

Perpetua: Multi-Hypothesis Persistence Modeling for Semi-Static Environments

Miguel Saavedra-Ruiz

Samer B. Nashed

Charlie Gauthier

Many robotic systems require extended deployments in complex, dynamic environments. In such deployments, parts of the environment may change between subsequent robot observations. Most robotic mapping or environment modeling algorithms are incapable of representing dynamic features in a way that enables predicting their future state. Instead, they opt to filter certain state observations, either by removing them or some form of weighted averaging. This paper introduces Perpetua, a method for modeling the dynamics of semi-static features. Perpetua is able to: incorporate prior knowledge about the dynamics of the feature if it exists, track multiple hypotheses, and adapt over time to enable prediction of future feature states. Specifically, we chain together mixtures of "persistence" and "emergence" filters to model the probability that features will disappear or reappear in a formal Bayesian framework. The approach is an efficient, scalable, general, and robust method for estimating the states of features in an environment, both in the present as well as at arbitrary future times. Through experiments on simulated and real-world data, we find that Perpetua yields better accuracy than similar approaches while also being online adaptable and robust to missing observations.

2025-10-18

2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

Object-Centric Agentic Robot Policies

Executing open-ended natural language queries in previously unseen environments is a core problem in robotics. While recent advances in imitation learning and vision-language modeling have enabled promising end-to-end policies, these models struggle when faced with complex instructions and new scenes. Their short input context also limits their ability to solve tasks over larger spatial horizons. In this work, we introduce OCARP, a modular agentic robot policy that executes user queries by using a library of tools on a dynamic inventory of objects. The agent builds the inventory by grounding query-relevant objects using a rich 3D map representation that includes open-vocabulary descriptors and 3D affordances. By combining the flexible reasoning abilities of an agent with a general spatial representation, OCARP can execute complex open-vocabulary queries in a zero-shot manner. We showcase how OCARP can be deployed in both tabletop and mobile settings due to the underlying scalable map representation.

2025-09-22

NeurIPS.cc/2025/Workshop/SpaVLE (poster)

OpenLex3D: A Tiered Evaluation Benchmark for Open-Vocabulary 3D Scene Representations

Christina Kassab

Martin Büchner

Matias Mattamala

Kumaraditya Gupta

Abhinav Valada

Maurice Fallon

3D scene understanding has been transformed by open-vocabulary language models that enable interaction via natural language. However, at present the evaluation of these representations is limited to datasets with closed-set semantics that do not capture the richness of language. This work presents OpenLex3D, a dedicated benchmark for evaluating 3D open-vocabulary scene representations. OpenLex3D provides entirely new label annotations for scenes from Replica, ScanNet++, and HM3D, which capture real-world linguistic variability by introducing synonymical object categories and additional nuanced descriptions. Our label sets provide 13 times more labels per scene than the original datasets. By introducing an open-set 3D semantic segmentation task and an object retrieval task, we evaluate various existing 3D open-vocabulary methods on OpenLex3D, showcasing failure cases and avenues for improvement. Our experiments provide insights on feature precision, segmentation, and downstream capabilities. The benchmark is publicly available at: https://openlex3d.github.io/.

2025-09-17

NeurIPS.cc/2025/Datasets_and_Benchmarks_Track (poster)

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Christopher Pal

Felix Heide

We introduce Scenario Dreamer, a fully data-driven generative simulator for autonomous vehicle planning that generates both the initial traffic scene - comprising a lane graph and agent bounding boxes - and closed-loop agent behaviours. Existing methods for generating driving simulation environments encode the initial traffic scene as a rasterized image and, as such, require parameter-heavy networks that perform unnecessary computation due to many empty pixels in the rasterized scene. Moreover, we find that existing methods that employ rule-based agent behaviours lack diversity and realism. Scenario Dreamer instead employs a novel vectorized latent diffusion model for initial scene generation that directly operates on the vectorized scene elements and an autoregressive Transformer for data-driven agent behaviour simulation. Scenario Dreamer additionally supports scene extrapolation via diffusion inpainting, enabling the generation of unbounded simulation environments. Extensive experiments show that Scenario Dreamer outperforms existing generative simulators in realism and efficiency: the vectorized scene-generation base model achieves superior generation quality with around 2x fewer parameters, 6x lower generation latency, and 10x fewer GPU training hours compared to the strongest baseline. We confirm its practical utility by showing that reinforcement learning planning agents are more challenged in Scenario Dreamer environments than traditional non-generative simulation environments, especially on long and adversarial driving environments.

2025-06-12

IEEE/CVF Conference on Computer Vision and Pattern Recognition (Accept)

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

Luke Rowe

Rodrigue De Schaetzen

Roger Girgis

Christopher Pal

We present Poutine, a 3B-parameter vision-language model (VLM) tailored for end-to-end autonomous driving in long-tail driving scenarios. Poutine is trained in two stages. To obtain strong base driving capabilities, we train Poutine-Base in a self-supervised vision-language-trajectory (VLT) next-token prediction fashion on 83 hours of CoVLA nominal driving and 11 hours of Waymo long-tail driving. Accompanying language annotations are auto-generated with a 72B-parameter VLM. Poutine is obtained by fine-tuning Poutine-Base with Group Relative Policy Optimization (GRPO) using fewer than 500 preference-labeled frames from the Waymo validation set. We show that both VLT pretraining and RL fine-tuning are critical to attain strong driving performance in the long-tail. Poutine-Base achieves a rater-feedback score (RFS) of 8.12 on the validation set, nearly matching Waymo's expert ground-truth RFS. The final Poutine model achieves an RFS of 7.99 on the official Waymo test set, placing 1st in the 2025 Waymo Vision-Based End-to-End Driving Challenge by a significant margin. These results highlight the promise of scalable VLT pre-training and lightweight RL fine-tuning to enable robust and generalizable autonomy.

2025-06-11

arXiv (preprint)
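The Poutine abstract above names Group Relative Policy Optimization (GRPO) for the fine-tuning stage. The core idea of GRPO, scoring each sampled output relative to the other outputs drawn for the same prompt, can be sketched as follows. This is a minimal illustration, not the Poutine training code: the function name, the example rewards, the use of the population standard deviation, and the `eps` stabilizer are assumptions.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one sampled group to zero mean and unit scale.

    `rewards` holds one scalar reward per sampled output for a single prompt.
    `eps` (an assumed stabilizer) avoids division by zero when all rewards are equal.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four candidate outputs sampled for the same input.
advantages = group_relative_advantages([6.0, 8.0, 7.0, 3.0])
```

Outputs that score above the group mean receive a positive advantage and are reinforced; those below are discouraged. No separate learned value function is needed, which is part of why the approach suits small fine-tuning sets.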

Alexia Jolicoeur-Martineau

Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

Ge Ya Luo

D. Nowrouzezahrai

Christopher Pal

Video diffusion techniques have advanced significantly in recent years; however, they struggle to generate realistic imagery of car crashes due to the scarcity of accident events in most driving datasets. Improving traffic safety requires realistic and controllable accident simulations. To tackle the problem, we propose Ctrl-Crash, a controllable car crash video generation model that conditions on signals such as bounding boxes, crash types, and an initial image frame. Our approach enables counterfactual scenario generation where minor variations in input can lead to dramatically different crash outcomes. To support fine-grained control at inference time, we leverage classifier-free guidance with independently tunable scales for each conditioning signal. Ctrl-Crash achieves state-of-the-art performance across quantitative video quality metrics (e.g., FVD and JEDi) and qualitative measurements based on a human-evaluation of physical realism and video quality compared to prior diffusion-based methods.

2025-05-29

arXiv (preprint)

OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations

Christina Kassab

Martin Büchner

Matias Mattamala

Kumaraditya Gupta

Abhinav Valada

Maurice Fallon

2025-05-11

IEEE.org/ICRA/2025/Workshop/Safe-VLM (spotlight)

The Harmonic Exponential Filter for Nonparametric Estimation on Motion Groups

Miguel Saavedra-Ruiz

Steven A. Parkison

Ria Arora

James Richard Forbes

Bayesian estimation is a vital tool in robotics as it allows systems to update the robot state belief using incomplete information from noisy sensors. To render the state estimation problem tractable, many systems assume that the motion and measurement noise, as well as the state distribution, are unimodal and Gaussian. However, there are numerous scenarios and systems that do not comply with these assumptions. Existing nonparametric filters that are used to model multimodal distributions have drawbacks that limit their ability to represent a diverse set of distributions. This paper introduces a novel approach to nonparametric Bayesian filtering on motion groups, designed to handle multimodal distributions using harmonic exponential distributions. This approach leverages two key insights of harmonic exponential distributions: a) the product of two distributions can be expressed as the element-wise addition of their log-likelihood Fourier coefficients, and b) the convolution of two distributions can be efficiently computed as the tensor product of their Fourier coefficients. These observations enable the development of an efficient and asymptotically exact solution to the Bayes filter up to the band limit of a Fourier transform. We demonstrate our filter's performance compared with established nonparametric filtering methods across simulated and real-world localization tasks.

2025-01-31

IEEE Robotics and Automation Letters (published)
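The two identities named in the Harmonic Exponential Filter abstract above can be checked numerically on the simplest case, a discretized circle, where the Fourier coefficients are scalars and the product in (b) reduces to an element-wise product. This is a sketch under stated assumptions, not the authors' implementation: the function names and the 8-cell grid are hypothetical, the densities are assumed strictly positive so that their logarithms exist, and a plain DFT stands in for the group Fourier transform.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform of a sequence sampled on the circle."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft(coeffs):
    """Inverse DFT; the result is real because the inputs come from real signals."""
    n = len(coeffs)
    return [sum(coeffs[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)).real / n
            for k in range(n)]


def normalize(p):
    total = sum(p)
    return [v / total for v in p]


def update(prior, likelihood):
    """Bayes update: add the Fourier coefficients of the two log-densities."""
    eta = [a + b for a, b in zip(dft([math.log(v) for v in prior]),
                                 dft([math.log(v) for v in likelihood]))]
    return normalize([math.exp(v) for v in idft(eta)])


def predict(belief, motion):
    """Prediction: circular convolution as a product of Fourier coefficients."""
    product = [a * b for a, b in zip(dft(belief), dft(motion))]
    return normalize(idft(product))


# Hypothetical strictly positive densities on an 8-cell circular grid.
n = 8
prior = normalize([1.0 + math.cos(2 * math.pi * k / n) ** 2 for k in range(n)])
likelihood = normalize([math.exp(2.0 * math.cos(2 * math.pi * (k - 2) / n)) for k in range(n)])
motion = normalize([math.exp(3.0 * math.cos(2 * math.pi * (k - 1) / n)) for k in range(n)])

posterior = update(prior, likelihood)
predicted = predict(posterior, motion)
```

On non-commutative motion groups the Fourier coefficients become matrices, so the scalar product used in `predict` no longer applies as written; the circle case only illustrates why both filter steps become cheap in the Fourier domain.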