Liam Paull

Core Academic Member

paulll@mila.quebec

Canada CIFAR AI Chair

Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research

Research Topics

Deep Learning

Robotics

Computer Vision

Biography

Liam Paull is an assistant professor at Université de Montréal and co-leads the Montreal Robotics and Embodied AI Lab (REAL). His lab focuses on robotics problems, including building representations of the world (for example, for simultaneous localization and mapping), modelling uncertainty, and building better workflows for teaching robotic agents new tasks (notably through simulation or demonstration). Previously, Liam Paull was a research scientist at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT), where he led the autonomous car project funded by the Toyota Research Institute (TRI). He was also a postdoctoral researcher in MIT's Marine Robotics Lab, where he worked on SLAM (Simultaneous Localization and Mapping) for underwater robots. He obtained his PhD in 2013 from the University of New Brunswick, where he worked on robust and adaptive planning for underwater vehicles. He is a co-founder and director of the Duckietown Foundation, whose goal is to make robotics learning experiences accessible to everyone.

Current Students

Francesco Argenziano

Independent visiting researcher - Sapienza

Research Master's - UdeM

Principal supervisor:

Research Master's - UdeM

Rodrigue De Schaetzen

PhD - UdeM

Manfred Diaz Cabrera

PhD - UdeM

Moustafa Elarabi

PhD - UdeM

Charlie Gauthier

PhD - UdeM

Co-supervisor:

Research collaborator - UdeM

Co-supervisor:

Alumni collaborator - UdeM

Co-supervisor:

PhD - UdeM

PhD - UdeM

Principal supervisor:

Research collaborator - Université Laval

Azalee Robitaille

Research Master's - UdeM

azalee.robitaille@hotmail.com

PhD - UdeM

Co-supervisor:

Miguel Angel Saavedra Ruiz

PhD - UdeM

miguel-angel.saavedra-ruiz@mila.quebec

Research Master's - UdeM

Blog Posts

May 15, 2024

How can non-watertight t-shirt meshes be represented efficiently?

by

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Schölkopf

May 9, 2022

Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation

by

Vincent Mai

Liam Paull

November 19, 2021

La-MAML: Look-ahead Meta-Learning for Continual Learning

by

Gunshi Gupta

Liam Paull

Publications

Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss

Anas Mahmoud

Jordan S. K. Hu

Tianshu Kuai

Ali Harakeh

Steven L. Waslander

An effective framework for learning 3D representations for perception tasks is distilling rich self-supervised image features via contrastive learning. However, image-to-point representation learning for autonomous driving datasets faces two main challenges: 1) the abundance of self-similarity, which results in the contrastive losses pushing away semantically similar point and image regions and thus disturbing the local semantic structure of the learned representations, and 2) severe class imbalance as pretraining gets dominated by over-represented classes. We propose to alleviate the self-similarity problem through a novel semantically tolerant image-to-point contrastive loss that takes into consideration the semantic distance between positive and negative image regions to minimize contrasting semantically similar point and image regions. Additionally, we address class imbalance by designing a class-agnostic balanced loss that approximates the degree of class imbalance through an aggregate sample-to-samples semantic similarity measure. We demonstrate that our semantically-tolerant contrastive loss with class balancing improves state-of-the-art 2D-to-3D representation learning in all evaluation settings on 3D semantic segmentation. Our method consistently outperforms state-of-the-art 2D-to-3D representation learning frameworks across a wide range of 2D self-supervised pretrained models.

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

ConceptFusion: Open-set Multimodal 3D Mapping

Krishna Murthy

Alihusein Kuwajerwala

Qiao Gu

Mohd Omama

Tao Chen

Alaa Maalouf

Shuang Li

Ganesh Subramanian Iyer

Soroush Saryazdi

Nikhil Varma Keetha

Ayush Tewari

Joshua B. Tenenbaum

Celso M de Melo

Madhava Krishna

Florian Shkurti

Antonio Torralba

Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent work, using text prompts. We address both these issues with ConceptFusion, a scene representation that is: (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts (ii) inherently multi-modal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today’s foundation models pre-trained on internet-scale data to reason about concepts across modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning, not needing any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by more than 40% margin on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping.

2023-05-06

ICRA.org/2023/Workshop/Pretraining4Robotics (published)

MeshDiffusion: Score-based Generative 3D Mesh Modeling

Zhen Liu

Yao Feng

Michael J. Black

Derek Nowrouzezahrai

Weiyang Liu

We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation. Compared to other 3D representations like voxels and point clouds, meshes are more desirable in practice, because (1) they enable easy and arbitrary manipulation of shapes for relighting and simulation, and (2) they can fully leverage the power of modern graphics pipelines which are mostly optimized for meshes. Previous scalable methods for generating meshes typically rely on sub-optimal post-processing, and they tend to produce overly-smooth or noisy surfaces without fine-grained geometric details. To overcome these shortcomings, we take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes. Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parameterization. We demonstrate the effectiveness of our model on multiple generative tasks.

2023-02-01

ICLR.cc/2023/Conference (notable)

Robust and Controllable Object-Centric Learning through Energy-based Models

Ruixiang ZHANG

Tong Che

Boris Ivanovic

Renhao Wang

Marco Pavone

Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability of decomposing low-level observations into discrete objects allows us to build a grounded abstract representation and identify the compositional structure of the world. Thus it is a crucial step for machine learning models to be capable of inferring objects and their properties from visual scene without explicit supervision. However, existing works on object-centric representation learning are either relying on tailor-made neural network modules or assuming sophisticated models of underlying generative and inference processes. In this work, we present EGO, a conceptually simple and general approach to learning object-centric representation through energy-based model. By forming a permutation-invariant energy function using vanilla attention blocks that are readily available in Transformers, we can infer object-centric latent variables via gradient-based MCMC methods where permutation equivariance is automatically guaranteed. We show that EGO can be easily integrated into existing architectures, and can effectively extract high-quality object-centric representations, leading to better segmentation accuracy and competitive downstream task performance. We empirically evaluate the robustness of the learned representation from EGO against distribution shift. Finally, we demonstrate the effectiveness of EGO in systematic compositional generalization, by recomposing learned energy functions for novel scene generation and manipulation.

2023-02-01

ICLR.cc/2023/Conference (poster)

GROOD: GRadient-aware Out-Of-Distribution detection in interpolated manifolds

Mostafa ElAraby

Sabyasachi Sahoo

Yann Batiste Pequignot

Paul Novello

2023-01-01

arXiv.org (preprint)

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads

Vincent Mai

Philippe Maisonneuve

Tianyu Zhang

Hadi Nekoei

Antoine Lesage-Landry

To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.

2023-01-01

AAMAS (published)

Lifelong Topological Visual Navigation

Rey Reza Wiyatno

Anqi Xu

Commonly, learning-based topological navigation approaches produce a local policy while preserving some loose connectivity of the space through a topological map. Nevertheless, spurious or missing edges in the topological graph often lead to navigation failure. In this work, we propose a sampling-based graph building method, which results in sparser graphs yet with higher navigation performance compared to baseline methods. We also propose graph maintenance strategies that eliminate spurious edges and expand the graph as needed, which improves lifelong navigation performance. Unlike controllers that learn from fixed training environments, we show that our model can be fine-tuned using only a small number of collected trajectory images from a real-world environment where the agent is deployed. We demonstrate successful navigation after fine-tuning on real-world environments, and notably show significant navigation improvements over time by applying our lifelong graph maintenance strategies.

2022-10-01

IEEE Robotics and Automation Letters (published)

Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers

Miguel Saavedra-Ruiz

Sacha Morin

In this work, we consider the problem of learning a perception model for monocular robot navigation using few annotated images. Using a Vision Transformer (ViT) pretrained with a label-free self-supervised method, we successfully train a coarse image segmentation model for the Duckietown environment using 70 training images. Our model performs coarse image segmentation at the

2022-06-02

2022 19th Conference on Robots and Vision (CRV) (published)

Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers

Miguel Saavedra-Ruiz

Sacha Morin

In this work, we consider the problem of learning a perception model for monocular robot navigation using few annotated images. Using a Vision Transformer (ViT) pretrained with a label-free self-supervised method, we successfully train a coarse image segmentation model for the Duckietown environment using 70 training images. Our model performs coarse image segmentation at the

2022-03-07

ArXiv (preprint)

Lifelong Topological Visual Navigation

Rey Reza Wiyatno

Anqi Xu

Commonly, learning-based topological navigation approaches produce a local policy while preserving some loose connectivity of the space through a topological map. Nevertheless, spurious or missing edges in the topological graph often lead to navigation failure. In this work, we propose a sampling-based graph building method, which results in sparser graphs yet with higher navigation performance compared to baseline methods. We also propose graph maintenance strategies that eliminate spurious edges and expand the graph as needed, which improves lifelong navigation performance. Unlike controllers that learn from fixed training environments, we show that our model can be fine-tuned using only a small number of collected trajectory images from a real-world environment where the agent is deployed. We demonstrate successful navigation after fine-tuning on real-world environments, and notably show significant navigation improvements over time by applying our lifelong graph maintenance strategies.

2021-10-16

ArXiv (preprint)

Perceptual Generative Autoencoders

Zijun Zhang

Ruixiang ZHANG

Zongpeng Li

Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce the consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.

2020-11-21

Proceedings of the 37th International Conference on Machine Learning (published)

proceedings.mlr.press

Active Domain Randomization

Bhairav Mehta

Manfred Diaz

Florian Golemo

Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. In addition, when domain randomization and policy transfer fail, Active Domain Randomization offers more insight into the deficiencies of both the chosen parameter ranges and the learned policy, allowing for more focused debugging. Our experiments across various physics-based simulated and a real-robot task show that this enhancement leads to more robust, consistent policies.

2020-05-12

Proceedings of the Conference on Robot Learning (published)

proceedings.mlr.press