Liam Paull

alihusein.kuwajerwala@mila.quebec

Alihusein Kuwajerwala

PhD - Université de Montréal

Postdoctorate - Université de Montréal

andreafrancesco.daniele@mila.quebec

charlie.gauthier@mila.quebec

Charlie Gauthier

PhD - Université de Montréal

Co-supervisor:

Glen Berseth

Kaustubh Mani

PhD - Université de Montréal

kaustubh.mani@mila.quebec

Luke Rowe

PhD - Université de Montréal

Co-supervisor:

Chris Pal

luke.rowe@mila.quebec

mahtab.sandhu@mila.quebec

Mahtab Sandhu

Master's Research - Université de Montréal

miguel-angel.saavedra-ruiz@mila.quebec

Manfred Diaz Cabrera

PhD - Université de Montréal

Miguel Angel Saavedra Ruiz

PhD - Université de Montréal

PhD - Université de Montréal

Ria Arora

Master's Research - Université de Montréal

Principal supervisor:

Guy Wolf

ria.arora@mila.quebec

yann.pequignot@mila.quebec

Ruixiang Zhang

PhD - Université de Montréal

Co-supervisor:

Yoshua Bengio

zhangrui@mila.quebec

Sacha Morin

PhD - Université de Montréal

Principal supervisor:

Guy Wolf

sacha.morin@mila.quebec

Samer Nashed

Postdoctorate - Université de Montréal

samer.nashed@mila.quebec

Yann Pequignot

Collaborating researcher - Université Laval

Zhen Liu

PhD - Université de Montréal

Co-supervisor:

Yoshua Bengio

liuzhen@mila.quebec

Blog Posts

May 15, 2024

How to effectively and efficiently represent non-watertight meshes for your T-shirts?

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Schölkopf

May 9, 2022

Sample Efficient Deep Reinforcement Learning Via Uncertainty Estimation

Vincent Mai

Kaustubh Mani

Liam Paull

November 19, 2020

La-MAML: Look-ahead Meta-Learning for Continual Learning

Gunshi Gupta

Liam Paull

Publications

BACS: Background Aware Continual Semantic Segmentation

Mostafa ElAraby

Ali Harakeh

Semantic segmentation plays a crucial role in enabling comprehensive scene understanding for robotic systems. However, generating annotations is challenging, requiring labels for every pixel in an image. In scenarios like autonomous driving, there's a need to progressively incorporate new classes as the operating environment of the deployed agent becomes more complex. For enhanced annotation efficiency, ideally, only pixels belonging to new classes would be annotated. This approach is known as Continual Semantic Segmentation (CSS). Besides the common problem of classical catastrophic forgetting in the continual learning setting, CSS suffers from the inherent ambiguity of the background, a phenomenon we refer to as the "background shift", since pixels labeled as background could correspond to future classes (forward background shift) or previous classes (backward background shift). As a result, continual learning approaches tend to fail. This paper proposes a Backward Background Shift Detector (BACS) to detect previously observed classes based on their distance in the latent space from the foreground centroids of previous steps. Moreover, we propose a modified version of the cross-entropy loss function, incorporating the BACS detector to down-weight background pixels associated with formerly observed classes. To combat catastrophic forgetting, we employ masked feature distillation alongside dark experience replay. Additionally, our approach includes a transformer decoder capable of adjusting to new classes without necessitating an additional classification head. We validate BACS's superior performance over existing state-of-the-art methods on standard CSS benchmarks.

2024-04-19

ArXiv (preprint)

Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience

Manfred Diaz

Andrea Tacchetti

Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, there exists an equivalent cooperative game, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.

2024-04-03

ArXiv (preprint)

CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning

Luke Rowe

Roger Girgis

Anthony Gosselin

Bruno Carrez

Florian Golemo

Felix Heide

Chris Pal

Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data do not react to the actions of the AV, and their behaviour cannot be easily controlled to simulate counterfactual scenarios. Existing approaches have attempted to address these shortcomings by proposing methods that rely on heuristics or learned generative models of real-world data but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning within a physics-enhanced Nocturne simulator to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through the Nocturne simulator to generate a diverse offline reinforcement learning dataset, annotated with various reward terms. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including those representing adversarial behaviours. We demonstrate that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing fine-grained control over agent behaviours. Further, we show that fine-tuning our model on simulated safety-critical scenarios generated by our model enhances this controllability.

2024-03-29

ArXiv (preprint)

Correction to: Multi-agent reinforcement learning for fast-timescale demand response of residential loads

Vincent Mai

Philippe Maisonneuve

Tianyu Zhang

Hadi Nekoei

Antoine Lesage-Landry

2024-02-23

Machine-mediated learning (published)

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Michael J. Black

Bernhard Schölkopf

2024-01-16

ICLR.cc/2024/Conference (oral)

GROOD: GRadient-aware Out-Of-Distribution detection in interpolated manifolds

Mostafa ElAraby

Sabyasachi Sahoo

Yann Pequignot

Paul Novello

2023-12-22

ArXiv (preprint)

Rethinking Teacher-Student Curriculum Learning under the Cooperative Mechanics of Experience

Manfred Diaz

Andrea Tacchetti

Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, there exists an equivalent cooperative game, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a foundation that can be used for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.

2023-10-28

NeurIPS.cc/2023/Workshop/ALOE (poster)

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Michael J. Black

Bernhard Schölkopf

2023-10-23

ArXiv (preprint)

ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

Qiao Gu

Alihusein Kuwajerwala

Sacha Morin

Krishna Murthy

Bipasha Sen

Aditya Agarwal

Corban Rivera

William Paul

Kirsty Ellis

Rama Chellappa

Chuang Gan

Celso M de Melo

Joshua B. Tenenbaum

Antonio Torralba

Florian Shkurti

For robots to perform a wide variety of tasks, they require a 3D representation of the world that is semantically rich, yet compact and efficient for task-driven perception and planning. Recent approaches have attempted to leverage features from large vision-language models to encode semantics in 3D representations. However, these approaches tend to produce maps with per-point feature vectors, which do not scale well in larger environments, nor do they contain semantic spatial relationships between entities in the environment, which are useful for downstream planning. In this work, we propose ConceptGraphs, an open-vocabulary graph-structured representation for 3D scenes. ConceptGraphs is built by leveraging 2D foundation models and fusing their output to 3D by multi-view association. The resulting representations generalize to novel semantic classes, without the need to collect large 3D datasets or finetune models. We demonstrate the utility of this representation through a number of downstream planning tasks that are specified through abstract (language) prompts and require complex reasoning over spatial and semantic concepts. (Project page: https://concept-graphs.github.io/. Explainer video: https://youtu.be/mRhNkQwRYnc)

2023-10-20

robot-learning.org/CoRL/2023/Workshop/LangRob (poster)

One-4-All: Neural Potential Fields for Embodied Navigation

Sacha Morin

Miguel Saavedra-Ruiz

A fundamental task in robotics is to navigate between two locations. In particular, real-world navigation can require long-horizon planning using high-dimensional RGB images, which poses a substantial challenge for end-to-end learning-based approaches. Current semi-parametric methods instead achieve long-horizon navigation by combining learned modules with a topological memory of the environment, often represented as a graph over previously collected images. However, using these graphs in practice requires tuning a number of pruning heuristics. These heuristics are necessary to avoid spurious edges, limit runtime memory usage and maintain reasonably fast graph queries in large environments. In this work, we present One-4-All (O4A), a method leveraging self-supervised and manifold learning to obtain a graph-free, end-to-end navigation pipeline in which the goal is specified as an image. Navigation is achieved by greedily minimizing a potential function defined continuously over image embeddings. Our system is trained offline on non-expert exploration sequences of RGB data and controls, and does not require any depth or pose measurements. We show that O4A can reach long-range goals in 8 simulated Gibson indoor environments and that resulting embeddings are topologically similar to ground truth maps, even if no pose is observed. We further demonstrate successful real-world navigation using a Jackal UGV platform. Project page: https://montrealrobotics.ca/o4a/.

2023-10-01

2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss

Anas Mahmoud

Jordan S. K. Hu

Tianshu Kuai

Ali Harakeh

Steven L. Waslander

An effective framework for learning 3D representations for perception tasks is distilling rich self-supervised image features via contrastive learning. However, image-to-point representation learning for autonomous driving datasets faces two main challenges: 1) the abundance of self-similarity, which results in the contrastive losses pushing away semantically similar point and image regions and thus disturbing the local semantic structure of the learned representations, and 2) severe class imbalance as pretraining gets dominated by over-represented classes. We propose to alleviate the self-similarity problem through a novel semantically tolerant image-to-point contrastive loss that takes into consideration the semantic distance between positive and negative image regions to minimize contrasting semantically similar point and image regions. Additionally, we address class imbalance by designing a class-agnostic balanced loss that approximates the degree of class imbalance through an aggregate sample-to-samples semantic similarity measure. We demonstrate that our semantically-tolerant contrastive loss with class balancing improves state-of-the-art 2D-to-3D representation learning in all evaluation settings on 3D semantic segmentation. Our method consistently outperforms state-of-the-art 2D-to-3D representation learning frameworks across a wide range of 2D self-supervised pretrained models.

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

ConceptFusion: Open-set Multimodal 3D Mapping

Krishna Murthy

Alihusein Kuwajerwala

Qiao Gu

Mohd Omama

Tao Chen

Shuang Li

Alaa Maalouf

Ganesh Subramanian Iyer

Soroush Saryazdi

Nikhil Varma Keetha

Ayush Tewari

Joshua B. Tenenbaum

Celso M de Melo

Madhava Krishna

Florian Shkurti

Antonio Torralba

Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent work, using text prompts. We address both these issues with ConceptFusion, a scene representation that is: (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts, and (ii) inherently multi-modal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today’s foundation models pre-trained on internet-scale data to reason about concepts across modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning, not needing any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by a margin of more than 40% on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping.

2023-05-06

ICRA.org/2023/Workshop/Pretraining4Robotics (published)