David Meger

Valliappan Chidambaram Adaikkappan

PhD - McGill University

Google Scholar

Wesley Chung

PhD - McGill University

Co-supervisor :

Doina Precup

Farnoosh Faraji

PhD - McGill University

Co-supervisor :

Master's Research - McGill University

Co-supervisor :

Hsiu-Chin Lin

Zina Kamel

Master's Research - McGill University

Co-supervisor :

Hsiu-Chin Lin

Sahand Rezaei-Shoshtari

PhD - McGill University

Principal supervisor :

PhD - McGill University

Junming (Clark) Shi

Master's Research - McGill University

Steven Wang

Master's Research - McGill University

Harley Wiltzer

PhD - McGill University

Co-supervisor :

Marc Gendron-Bellemare

PhD - McGill University

Publications

Active 3D Shape Reconstruction from Vision and Touch

Edward J. Smith

Luis Pineda

Roberto Calandra

Jitendra Malik

Adriana Romero Soriano

Michal Drozdzal

Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch. However, in 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings, leaving the active exploration of the shape largely unexplored. In active touch sensing for 3D reconstruction, the goal is to actively select the tactile readings that maximize the improvement in shape reconstruction accuracy. However, the development of deep learning-based active touch models is largely limited by the lack of frameworks for shape exploration. In this paper, we focus on this problem and introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile signals; and 3) a set of data-driven solutions with either tactile or visuotactile priors to guide the shape exploration. Our framework enables the development of the first fully data-driven solutions to active touch on top of learned models for object understanding. Our experiments show the benefits of such solutions in the task of 3D shape understanding where our models consistently outperform natural baselines. We provide our framework as a tool to foster future research in this direction.

openreview.net

Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain

Stefan Wapnick

Travis Manderson

We present a reward-predictive, model-based learning method featuring trajectory-constrained visual attention for use in mapless, local visual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to later enhance predictive accuracy during planning. Our attention model is jointly optimized by the task-specific loss and additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model in visual navigation tasks of planning low turbulence, collision-free trajectories in off-road settings and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized procedural generated simulation and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.

2021-10-01

2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

Latent Attention Augmentation for Robust Autonomous Driving Policies

Ran Cheng

Christopher Agia

Florian Shkurti

Model-free reinforcement learning has become a viable approach for vision-based robot control. However, sample complexity and adaptability to domain shifts remain persistent challenges when operating in high-dimensional observation spaces (images, LiDAR), such as those that are involved in autonomous driving. In this paper, we propose a flexible framework by which a policy’s observations are augmented with robust attention representations in the latent space to guide the agent’s attention during training. Our method encodes local and global descriptors of the augmented state representations into a compact latent vector, and scene dynamics are approximated by a recurrent network that processes the latent vectors in sequence. We outline two approaches for constructing attention maps; a supervised pipeline leveraging semantic segmentation networks, and an unsupervised pipeline relying only on classical image processing techniques. We conduct our experiments in simulation and test the learned policy against varying seasonal effects and weather conditions. Our design decisions are supported in a series of ablation studies. The results demonstrate that our state augmentation method both improves learning efficiency and encourages robust domain adaptation when compared to common end-to-end frameworks and methods that learn directly from intermediate representations.

2021-09-27

IEEE/RSJ International Conference on Intelligent Robots and Systems (published)

An Autonomous Probing System for Collecting Measurements at Depth from Small Surface Vehicles

Yuying Huang

Yiming Yao

Johanna Hansen

Jeremy Mallette

Sandeep Manjanna

2021-09-20

OCEANS 2021: San Diego – Porto (published)

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Scott Fujimoto

Doina Precup

Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.

2021-07-01

Proceedings of the 38th International Conference on Machine Learning (published)

proceedings.mlr.press

Multimodal dynamics modeling for off-road autonomous vehicles

Jean-François Tremblay

Travis Manderson

Aurélio Noca

Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robot in ways that can be hard to predict. Leveraging multiple sensors to perceive maximal information about the robot’s environment is thus crucial when building a model to perform predictions about the robot’s dynamics with the goal of doing motion planning. We design a model capable of long-horizon motion predictions, leveraging vision, lidar and proprioception, which is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model is able to leverage vision to predict traction changes. We then test our model using a real-world challenging dataset of a robot navigating through a forest, performing predictions in trajectories unseen during training. We try different modality combinations at test time and show that, while our model performs best when all modalities are present, it is still able to perform better than the baseline even when receiving only raw vision input and no proprioception, as well as when only receiving proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors when doing dynamics modeling in outdoor conditions.

2021-06-05

2021 IEEE International Conference on Robotics and Automation (ICRA) (published)

Learning Intuitive Physics with Multimodal Generative Models

Sahand Rezaei-Shoshtari

Francois Hogan

M. Jenkin

Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when it makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice-versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Learning Domain Randomization Distributions for Training Robust Locomotion Policies

Melissa Mozian

Juan Higuera

This paper considers the problem of learning behaviors in simulation without knowledge of the precise dynamical properties of the target robot platform(s). In this context, our learning goal is to mutually maximize task efficacy on each environment considered and generalization across the widest possible range of environmental conditions. The physical parameters of the simulator are modified by a component of our technique that learns the Domain Randomization (DR) that is appropriate at each learning epoch to maximally challenge the current behavior policy, without being overly challenging, which can hinder learning progress. This so-called sweet spot distribution is a selection of simulated domains with the following properties: 1) The trained policy should be successful in environments sampled from the domain randomization distribution; and 2) The DR distribution made as wide as possible, to increase variability in the environments. These properties aim to ensure the trajectories encountered in the target system are close to those observed during training, as existing methods in machine learning are better suited for interpolation than extrapolation. We show how adapting the DR distribution while training context-conditioned policies results in improvements on jump-start and asymptotic performance when transferring a learned policy to the target environment.

2021-01-24

2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

Intervention Design for Effective Sim2Real Transfer

Melissa Mozian

Amy Zhang

Joelle Pineau

The goal of this work is to address the recent success of domain randomization and data augmentation for the sim2real setting. We explain this success through the lens of causal inference, positioning domain randomization and data augmentation as interventions on the environment which encourage invariance to irrelevant features. Such interventions include visual perturbations that have no effect on reward and dynamics. This encourages the learning algorithm to be robust to these types of variations and learn to attend to the true causal mechanisms for solving the task. This connection leads to two key findings: (1) perturbations to the environment do not have to be realistic, but merely show variation along dimensions that also vary in the real world, and (2) use of an explicit invariance-inducing objective improves generalization in sim2sim and sim2real transfer settings over just data augmentation or domain randomization alone. We demonstrate the capability of our method by performing zero-shot transfer of a robot arm reach task on a 7DoF Jaco arm learning from pixel observations.

2020-12-03

ArXiv (preprint)

Urban Night Scenery Reconstruction by Day-night Registration and Synthesis

Andi Dai

2020-11-03

SIGSPATIAL/GIS (published)

Vision-Based Goal-Conditioned Policies for Underwater Navigation in the Presence of Obstacles

Travis Manderson

Juan Higuera

Stefan Wapnick

Jean-François Tremblay

Florian Shkurti

We present Nav2Goal, a data-efficient and end-to-end learning method for goal-conditioned visual navigation. Our technique is used to train a navigation policy that enables a robot to navigate close to sparse geographic waypoints provided by a user without any prior map, all while avoiding obstacles and choosing paths that cover user-informed regions of interest. Our approach is based on recent advances in conditional imitation learning. General-purpose, safe and informative actions are demonstrated by a human expert. The learned policy is subsequently extended to be goal-conditioned by training with hindsight relabelling, guided by the robot's relative localization system, which requires no additional manual annotation. We deployed our method on an underwater vehicle in the open ocean to collect scientifically relevant data of coral reefs, which allowed our robot to operate safely and autonomously, even at very close proximity to the coral. Our field deployments have demonstrated over a kilometer of autonomous visual navigation, where the robot reaches on the order of 40 waypoints, while collecting scientifically relevant data. This is done while travelling within 0.5 m altitude from sensitive corals and exhibiting significant learned agility to overcome turbulent ocean conditions and to actively avoid collisions.

2020-07-12

Robotics: Science and Systems XVI (published)

Navigation in the Service of Enhanced Pose Estimation

Travis Manderson

Ran Cheng

2020-01-23

Springer Proceedings in Advanced Robotics (published)