Portrait de David Meger

David Meger

Membre académique associé
Professeur adjoint, McGill University, École d'informatique
Sujets de recherche
Apprentissage par renforcement
Vision par ordinateur

Biographie

David Meger est professeur adjoint à l'École d'informatique de l'Université McGill. Il codirige le Laboratoire de robotique mobile au sein du Centre sur les machines intelligentes, qui est l'un des groupes de recherche en robotique les plus importants et les plus anciens du Canada. Les travaux de recherche du professeur Meger portent notamment sur les robots à guidage visuel dotés d'une vision et d'un apprentissage actifs, sur les modèles d'apprentissage par renforcement profond qui sont largement cités et utilisés par les chercheurs et l'industrie dans le monde entier, et sur la robotique de terrain, y compris les déploiements autonomes sous l'eau et sur la terre ferme. Il a été le président général de la première conférence conjointe CS-CAN au Canada en 2023.

Étudiants actuels

Doctorat - McGill
Doctorat - McGill
Co-superviseur⋅e :
Doctorat - McGill
Co-superviseur⋅e :
Maîtrise recherche - McGill
Co-superviseur⋅e :
Maîtrise recherche - McGill
Co-superviseur⋅e :
Doctorat - McGill
Superviseur⋅e principal⋅e :
Doctorat - McGill
Maîtrise recherche - McGill
Maîtrise recherche - McGill
Doctorat - McGill
Co-superviseur⋅e :
Doctorat - McGill

Publications

Adaptive Confidence Calibration
Jonathan W. Pearce
IL-flOw: Imitation Learning from Observation using Normalizing Flows
Wei-Di Chang
Juan Higuera
Continuous MDP Homomorphisms and Homomorphic Policy Gradient
Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this pape… (voir plus)r, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to encompass continuous actions in continuous state spaces. We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an actor-critic algorithm that is able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. We demonstrate the effectiveness of our method on benchmark tasks in the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance when learning from pixel observations.
Learning Assisted Identification of Scenarios Where Network Optimization Algorithms Under-Perform
Dmitriy Rivkin
X. T. Chen
Xue Liu
We present a generative adversarial method that uses deep learning to identify network load traffic conditions in which network optimization… (voir plus) algorithms under-perform other known algorithms: the Deep Convolutional Failure Generator (DCFG). The spatial distribution of network load presents challenges for network operators for tasks such as load balancing, in which a network optimizer attempts to maintain high quality communication while at the same time abiding capacity constraints. Testing a network optimizer for all possible load distributions is challenging if not impossible. We propose a novel method that searches for load situations where a target network optimization method underperforms baseline, which are key test cases that can be used for future refinement and performance optimization. By modeling a realistic network simulator's quality assessments with a deep network and, in parallel, optimizing a load generation network, our method efficiently searches the high dimensional space of load patterns and reliably finds cases in which a target network optimization method under-performs a baseline by a significant margin.
Active 3D Shape Reconstruction from Vision and Touch
Edward J. Smith
Luis Pineda
Roberto Calandra
Jitendra Malik
Adriana Romero
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch. However, in 3… (voir plus)D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings, leaving the active exploration of the shape largely unexplored. In active touch sensing for 3D reconstruction, the goal is to actively select the tactile readings that maximize the improvement in shape reconstruction accuracy. However, the development of deep learning-based active touch models is largely limited by the lack of frameworks for shape exploration. In this paper, we focus on this problem and introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile signals; and 3) a set of data-driven solutions with either tactile or visuotactile priors to guide the shape exploration. Our framework enables the development of the first fully data-driven solutions to active touch on top of learned models for object understanding. Our experiments show the benefits of such solutions in the task of 3D shape understanding where our models consistently outperform natural baselines. We provide our framework as a tool to foster future research in this direction.
Latent Attention Augmentation for Robust Autonomous Driving Policies
Ran Cheng
Christopher Agia
Florian Shkurti
Model-free reinforcement learning has become a viable approach for vision-based robot control. However, sample complexity and adaptability t… (voir plus)o domain shifts remain persistent challenges when operating in high-dimensional observation spaces (images, LiDAR), such as those that are involved in autonomous driving. In this paper, we propose a flexible framework by which a policy’s observations are augmented with robust attention representations in the latent space to guide the agent’s attention during training. Our method encodes local and global descriptors of the augmented state representations into a compact latent vector, and scene dynamics are approximated by a recurrent network that processes the latent vectors in sequence. We outline two approaches for constructing attention maps; a supervised pipeline leveraging semantic segmentation networks, and an unsupervised pipeline relying only on classical image processing techniques. We conduct our experiments in simulation and test the learned policy against varying seasonal effects and weather conditions. Our design decisions are supported in a series of ablation studies. The results demonstrate that our state augmentation method both improves learning efficiency and encourages robust domain adaptation when compared to common end-to-end frameworks and methods that learn directly from intermediate representations.
Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain
Stefan Wapnick
Travis Manderson
We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for local planning in vis… (voir plus)ual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to enhance predictive accuracy during planning. The attention model is jointly optimized by the task-specific loss and an additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model in visual navigation tasks of planning low turbulence, collision-free trajectories in off-road settings and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized procedural generated simulation and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.
An Autonomous Probing System for Collecting Measurements at Depth from Small Surface Vehicles
Yuying Huang
Yiming Yao
Johanna Hansen
Jeremy Mallette
Sandeep Manjanna
This paper presents the portable autonomous probing system (APS), a low-cost robotic design for collecting water quality measurements at tar… (voir plus)geted depths from an autonomous surface vehicle (ASV). This system fills an important but often overlooked niche in marine sampling by enabling mobile sensor observations throughout the near-surface water column without the need for advanced underwater equipment. We present a probe delivery mechanism built with commercially available components and describe the corresponding open-source simulator and winch controller. Finally, we demonstrate the system in a field deployment and discuss design trade-offs and areas for future improvement. Project details are available on https://johannah.github.io/publication/sample-at-depth our website
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a… (voir plus) sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.
Multimodal dynamics modeling for off-road autonomous vehicles
Travis Manderson
Aurélio Noca
Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robo… (voir plus)t in ways that can be hard to predict. Leveraging multiple sensors to perceive maximal information about the robot's environment is thus crucial when building a model to perform predictions about the robot's dynamics with the goal of doing motion planning. We design a model capable of long-horizon motion predictions, leveraging vision, lidar and proprioception, which is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model is able to leverage vision to predict traction changes. We then test our model using a real-world challenging dataset of a robot navigating through a forest, performing predictions in trajectories unseen during training. We try different modality combinations at test time and show that, while our model performs best when all modalities are present, it is still able to perform better than the baseline even when receiving only raw vision input and no proprioception, as well as when only receiving proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors when doing dynamics modeling in outdoor conditions.
Learning Intuitive Physics with Multimodal Generative Models
Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelli… (voir plus)gent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when it makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice-versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
Robotic Object Manipulation with Full-Trajectory GAN-Based Imitation Learning
Haoxu Wang
This paper develops a novel generative imitation learning system capable of capturing the distribution of expert demonstrations in trajector… (voir plus)y space, which allows longer temporal context within complex motion sequences to be captured. While auto-regressive models that model time-steps sequentially can in principle be recursively applied to capture long sequences, there are known issues with learning such models reliably. In contrast, our model represents full trajectories a first-class entities, which has required us to adapt the typical generative adversarial learning architecture. We pair a full-trajectory discriminator with an imitation-inspired generative trajectory model and train these two in adversarial fashion. Our results show that our method matches the performance of existing approaches for simple tasks, in simulation and on real robot deployments. We produce state-of-the-art accuracy in replicating motions that contain long-term dependencies such as pouring.