
Gregory Dudek

Associate Academic Member
Full Professor and Research Director of the Mobile Robotics Lab, McGill University, School of Computer Science
Vice President and Lab Head of AI Research, Samsung AI Center in Montréal

Biography

Gregory Dudek is a full professor in the School of Computer Science at McGill University, a member of the Centre for Intelligent Machines (CIM), and Research Director of the Mobile Robotics Lab. He is also Vice President and Lab Head of AI Research at the Samsung AI Center in Montréal and an associate academic member of Mila - Quebec Artificial Intelligence Institute.

Dudek has authored and co-authored over 300 research publications on a wide range of subjects, including visual object description, recognition, RF localization, robotic navigation and mapping, distributed system design, 5G telecommunications and biological perception.

He co-authored the book “Computational Principles of Mobile Robotics” (Cambridge University Press) with Michael Jenkin. He has chaired and been involved in numerous national and international conferences and professional activities concerned with robotics, machine sensing and computer vision.

Dudek’s research interests include perception for mobile robotics, navigation and position estimation, environment and shape modelling, computational vision and collaborative filtering.


Publications

Eliminating Space Scanning: Fast mmWave Beam Alignment with UWB Radios
Ju Wang
X. T. Chen
Xue Liu
Due to their large bandwidth and impressive data speed, millimeter-wave (mmWave) radios are expected to play a key role in 5G and beyond (e.g., 6G) communication networks. Yet, to release mmWave's true power, the highly directional mmWave beams need to be aligned perfectly. Most existing beam alignment methods adopt an exhaustive or semi-exhaustive space scanning, which introduces delays of up to several seconds. To eliminate the need for complex space scanning, this article presents an Ultra-wideband (UWB)-assisted mmWave communication framework, which leverages co-located UWB antennas to estimate the best angles for mmWave beam alignment. One major challenge in applying this idea in the real world is the barrier of limited antenna numbers: Commercial-Off-The-Shelf (COTS) devices are usually equipped with only a small number of UWB antennas, which are not enough for existing algorithms to provide an accurate angle estimation. To solve this challenge, we design a novel Multi-Frequency MUltiple SIgnal Classification (MF-MUSIC) algorithm, which extends the classic MUltiple SIgnal Classification (MUSIC) algorithm to the frequency domain and overcomes the antenna limitation barrier in the spatial domain. Extensive real-world experiments and numerical simulations illustrate the advantage of the proposed MF-MUSIC algorithm. MF-MUSIC uses only three antennas to achieve an accurate angle estimation, which differs by a mere 0.15° (or a relative difference of 3.6%) from the state-of-the-art 16-antenna-based angle estimation method.
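The classic MUSIC algorithm that MF-MUSIC extends can be sketched in a few lines. This is a minimal narrowband MUSIC for a half-wavelength uniform linear array, not the paper's multi-frequency variant; the function name, array geometry, and grid resolution below are illustrative assumptions.

```python
import numpy as np

def music_spectrum(X, n_sources, n_grid=181):
    """Classic MUSIC pseudospectrum for a uniform linear array.

    X: (n_antennas, n_snapshots) complex baseband samples;
    half-wavelength element spacing is assumed.
    """
    n_ant = X.shape[0]
    R = X @ X.conj().T / X.shape[1]           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    En = eigvecs[:, : n_ant - n_sources]      # noise-subspace eigenvectors
    angles = np.linspace(-90, 90, n_grid)
    spectrum = np.empty(n_grid)
    for i, theta in enumerate(angles):
        # Steering vector for arrival angle theta (degrees)
        a = np.exp(-1j * np.pi * np.arange(n_ant) * np.sin(np.deg2rad(theta)))
        # Peaks appear where the steering vector is orthogonal to the noise subspace
        spectrum[i] = 1.0 / np.real(a.conj() @ En @ En.conj().T @ a)
    return angles, spectrum
```

Peaks of the returned pseudospectrum estimate the arrival angles; MF-MUSIC's contribution is recovering such peaks reliably with far fewer antennas by pooling information across frequencies.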
Augmenting Transit Network Design Algorithms with Deep Learning
Andrew Holliday
This paper considers the use of deep learning models to enhance optimization algorithms for transit network design. Transit network design is the problem of determining routes for transit vehicles that minimize travel time and operating costs, while achieving full service coverage. State-of-the-art meta-heuristic search algorithms give good results on this problem, but can be very time-consuming. In contrast, neural networks can learn sub-optimal but fast-to-compute heuristics based on large amounts of data. Combining these approaches, we develop a fast graph neural network model for transit planning, and use it to initialize state-of-the-art search algorithms. We show that this combination can improve the results of these algorithms on a variety of metrics by up to 17%, without increasing their run time; or they can match the quality of the original algorithms while reducing the computing time by up to a factor of 50.
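The hybrid pattern here, a learned heuristic seeding a slower search, can be illustrated without the graph neural network itself. In this sketch the `init` argument stands in for the network's fast initial solution; the hill-climbing loop stands in for the meta-heuristic. All names are illustrative, not the paper's implementation.

```python
def local_search(score, init, neighbors, iters=200):
    """Greedy hill-climbing: repeatedly move to the best-scoring neighbor.

    A learned model supplies `init`, a good starting solution, so the
    search spends its iterations refining rather than discovering.
    """
    best = init
    for _ in range(iters):
        cand = max(neighbors(best), key=score)
        if score(cand) <= score(best):
            break                  # local optimum reached
        best = cand
    return best
```

Seeding from a strong `init` is what lets the combined system match the original algorithm's quality in far fewer search iterations.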
Bayesian Q-learning With Imperfect Expert Demonstrations
Guided exploration with expert demonstrations improves data efficiency for reinforcement learning, but current algorithms often overuse expert information. We propose a novel algorithm to speed up Q-learning with the help of a limited amount of imperfect expert demonstrations. The algorithm avoids excessive reliance on expert data by relaxing the optimal expert assumption and gradually reducing the usage of uninformative expert data. Experimentally, we evaluate our approach on a sparse-reward chain environment and six more complicated Atari games with delayed rewards. With the proposed methods, we can achieve better results than Deep Q-learning from Demonstrations (Hester et al., 2017) in most environments.
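The core idea, trust the expert early and rely on it less over time, can be illustrated with tabular Q-learning on a sparse-reward chain, the first benchmark the abstract mentions. The environment, decay schedule, and 80% expert accuracy below are illustrative assumptions, not the paper's Bayesian algorithm.

```python
import numpy as np

def train_chain(n_states=6, episodes=400, alpha=0.5, gamma=0.95, seed=0):
    """Tabular Q-learning on a chain: action 1 moves right, 0 moves left.

    Reward 1 is given only on reaching the rightmost state. An imperfect
    expert (correct 80% of the time) guides exploration with a probability
    that decays to zero over training, so early episodes lean on the
    expert and later ones learn from the agent's own estimates.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))
    for ep in range(episodes):
        expert_prob = max(0.0, 1.0 - ep / (episodes / 2))  # gradually ignore expert
        s = 0
        for _ in range(4 * n_states):
            if rng.random() < expert_prob:
                a = 1 if rng.random() < 0.8 else 0         # imperfect expert advice
            elif rng.random() < 0.1:
                a = int(rng.integers(2))                   # epsilon exploration
            else:
                a = int(np.argmax(Q[s]))                   # greedy action
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2
            if s == n_states - 1:
                break
    return Q
```

After training, the greedy policy should move right in every state, showing that the decaying expert influence bootstraps learning without permanently biasing the value estimates.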
IL-flOw: Imitation Learning from Observation using Normalizing Flows
Wei-Di Chang
Juan Higuera
Learning Assisted Identification of Scenarios Where Network Optimization Algorithms Under-Perform
Dmitriy Rivkin
X. T. Chen
Xue Liu
We present a generative adversarial method that uses deep learning to identify network load traffic conditions in which network optimization algorithms under-perform other known algorithms: the Deep Convolutional Failure Generator (DCFG). The spatial distribution of network load presents challenges for network operators for tasks such as load balancing, in which a network optimizer attempts to maintain high quality communication while at the same time abiding by capacity constraints. Testing a network optimizer for all possible load distributions is challenging if not impossible. We propose a novel method that searches for load situations where a target network optimization method underperforms a baseline, which are key test cases that can be used for future refinement and performance optimization. By modeling a realistic network simulator's quality assessments with a deep network and, in parallel, optimizing a load generation network, our method efficiently searches the high dimensional space of load patterns and reliably finds cases in which a target network optimization method under-performs a baseline by a significant margin.
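The search loop at the heart of such failure mining can be sketched without the learned generator: sample candidate load patterns and keep those where the target optimizer trails the baseline by a margin. The random sampler below is a stand-in for the paper's trained generation network, and the scoring functions are placeholders.

```python
import numpy as np

def mine_failures(target_score, baseline_score, sample_load,
                  n_samples=2000, margin=0.1, seed=0):
    """Random-search stand-in for a learned failure generator.

    Collects load patterns where the target optimizer scores worse than
    the baseline by more than `margin`; these become test cases for
    refining the target method.
    """
    rng = np.random.default_rng(seed)
    failures = []
    for _ in range(n_samples):
        load = sample_load(rng)
        if baseline_score(load) - target_score(load) > margin:
            failures.append(load)
    return failures
```

The DCFG's contribution is replacing this blind sampling with a generator optimized to produce such failures directly, which matters when failure regions are rare in the high-dimensional load space.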
Latent Attention Augmentation for Robust Autonomous Driving Policies
Ran Cheng
Christopher Agia
Florian Shkurti
Model-free reinforcement learning has become a viable approach for vision-based robot control. However, sample complexity and adaptability to domain shifts remain persistent challenges when operating in high-dimensional observation spaces (images, LiDAR), such as those that are involved in autonomous driving. In this paper, we propose a flexible framework by which a policy’s observations are augmented with robust attention representations in the latent space to guide the agent’s attention during training. Our method encodes local and global descriptors of the augmented state representations into a compact latent vector, and scene dynamics are approximated by a recurrent network that processes the latent vectors in sequence. We outline two approaches for constructing attention maps: a supervised pipeline leveraging semantic segmentation networks, and an unsupervised pipeline relying only on classical image processing techniques. We conduct our experiments in simulation and test the learned policy against varying seasonal effects and weather conditions. Our design decisions are supported in a series of ablation studies. The results demonstrate that our state augmentation method both improves learning efficiency and encourages robust domain adaptation when compared to common end-to-end frameworks and methods that learn directly from intermediate representations.
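An unsupervised attention map built from classical image processing, the second pipeline the abstract mentions, can be approximated with a gradient-magnitude saliency map. The Sobel-based implementation below is one simple stand-in; the exact operators used in the paper are not specified here.

```python
import numpy as np

def edge_attention(img):
    """Normalized Sobel gradient magnitude as a crude attention map.

    img: 2-D grayscale array; returns a map in [0, 1] of the same shape,
    highlighting edges that a driving policy might attend to.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                    # vertical gradient
    pad = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    for i in range(3):                       # correlate with both kernels
        for j in range(3):
            patch = pad[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    mag = np.hypot(gx, gy)
    return mag / mag.max() if mag.max() > 0 else mag
```

Such a map would then be concatenated with (or used to weight) the image features before latent encoding; the appeal of the unsupervised route is that it needs no segmentation labels.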
Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain
Stefan Wapnick
Travis Manderson
We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for local planning in visual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to enhance predictive accuracy during planning. The attention model is jointly optimized by the task-specific loss and an additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model in visual navigation tasks of planning low turbulence, collision-free trajectories in off-road settings and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized, procedurally generated simulation and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.
An Autonomous Probing System for Collecting Measurements at Depth from Small Surface Vehicles
Yuying Huang
Yiming Yao
Johanna Hansen
Jeremy Mallette
Sandeep Manjanna
This paper presents the portable autonomous probing system (APS), a low-cost robotic design for collecting water quality measurements at targeted depths from an autonomous surface vehicle (ASV). This system fills an important but often overlooked niche in marine sampling by enabling mobile sensor observations throughout the near-surface water column without the need for advanced underwater equipment. We present a probe delivery mechanism built with commercially available components and describe the corresponding open-source simulator and winch controller. Finally, we demonstrate the system in a field deployment and discuss design trade-offs and areas for future improvement. Project details are available on our website: https://johannah.github.io/publication/sample-at-depth
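A targeted-depth winch can be closed-loop controlled very simply. The proportional, rate-limited controller below is an illustrative sketch of the control problem, not the project's open-source controller; gains, limits, and the pure-integrator probe model are assumptions.

```python
def winch_rate(depth, target, kp=0.8, max_rate=0.25):
    """Proportional spool rate (m/s), clamped to the winch's speed limit.

    Positive rate pays out cable, lowering the probe.
    """
    error = target - depth
    return max(-max_rate, min(max_rate, kp * error))

def lower_probe(target, dt=0.1, tol=0.02, max_steps=2000):
    """Simulate lowering the probe until it settles at the target depth.

    Models the probe depth as directly following the commanded spool rate,
    ignoring drag, drift, and cable dynamics.
    """
    depth = 0.0
    for _ in range(max_steps):
        depth += winch_rate(depth, target) * dt
        if abs(depth - target) < tol:
            break
    return depth
```

The rate clamp matters in practice: it keeps the probe from free-spooling on large depth errors, at the cost of a longer transit to deep targets.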
Multimodal dynamics modeling for off-road autonomous vehicles
Travis Manderson
Aurélio Noca
Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robot in ways that can be hard to predict. Leveraging multiple sensors to perceive maximal information about the robot's environment is thus crucial when building a model to perform predictions about the robot's dynamics with the goal of doing motion planning. We design a model capable of long-horizon motion predictions, leveraging vision, lidar and proprioception, which is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model is able to leverage vision to predict traction changes. We then test our model using a real-world challenging dataset of a robot navigating through a forest, performing predictions in trajectories unseen during training. We try different modality combinations at test time and show that, while our model performs best when all modalities are present, it is still able to perform better than the baseline even when receiving only raw vision input and no proprioception, as well as when only receiving proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors when doing dynamics modeling in outdoor conditions.
Learning Intuitive Physics with Multimodal Generative Models
Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when it makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice-versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
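MVAE-style models commonly fuse modalities as a product of Gaussian experts in latent space, which has a closed form as precision-weighted averaging. The sketch below shows that fusion step in isolation; it is a generic instance of the technique, not the paper's exact network, and a standard-normal prior expert is an assumption of this formulation.

```python
import numpy as np

def product_of_experts(mus, logvars):
    """Combine per-modality Gaussian posteriors N(mu_i, var_i).

    mus, logvars: (n_experts, dim) arrays from each modality's encoder.
    A standard-normal prior expert is included, so a missing modality is
    handled simply by dropping its row.
    Returns the fused mean and variance.
    """
    prec = np.exp(-np.asarray(logvars))        # expert precisions 1/var_i
    prec_sum = 1.0 + prec.sum(axis=0)          # +1 is the N(0, I) prior's precision
    mu = (np.asarray(mus) * prec).sum(axis=0) / prec_sum
    var = 1.0 / prec_sum
    return mu, var
```

This is what makes vision-to-touch inference possible in such models: encoding only the available modality still yields a valid fused posterior to decode the other modality from.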
Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor
Francois Hogan
M. Jenkin
Yogesh Girdhar
We introduce a new class of vision-based sensor and associated algorithmic processes that combine visual imaging with high-resolution tactile sensing, all in a uniform hardware and computational architecture. We demonstrate the sensor’s efficacy for both multi-modal object recognition and metrology. Object recognition is typically formulated as a unimodal task, but by combining two sensor modalities we show that we can achieve several significant performance improvements. This sensor, named the See-Through-your-Skin sensor (STS), is designed to provide rich multi-modal sensing of contact surfaces. Inspired by recent developments in optical tactile sensing technology, we address a key missing feature of these sensors: the ability to capture a visual perspective of the region beyond the contact surface. Whereas optical tactile sensors are typically opaque, we present a sensor with a semitransparent skin that has the dual capabilities of acting as a tactile sensor and/or as a visual camera depending on its internal lighting conditions. This paper details the design of the sensor, showcases its dual sensing capabilities, and presents a deep learning architecture that fuses vision and touch. We validate the ability of the sensor to classify household objects, recognize fine textures, and infer their physical properties both through numerical simulations and experiments with a smart countertop prototype.
MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation
Dingwei Wu
M. Jenkin
Steve Liu
Batch reinforcement learning (RL) aims to learn a good control policy from a previously collected dataset without requiring additional interactions with the environment. Unfortunately, in the real world, we may only have a limited amount of training data for tasks we are interested in. Most batch RL methods are intended to learn a policy over one fixed dataset, and are not intended to learn a policy that can perform well over other tasks. Leveraging the advantages of batch RL while dealing with limited training data is thus another real-world challenge. In this work, we propose to add sample transfer and policy distillation to a leading batch RL approach. The proposed methods are evaluated on multiple control tasks to showcase their effectiveness.
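Policy distillation is commonly implemented as a KL divergence between the teacher's softened action distribution and the student's. The loss below is that generic formulation, shown for illustration; the temperature and the details of how MBAIL applies it are assumptions not taken from the abstract.

```python
import numpy as np

def distill_loss(teacher_q, student_logits, tau=1.0):
    """KL(teacher || student) over action distributions.

    teacher_q: (batch, n_actions) Q-values from the teacher, softened by
    temperature tau; student_logits: the student's (batch, n_actions) logits.
    """
    def softmax(x):
        z = x - x.max(axis=-1, keepdims=True)   # stabilize the exponentials
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(np.asarray(teacher_q, dtype=float) / tau)  # teacher targets
    q = softmax(np.asarray(student_logits, dtype=float))   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

Minimizing this loss pulls the student's action preferences toward the teacher's across the batch, which is how knowledge from a policy trained on one dataset can be transferred to another task's policy.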