Portrait de David Meger

David Meger

Membre académique associé
Professeur adjoint, McGill University, École d'informatique
Sujets de recherche
Apprentissage par renforcement
Vision par ordinateur

Biographie

David Meger est professeur adjoint à l'École d'informatique de l'Université McGill. Il codirige le Laboratoire de robotique mobile au sein du Centre sur les machines intelligentes, qui est l'un des groupes de recherche en robotique les plus importants et les plus anciens du Canada. Les travaux de recherche du professeur Meger portent notamment sur les robots à guidage visuel dotés d'une vision et d'un apprentissage actifs, sur les modèles d'apprentissage par renforcement profond qui sont largement cités et utilisés par les chercheurs et l'industrie dans le monde entier, et sur la robotique de terrain, y compris les déploiements autonomes sous l'eau et sur la terre ferme. Il a été le président général de la première conférence conjointe CS-CAN au Canada en 2023.

Étudiants actuels

Maîtrise recherche - McGill
Doctorat - McGill
Co-superviseur⋅e :
Doctorat - McGill
Co-superviseur⋅e :
Doctorat - McGill
Superviseur⋅e principal⋅e :
Doctorat - McGill
Superviseur⋅e principal⋅e :
Postdoctorat - McGill
Doctorat - McGill
Superviseur⋅e principal⋅e :
Maîtrise recherche - McGill
Doctorat - McGill
Co-superviseur⋅e :
Maîtrise recherche - McGill

Publications

Shedding Light on Large Generative Networks: Estimating Epistemic Uncertainty in Diffusion Models
Lucas Berry
Axel Brando
Generative diffusion models, notable for their large parameter count (exceeding 100 million) and operation within high-dimensional image spa… (voir plus)ces, pose significant challenges for traditional uncertainty estimation methods due to computational demands. In this work, we introduce an innovative framework, Diffusion Ensembles for Capturing Uncertainty (DECU), designed for estimating epistemic uncertainty for diffusion models. The DECU framework introduces a novel method that efficiently trains ensembles of conditional diffusion models by incorporating a static set of pre-trained parameters, drastically reducing the computational burden and the number of parameters that require training. Additionally, DECU employs Pairwise-Distance Estimators (PaiDEs) to accurately measure epistemic uncertainty by evaluating the mutual information between model outputs and weights in high-dimensional spaces. The effectiveness of this framework is demonstrated through experiments on the ImageNet dataset, highlighting its capability to capture epistemic uncertainty, specifically in under-sampled image classes.
Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs
Faraz Lotfi
Farnoosh Faraji
Nikhil Kakodkar
Travis Manderson
Policy Gradient Methods in the Presence of Symmetries and State Abstractions
Sahand Rezaei-Shoshtari
Rosie Zhao
Uncertainty-aware hybrid paradigm of nonlinear MPC and model-based RL for offroad navigation: Exploration of transformers in the predictive model
Faraz Lotfi
Khalil Virji
Farnoosh Faraji
Lucas Berry
Andrew Holliday
In this paper, we investigate a hybrid scheme that combines nonlinear model predictive control (MPC) and model-based reinforcement learning … (voir plus)(RL) for navigation planning of an autonomous model car across offroad, unstructured terrains without relying on predefined maps. Our innovative approach takes inspiration from BADGR, an LSTM-based network that primarily concentrates on environment modeling, but distinguishes itself by substituting LSTM modules with transformers to greatly elevate the performance our model. Addressing uncertainty within the system, we train an ensemble of predictive models and estimate the mutual information between model weights and outputs, facilitating dynamic horizon planning through the introduction of variable speeds. Further enhancing our methodology, we incorporate a nonlinear MPC controller that accounts for the intricacies of the vehicle's model and states. The model-based RL facet produces steering angles and quantifies inherent uncertainty. At the same time, the nonlinear MPC suggests optimal throttle settings, striking a balance between goal attainment speed and managing model uncertainty influenced by velocity. In the conducted studies, our approach excels over the existing baseline by consistently achieving higher metric values in predicting future events and seamlessly integrating the vehicle's kinematic model for enhanced decision-making. The code and the evaluation data are available at https://github.com/FARAZLOTFI/offroad_autonomous_navigation/).
Generalizable Imitation Learning Through Pre-Trained Representations
Wei-Di Chang
Francois R. Hogan
In this paper we leverage self-supervised vision transformer models and their emergent semantic abilities to improve the generalization abil… (voir plus)ities of imitation learning policies. We introduce BC-ViT, an imitation learning algorithm that leverages rich DINO pre-trained Visual Transformer (ViT) patch-level embeddings to obtain better generalization when learning through demonstrations. Our learner sees the world by clustering appearance features into semantic concepts, forming stable keypoints that generalize across a wide range of appearance variations and object types. We show that this representation enables generalized behaviour by evaluating imitation learning across a diverse dataset of object manipulation tasks. Our method, data and evaluation approach are made available to facilitate further study of generalization in Imitation Learners.
Imitation Learning from Observation through Optimal Transport
Wei-Di Chang
Scott Fujimoto
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
Scott Fujimoto
Wei-Di Chang
Edward J. Smith
Shixiang Shane Gu
In the field of reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked… (voir plus) for environments with low-level states, such as physical control problems. This paper introduces SALE, a novel approach for learning embeddings that model the nuanced interaction between state and action, enabling effective representation learning from low-level states. We extensively study the design space of these embeddings and highlight important design considerations. We integrate SALE and an adaptation of checkpoints for RL into TD3 to form the TD7 algorithm, which significantly outperforms existing continuous control algorithms. On OpenAI gym benchmark tasks, TD7 has an average performance gain of 276.7% and 50.7% over TD3 at 300k and 5M time steps, respectively, and works in both the online and offline settings.
Leveraging World Model Disentanglement in Value-Based Multi-Agent Reinforcement Learning
Zhizun Wang
In this paper, we propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentang… (voir plus)led World Model to address the challenge of achieving a common goal of multiple agents interacting in the same environment with reduced sample complexity. Due to scalability and non-stationarity problems posed by multi-agent systems, model-free methods rely on a considerable number of samples for training. In contrast, we use a modularized world model, composed of action-conditioned, action-free, and static branches, to unravel the environment dynamics and produce imagined outcomes based on past experience, without sampling directly from the real environment. We employ variational auto-encoders and variational graph auto-encoders to learn the latent representations for the world model, which is merged with a value-based framework to predict the joint action-value function and optimize the overall training objective. We present experimental results in Easy, Hard, and Super-Hard StarCraft II micro-management challenges to demonstrate that our method achieves high sample efficiency and exhibits superior performance in defeating the enemy armies compared to other baselines.
Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators
Lucas Berry
This work introduces an efficient novel approach for epistemic uncertainty estimation for ensemble models for regression tasks using pairwis… (voir plus)e-distance estimators (PaiDEs). Utilizing the pairwise-distance between model components, these estimators establish bounds on entropy. We leverage this capability to enhance the performance of Bayesian Active Learning by Disagreement (BALD). Notably, unlike sample-based Monte Carlo estimators, PaiDEs exhibit a remarkable capability to estimate epistemic uncertainty at speeds up to 100 times faster while covering a significantly larger number of inputs at once and demonstrating superior performance in higher dimensions. To validate our approach, we conducted a varied series of regression experiments on commonly used benchmarks: 1D sinusoidal data,
Hypernetworks for Zero-shot Transfer in Reinforcement Learning
Sahand Rezaei-Shoshtari
Charlotte Morissette
Francois Hogan
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objec… (voir plus)tive and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy and seek to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence
Dmitriy Rivkin
Nikhil Kakodkar
Oliver Limoyo
Francois Hogan
Our work examines the way in which large language models can be used for robotic planning and sampling in the context of automated photograp… (voir plus)hic documentation. Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general purpose language (LM) and vision-language (VLM) models. Given a high-level description of an event we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.
Normalizing Flow Ensembles for Rich Aleatoric and Epistemic Uncertainty Modeling
Lucas Berry