David Meger

Associate Academic Member
Assistant Professor, McGill University, School of Computer Science
Research Topics
Reinforcement Learning
Computer Vision

Biography

David Meger is an assistant professor at McGill University's School of Computer Science. He co-directs the Mobile Robotics Lab within the Centre for Intelligent Machines, one of the largest and longest-running robotics research groups in Canada. Professor Meger's research includes visually guided robots with active vision and learning, deep reinforcement learning models that are widely cited and used by researchers and industry worldwide, and field robotics, including autonomous deployments underwater and on land. He was General Chair of the first joint CS-CAN conference in Canada in 2023.

Current Students

PhD - McGill
PhD - McGill
Co-supervisor:
PhD - McGill
Co-supervisor:
Research Master's - McGill
Co-supervisor:
Research Master's - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:
PhD - McGill
Research Master's - McGill
Research Master's - McGill
PhD - McGill
Co-supervisor:
PhD - McGill

Publications

Tactile Modality Fusion for Vision-Language-Action Models
We propose TacFiLM, a lightweight modality-fusion approach that integrates visual-tactile signals into vision-language-action (VLA) models. While recent advances in VLA models have introduced robot policies that are both generalizable and semantically grounded, these models mainly rely on vision-based perception. Vision alone, however, cannot capture the complex interaction dynamics that occur during contact-rich manipulation, including contact forces, surface friction, compliance, and shear. While recent attempts to integrate tactile signals into VLA models often increase complexity through token concatenation or large-scale pretraining, the heavy computational demands of behavioural models necessitate more lightweight fusion strategies. To address these challenges, TacFiLM outlines a post-training finetuning approach that conditions intermediate visual features on pretrained tactile representations using feature-wise linear modulation (FiLM). Experimental results on insertion tasks demonstrate consistent improvements in success rate, direct insertion performance, completion time, and force stability across both in-distribution and out-of-distribution tasks. Together, these results support our method as an effective approach to integrating tactile signals into VLA models, improving contact-rich manipulation behaviours.
VOCALoco: Viability-Optimized Cost-aware Adaptive Locomotion
Recent advancements in legged robot locomotion have facilitated traversal over increasingly complex terrains. Despite this progress, many existing approaches rely on end-to-end deep reinforcement learning (DRL), which poses limitations in terms of safety and interpretability, especially when generalizing to novel terrains. To overcome these challenges, we introduce VOCALoco, a modular skill-selection framework that dynamically adapts locomotion strategies based on perceptual input. Given a set of pre-trained locomotion policies, VOCALoco evaluates their viability and energy consumption by predicting both the safety of execution and the anticipated cost of transport over a fixed planning horizon. This joint assessment enables the selection of policies that are both safe and energy-efficient, given the observed local terrain. We evaluate our approach on staircase locomotion tasks, demonstrating its performance in both simulated and real-world scenarios using a quadrupedal robot. Empirical results show that VOCALoco achieves improved robustness and safety during stair ascent and descent compared to a conventional end-to-end DRL policy.
Contractive Diffusion Policies
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity. Project page: https://contractive-diffusion.github.io
Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations
Charlotte Morissette
Anas El Houssaini
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity.
Large Pre-Trained Models for Bimanual Manipulation in 3D
Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning
Yash Jhaveri
Patrick Shafto
Bellemare Marc-Emmanuel
In the pursuit of finding an optimal policy, reinforcement learning (RL) methods generally ignore the properties of learned policies apart from their expected return. Thus, even when successful, it is difficult to characterize which policies will be learned and what they will do. In this work, we present a theoretical framework for policy optimization that guarantees convergence to a particular optimal policy, via vanishing entropy regularization and a temperature decoupling gambit. Our approach realizes an interpretable, diversity-preserving optimal policy as the regularization temperature vanishes and ensures the convergence of policy-derived objects, namely value functions and return distributions. In a particular instance of our method, for example, the realized policy samples all optimal actions uniformly. Leveraging our temperature decoupling gambit, we present an algorithm that estimates, to arbitrary accuracy, the return distribution associated to its interpretable, diversity-preserving optimal policy.
Epistemic Uncertainty Estimation in Regression Ensemble Models with Pairwise Epistemic Estimators
Lucas Berry
This work introduces a novel approach, Pairwise Epistemic Estimators (PairEpEsts), for epistemic uncertainty estimation in ensemble models for regression tasks using pairwise-distance estimators (PaiDEs). By utilizing the pairwise distances between model components, PaiDEs establish bounds on entropy. We leverage this capability to enhance the performance of Bayesian Active Learning by Disagreement (BALD). Notably, unlike sample-based Monte Carlo estimators, PairEpEsts can estimate epistemic uncertainty up to 100 times faster and demonstrate superior performance in higher dimensions. To validate our approach, we conducted a varied series of regression experiments on commonly used benchmarks: 1D sinusoidal data, Pendulum, Hopper, Ant, and Humanoid, demonstrating PairEpEsts’ advantage over baselines in high-dimensional regression active learning.
Generalizable Imitation Learning Through Pre-Trained Representations
Wei-Di Chang
Francois Hogan
In this paper we leverage self-supervised vision transformer models and their emergent semantic abilities to improve the generalization abilities of imitation learning policies. We introduce BC-ViT, an imitation learning algorithm that leverages rich DINO pre-trained Visual Transformer (ViT) patch-level embeddings to obtain better generalization when learning through demonstrations. Our learner sees the world by clustering appearance features into semantic concepts, forming stable keypoints that generalize across a wide range of appearance variations and object types. We show that this representation enables generalized behaviour by evaluating imitation learning across a diverse dataset of object manipulation tasks. Our method, data and evaluation approach are made available to facilitate further study of generalization in Imitation Learners.
Topological mapping for traversability-aware long-range navigation in off-road terrain
Autonomous robots navigating in off-road terrain like forests open new opportunities for automation. While off-road navigation has been studied, existing work often relies on clearly delineated pathways. We present a method allowing for long-range planning, exploration and low-level control in unknown off-trail forest terrain, using vision and GPS only. We represent outdoor terrain with a topological map, which is a set of panoramic snapshots connected with edges containing traversability information. A novel traversability analysis method is demonstrated, predicting the existence of a safe path towards a target in an image. Navigating between nodes is done using goal-conditioned behavior cloning, leveraging the power of a pretrained vision transformer. An exploration planner is presented, efficiently covering an unknown off-road area with unknown traversability using a frontiers-based approach. The approach is successfully deployed to autonomously explore two 400 m² forest sites unseen during training, in difficult conditions for navigation.
Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models
Lucas Berry
Axel Brando
Wei-Di Chang
Juan Higuera
Tractable Representations for Convergent Approximation of Distributional HJB Equations
Programmable Shape‐Preserving Soft Robotics Arm via Multimodal Multistability (Adv. Funct. Mater. 6/2025)
Benyamin Shahryari
Hossein Mofatteh
Armin Mirabolghasemi
Abdolhamid Akbarzadeh