Robotics

Robots are used worldwide in many industrial processes, and are getting better at helping humans every year. Machine learning algorithms are enhancing the capabilities of traditional robotics, and have become essential in making robots more adaptable to challenging situations.

People watch a robotic arm at work in a factory.

Embodied machine learning seeks to emulate the ways in which humans process information. By using a wide variety of sensors on robotic hardware, researchers are able to help robots perceive, analyze, interact, and navigate through unpredictable physical environments. Mila researchers are tackling challenges such as better long-term planning for the use of robots in daily life, building representations of the world — including simultaneous localization and mapping — while creating better workflows to teach robotic agents new tasks.

Mila’s work also includes designing experimental machine learning algorithms to help robots perform better in industrial applications such as assembly and disassembly, meal preparation, and warehouse management.

Featured Projects

Engineers working with medical robotic equipment.

DROID

DROID is an initiative that aims to address the scarcity of comprehensive datasets in robotics, enhancing the development of manipulation algorithms for real-world applications.

Learn more

Geometric shapes on a dark blue background.

ConceptGraphs

ConceptGraphs is a mapping system that builds 3D scene-graphs of objects and their relationships, enabling robots to perform complex navigation and object manipulation tasks.

Learn more

AI can help us make robots more adaptable to unpredictable environments, which will lead to true robotics assistants in the real world.

Glen Berseth, Assistant Professor, Université de Montréal, Core Academic Member, Mila

Research Labs

Mila professors exploring the subject as part of their research.

Mila Faculty

Associate Academic Member

Narges Armanfard

Associate Professor, McGill University, Department of Electrical and Computer Engineering

View profile

Core Academic Member

Glen Berseth

Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research

Canada CIFAR AI Chair

View profile

Associate Academic Member

Gregory Dudek

Full Professor and Research Director of Mobile Robotics Lab, McGill University, School of Computer Science

View profile

Affiliate Member

Samira Ebrahimi Kahou

Associate Professor, University of Calgary, Deparment of Electrical and Software Engineering

View profile

Core Academic Member

Amir-massoud Farahmand

Associate Professor, Polytechnique Montréal

View profile

Associate Academic Member

Toby Dylan Hocking

Associate Professor, Université Sherbrooke, Department of Computer Science

View profile

Associate Academic Member

Xue (Steve) Liu

Full Professor, McGill University, School of Computer Science

View profile

Associate Academic Member

David Meger

Associate Professor, McGill University, School of Computer Science

View profile

Core Academic Member

AJung Moon

Associate professor, McGill University, Department of Electrical and Computer Engineering

View profile

Associate Academic Member

Eilif B. Muller

Assistant Professor, Université de Montréal, Department of Neurosciences

Canada CIFAR AI Chair

View profile

Core Academic Member

Derek Nowrouzezahrai

Associate Professor, McGill University, Department of Electrical and Computer Engineering

Canada CIFAR AI Chair

View profile

Associate Academic Member

Borke Obada-Obieh

Assistant Professor, McGill University, School of Computer Science

View profile

Core Academic Member

Chris Pal

Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering

Canada CIFAR AI Chair

View profile

Core Academic Member

Liam Paull

Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research

Canada CIFAR AI Chair

View profile

Associate Academic Member

Jana Pavlasek

Polytechnique Montréal, Department of Computer and Software Engineering

View profile

Affiliate Member

Louis Petit

Assistant Professor, Université de Sherbrooke, Department of Electrical and Computer Engineering

View profile

Core Academic Member

Doina Precup

Associate Professor, McGill University, School of Computer Science

Canada CIFAR AI Chair

View profile

Associate Academic Member

Isabeau Prémont-Schwarz

Assistant Professor, Université Laval, Computer science and software engineering

View profile

Associate Academic Member

Audrey Sedal

Assistant Professor, McGill University, Department of Mechanical Engineering

View profile

Affiliate Member

Inna Sharf

Full Professor, McGill University, Department of Mechanical Engineering

View profile

Associate Academic Member

Kaleem Siddiqi

Professor, McGill University, School of Computer Science

View profile

Associate Academic Member

Yang Wang

Associate Professor, Concordia University, Computer science and software engineering

View profile

Affiliate Member

Hanqing Zhao

Assistant Professor, Université Laval, Electrical and Computer Engineering

View profile

Featured Video

Prof. Glen Berseth studies how machine learning can be used to train more adaptable robots that could help humanity meet its most pressing challenges.

Publications

ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

Qiao Gu

Alihusein Kuwajerwala

Sacha Morin

Krishna Murthy

Bipasha Sen

Aditya Agarwal

Corban Rivera

William Paul

Kirsty Ellis

Rama Chellappa

Chuang Gan

Celso M de Melo

Joshua B. Tenenbaum

Antonio Torralba

Florian Shkurti

Liam Paull

For robots to perform a wide variety of tasks, they require a 3D representation of the world that is semantically rich, yet compact and effi… (see more)cient for task-driven perception and planning. Recent approaches have attempted to leverage features from large vision-language models to encode semantics in 3D representations. However, these approaches tend to produce maps with per-point feature vectors, which do not scale well in larger environments, nor do they contain semantic spatial relationships between entities in the environment, which are useful for downstream planning. In this work, we propose ConceptGraphs, an open-vocabulary graph-structured representation for 3D scenes. ConceptGraphs is built by leveraging 2D foundation models and fusing their output to 3D by multi-view association. The resulting representations generalize to novel semantic classes, without the need to collect large 3D datasets or finetune models. We demonstrate the utility of this representation through a number of downstream planning tasks that are specified through abstract (language) prompts and require complex reasoning over spatial and semantic concepts. (Project page: https://concept-graphs.github.io/ Explainer video: https://youtu.be/mRhNkQwRYnc )

2024-05-13

2024 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

openreview.net

ConceptFusion: Open-set Multimodal 3D Mapping

Krishna Murthy

Alihusein Kuwajerwala

Qiao Gu

Mohd Omama

Tao Chen

Alaa Maalouf

Shuang Li

Ganesh Subramanian Iyer

Soroush Saryazdi

Nikhil Varma Keetha

Ayush Tewari

Joshua B. Tenenbaum

Celso M de Melo

Madhava Krishna

Liam Paull

Florian Shkurti

Antonio Torralba

Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approac… (see more)hes that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent work, using text prompts. We address both these issues with ConceptFusion, a scene representation that is: (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts (ii) inherently multi-modal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today’s foundation models pre-trained on internet-scale data to reason about concepts across modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning, not needing any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by more than 40% margin on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping.

2023-05-06

ICRA.org/2023/Workshop/Pretraining4Robotics (published)

doi.org

openreview.net

Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot

Yandong Ji

Zhongyu Li

Yinan Sun

Xue Bin Peng

Sergey Levine

Glen Berseth

Koushil Sreenath

We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Dev… (see more)eloping algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to consider the dynamics limitation and motion stability during the control of a dynamic legged robot. Moreover, we need to consider motion planning to shoot the hard-to-model deformable ball rolling on the ground with uncertain friction to a desired location. In this paper, we propose a hierarchical framework that leverages deep reinforcement learning to train (a) a robust motion control policy that can track arbitrary motions and (b) a planning policy to decide the desired kicking motion to shoot a soccer ball to a target. We deploy the proposed framework on an A1 quadrupedal robot and enable it to accurately shoot the ball to random targets in the real world.

2022-10-23

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org

arxiv.org

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Alexander Khazatsky

Karl Pertsch

Suraj Nair

Ashwin Balakrishna

Sudeep Dasari

Siddharth Karamcheti

Soroush Nasiriany

Mohan Kumar Srirama

Lawrence Yunliang Chen

Kirsty Ellis

Peter David Fagan

Joey Hejna

Masha Itkina

Marion Lepert

Yecheng Jason Ma

Ye Ma

Patrick Tree Miller

Jimmy Wu

Suneel Belkhale

Shivin Dass … (see 82 more)

Huy Ha

Arhan Jain

Abraham Lee

Youngwoon Lee

Marius Memmel

Sungjae Park

Ilija Radosavovic

Kaiyuan Wang

Albert Zhan

Kevin Black

Cheng Chi

Kyle Beltran Hatch

Shan Lin

Jingpei Lu

Jean Mercat

Abdul Rehman

Pannag R Sanketi

Archit Sharma

Cody Simpson

Quan Vuong

Homer Rich Walke

Blake Wulfe

Ted Xiao

Jonathan Heewon Yang

Arefeh Yavary

Tony Z. Zhao

Christopher Agia

Rohan Baijal

Mateo Guaman Castro

Daphne Chen

Qiuyu Chen

Trinity Chung

Jaimyn Drake

Ethan Paul Foster

Jensen Gao

David Antonio Herrera

Minho Heo

Kyle Hsu

Jiaheng Hu

Muhammad Zubair Irshad

Donovon Jackson

Charlotte Le

Xinyu Lin

Yunshuang Li

K. Lin

Roy Lin

Zehan Ma

Abhiram Maddukuri

Suvir Mirchandani

Daniel Morton

Tony Khuong Nguyen

Abigail O'Neill

Rosario Scalise

Derick Seale

Victor Son

Stephen Tian

Emi Tran

Andrew E. Wang

Yilin Wu

Annie Xie

Jingyun Yang

Patrick Yin

Yunchu Zhang

Osbert Bastani

Glen Berseth

Jeannette Bohg

Ken Goldberg

Abhinav Gupta

Abhishek Gupta

Dinesh Jayaraman

Joseph J Lim

Jitendra Malik

Roberto Martín-Martín

Subramanian Ramamoorthy

Dorsa Sadigh

Shuran Song

Jiajun Wu

Michael C. Yip

Yuke Zhu

Thomas Kollar

Sergey Levine

Chelsea Finn

The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.

2024-07-15

Robotics: Science and Systems XX (published)

doi.org

openreview.net

See more publications