Portrait of Abhinav Gupta

Abhinav Gupta

Alumni

Publications

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Peter David Fagan
Joey Hejna
Masha Itkina
Marion Lepert
Yecheng Jason Ma
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
Shivin Dass … (see 82 more)
Huy Ha
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
Sungjae Park
Ilija Radosavovic
Kaiyuan Wang
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean Mercat
Abdul Rehman
Pannag R Sanketi
Cody Simpson
Quan Vuong
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
Daphne Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Muhammad Zubair Irshad
Donovon Jackson
Charlotte Le
Xinyu Lin
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
Daniel Morton
Tony Khuong Nguyen
Abigail O'Neill
Rosario Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhishek Gupta
Dinesh Jayaraman
Joseph J Lim
Jitendra Malik
Roberto Martín-Martín
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Peter David Fagan
Joey Hejna
Masha Itkina
Marion Lepert
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
Shivin Dass
Huy Ha … (see 79 more)
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
Sungjae Park
Ilija Radosavovic
Kaiyuan Wang
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean Mercat
Abdul Rehman
Pannag R Sanketi
Cody Simpson
Quan Vuong
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
Daphne Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Donovon Jackson
Charlotte Le
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
Daniel Morton
Tony Khuong Nguyen
Abigail O'Neill
Rosario Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhishek Gupta
Dinesh Jayaraman
Joseph J Lim
Jitendra Malik
Roberto Mart'in-Mart'in
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Peter David Fagan
Joey Hejna
Masha Itkina
Marion Lepert
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
Shivin Dass
Huy Ha … (see 79 more)
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
Sungjae Park
Ilija Radosavovic
Kaiyuan Wang
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean Mercat
Abdul Rehman
Pannag R Sanketi
Cody Simpson
Quan Vuong
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
Daphne Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Donovon Jackson
Charlotte Le
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
Daniel Morton
Tony Khuong Nguyen
Abigail O'Neill
Rosario Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhishek Gupta
Dinesh Jayaraman
Joseph J Lim
Jitendra Malik
Roberto Mart'in-Mart'in
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Peter David Fagan
Joey Hejna
Masha Itkina
Marion Lepert
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
Shivin Dass
Huy Ha … (see 79 more)
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
Sungjae Park
Ilija Radosavovic
Kaiyuan Wang
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean Mercat
Abdul Rehman
Pannag R Sanketi
Cody Simpson
Quan Vuong
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
Daphne Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Donovon Jackson
Charlotte Le
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
Daniel Morton
Tony Khuong Nguyen
Abigail O'Neill
Rosario Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhishek Gupta
Dinesh Jayaraman
Joseph J Lim
Jitendra Malik
Roberto Mart'in-Mart'in
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Peter David Fagan
Joey Hejna
Masha Itkina
Marion Lepert
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
Shivin Dass
Huy Ha … (see 79 more)
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
Sungjae Park
Ilija Radosavovic
Kaiyuan Wang
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean Mercat
Abdul Rehman
Pannag R Sanketi
Cody Simpson
Quan Vuong
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
Daphne Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Donovon Jackson
Charlotte Le
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
Daniel Morton
Tony Khuong Nguyen
Abigail O'Neill
Rosario Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhishek Gupta
Dinesh Jayaraman
Joseph J Lim
Jitendra Malik
Roberto Mart'in-Mart'in
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and … (see more)robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
ArK: Augmented Reality with Knowledge Interactive Emergent Ability
Qiuyuan Huang
J. Park
Pan Lu
Paul N. Bennett
Ran Gong
Subhojit Som
Baolin Peng
Owais Khan Mohammed
Yejin Choi
Jianfeng Gao
Despite the growing adoption of mixed reality and interactive AI agents, it remains challenging for these systems to generate high quality 2… (see more)D/3D scenes in unseen environments. The common practice requires deploying an AI agent to collect large amounts of data for model training for every new task. This process is costly, or even impossible, for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g. GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in the physical or virtual world. The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK), which leverages knowledge-memory to generate scenes in unseen physical world and virtual reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated as the observation learns i) micro-action of cross-modality: in multi-modality models to collect a large amount of relevant knowledge memory data for each interaction task (e.g., unseen scene understanding) from the physical reality; and ii) macro-behavior of reality-agnostic: in mix-reality environments to improve interactions that tailor to different characterized roles, target variables, collaborative information, and so on. We validate the effectiveness of ArK on the scene generation and editing tasks. We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes, compared to baselines, demonstrating the potential benefit of incorporating ArK in generative AI for applications such as metaverse and gaming simulation.
Learning Multi-Objective Curricula for Robotic Policy Learning
ArK: Augmented Reality with Knowledge Emergent Infrastructure
Qiuyuan Huang
J. Park
Pan Lu
Paul N. Bennett
Ran Gong
Subhojit Som
Baolin Peng
Owais Khan Mohammed
Yejin Choi
Jianfeng Gao
Despite the growing adoption of mixed reality and interactive AI, it remains challenging to generate high-quality 2D/3D scenes in unseen env… (see more)ironments. Typically, an AI agent requires collecting extensive training data for every new task, which can be costly or impossible for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g., GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in physical or virtual worlds. Central to our approach is the interactive emerging mechanism, dubbed Augmented Reality with Knowledge Emergent Infrastructure (ArK) , which leverages knowledge-memory to generate scenes in unseen physical worlds and virtual reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated through i) micro-action of cross-modality : in multi-modality models to collect a large amount of relevant knowledge-memory data for each interaction task (e.g., unseen scene understanding) from the physical reality; and ii) macro-behavior of reality-agnostic : in mix-reality environments to improve interactions that tailor to different characterized roles, target variables, collaborative information, and so on. We validate ArK’s effectiveness in scene generation and editing tasks and show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes, highlighting its potential in applications such as metaverse and gaming simulation.
Structural Inductive Biases in Emergent Communication
Agnieszka M Slowik
William L. Hamilton
M. Jamnik
S. Holden
In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence. We investig… (see more)ate the impact of representation learning in artificial agents by developing graph referential games. We empirically show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models, which allows them to systematically generalize to new combinations of familiar features.
Exploring Structural Inductive Biases in Emergent Communication
Agnieszka M Slowik
William L. Hamilton
M. Jamnik
S. Holden
Human language and thought are characterized by the ability to systematically generate a potentially infinite number of complex structures (… (see more)e.g., sentences) from a finite set of familiar components (e.g., words). Recent works in emergent communication have discussed the propensity of artificial agents to develop a systematically compositional language through playing co-operative referential games. The degree of structure in the input data was found to affect the compositionality of the emerged communication protocols. Thus, we explore various structural priors in multi-agent communication and propose a novel graph referential game. We compare the effect of structural inductive bias (bag-of-words, sequences and graphs) on the emergence of compositional understanding of the input concepts measured by topographic similarity and generalization to unseen combinations of familiar properties. We empirically show that graph neural networks induce a better compositional language prior and a stronger generalization to out-of-domain data. We further perform ablation studies that show the robustness of the emerged protocol in graph referential games.