Publications

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
Siddharth Karamcheti
Soroush Nasiriany
Mohan Kumar Srirama
Lawrence Yunliang Chen
Kirsty Ellis
Peter David Fagan
Joey Hejna
Masha Itkina
Marie Lepert
Ye Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale
S. Dass
Huy Ha
Arhan Jain
Abraham Lee
Youngwoon Lee
Marius Memmel
S. Park
Ilija Radosavovic
Kaiyuan Wang
Albert Zhan
Kevin Black
Cheng Chi
Kyle Beltran Hatch
Shan Lin
Jingpei Lu
Jean-Pierre Mercat
Abdul Rehman
Pannag R. Sanketi
Archit Sharma
C. Simpson
Q. Vương
Homer Rich Walke
Blake Wulfe
Ted Xiao
Jonathan Heewon Yang
Arefeh Yavary
Tony Z. Zhao
Christopher Agia
Rohan Baijal
Mateo Guaman Castro
D. Chen
Qiuyu Chen
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao
David Antonio Herrera
Minho Heo
Kyle Hsu
Jiaheng Hu
Donovon Jackson
Charlotte Le
Yunshuang Li
K. Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani
D. Morton
Tony Nguyen
Abigail O'Neill
R. Scalise
Derick Seale
Victor Son
Stephen Tian
Emi Tran
Andrew E. Wang
Yilin Wu
Annie Xie
Jingyun Yang
Patrick Yin
Yunchu Zhang
Osbert Bastani
Jeannette Bohg
Ken Goldberg
Abhinav Gupta
Abhishek Gupta
Dinesh Jayaraman
Joseph J. Lim
Jitendra Malik
Roberto Martín-Martín
Subramanian Ramamoorthy
Dorsa Sadigh
Shuran Song
Jiajun Wu
Michael C. Yip
Yuke Zhu
Thomas Kollar
Sergey Levine
Chelsea Finn
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
Listenable Maps for Audio Classifiers
Solving Combinatorial Pricing Problems using Embedded Dynamic Programming Models
Quang Minh Bui
José Neto
The combinatorial pricing problem (CPP) is a bilevel problem in which the leader maximizes their revenue by imposing tolls on certain items that they can control. Based on the tolls set by the leader, the follower selects a subset of items corresponding to an optimal solution of a combinatorial optimization problem. To accomplish the leader's goal, the tolls need to be sufficiently low to discourage the follower from choosing the items offered by the competitors. In this paper, we derive a single-level reformulation for the CPP by rewriting the follower's problem as a longest path problem using a dynamic programming model, and then taking its dual and applying strong duality. We proceed to solve the reformulation in a dynamic fashion with a cutting plane method. We apply this methodology to 2 distinct dynamic programming models, namely, a novel formulation designated as selection diagram and the well-known decision diagram. We also produce numerical results to evaluate their performance across 3 different specializations of the CPP and a closely related problem, the knapsack interdiction problem. Our results showcase the potential of the 2 proposed reformulations over the natural value function approach, expanding the set of tools to solve combinatorial bilevel programs.
Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection
Ali Karami
Thi Kieu Khanh Ho
Offline Multitask Representation Learning for Reinforcement Learning
Haque Ishfaq
Thanh Nguyen-Tang
Songtao Feng
Raman Arora
Mengdi Wang
Ming Yin
A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models
Stephen R. Pfohl
Heather Cole-Lewis
Rory A Sayres
Darlene Neal
Mercy Nyamewaa Asiedu
Awa Dieng
Nenad Tomašev
Qazi Mamunur Rashid
Shekoofeh Azizi
Liam G. McCoy
L. A. Celi
Yun Liu
Mike Schaekermann
Alanna Walton
Alicia Parrish
Chirag Nagpal
Preeti Singh
Akeiylah Dewitt
P. A. Mansfield
Sushant Prakash
Katherine Heller
Alan Karthikesalingam
Christopher Semturs
Joelle Barral
Greg C. Corrado
Yossi Matias
Jamila Smith-Loud
Ivor Horn
Karan Singhal
Reinforcement learning for freight booking control problems
Justin Dumouchelle
SelfIE: Self-Interpretation of Large Language Model Embeddings
Haozhe Chen
Carl Vondrick
How do large language models (LLMs) obtain their answers? The ability to explain and control an LLM's reasoning process is key for reliability, transparency, and future model developments. We propose SelfIE (Self-Interpretation of Embeddings), a framework that enables LLMs to interpret their own embeddings in natural language by leveraging their ability to respond to inquiries about a given passage. Capable of interpreting open-world concepts in the hidden embeddings, SelfIE reveals LLM internal reasoning in cases such as making ethical decisions, internalizing prompt injection, and recalling harmful knowledge. SelfIE's text descriptions of hidden embeddings also open up new avenues to control LLM reasoning. We propose Supervised Control, which allows editing open-ended concepts while only requiring gradient computation of an individual layer. We extend RLHF to hidden embeddings and propose Reinforcement Control, which erases harmful knowledge in an LLM without supervision targets.
Normalizing Spinal Cord Compression Morphometric Measures: Application in Degenerative Cervical Myelopathy
Sandrine Bédard
Jan Valošek
Maryam Seif PhD
Armin Curt PhD
Simon Schading MD, MSc
Nikolai Pfender
Patrick Freund MD
Markus Hupp MD PhD
Julien Cohen-Adad MD
Objective: Automatic and robust characterization of spinal cord shape from MRI images is relevant to assess the severity of spinal cord compression in degenerative cervical myelopathy (DCM) and to guide therapeutic strategy. Despite its popularity, the maximum spinal cord compression (MSCC) index has practical limitations to objectively assess the severity of cord compression. Firstly, it is computed by normalizing the anteroposterior cord diameter by that above and below the level of compression, but it does not account for the fact that the spinal cord itself varies in size along the superior-inferior axis, making this MSCC sensitive to the level of compression. Secondly, spinal cord shape varies across individuals, making MSCC also sensitive to the size and shape of every individual. Thirdly, MSCC is typically computed by the expert-rater on a single sagittal slice, which is time-consuming and prone to inter-rater variability. In this study, we propose a fully automatic pipeline to compute MSCC. Methods: We extended the traditional MSCC (based on the anteroposterior diameter) to other shape metrics (transverse diameter, area, eccentricity, and solidity), and proposed a normalization strategy using a database of healthy adults (n=203) to address the variability of the spinal cord anatomy between individuals. We validated the proposed method in a cohort of DCM patients (n=120) with manually derived morphometric measures and predicted the therapeutic decision (operative/conservative) using a stepwise binary logistic regression including demographics, clinical scores, and electrophysiological assessment. Results: The automatic and normalized MSCC measures significantly correlated with clinical scores and predicted the therapeutic decision with higher accuracy than the manual MSCC. Results show that the sensory dysfunction of the upper extremities (mJOA subscore), the presence of myelopathy and the proposed MRI-based normalized morphometric measures were significant predictors of the therapeutic decision. The model yielded an area under the curve of the receiver operating characteristic of 80%. Conclusion: The study introduced an automatic method for computation of normalized MSCC measures of cord compression from MRI scans, which is an important step towards better informed therapeutic decisions in DCM patients. The method is open-source and available in the Spinal Cord Toolbox v6.0.
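As a rough illustration of the traditional (non-normalized) MSCC that the abstract above takes as its starting point, the sketch below divides the anteroposterior diameter at the compression by the mean of the diameters above and below it. The exact formula, the function name `traditional_mscc`, and the millimetre inputs are assumptions based on the commonly cited definition. They are not taken from the paper or from the Spinal Cord Toolbox, and the normalization against healthy adults proposed in the study is not reproduced here.

```python
def traditional_mscc(d_compression: float, d_above: float, d_below: float) -> float:
    """Return the traditional MSCC as a percentage (hypothetical helper).

    d_compression: anteroposterior cord diameter at the compressed level
    d_above, d_below: diameters at non-compressed levels above and below
    All diameters are assumed to be in the same unit (e.g. millimetres).
    """
    if d_above <= 0 or d_below <= 0:
        raise ValueError("reference diameters must be positive")
    # Reference diameter: mean of the levels above and below the compression.
    reference = (d_above + d_below) / 2.0
    # 0% means no narrowing relative to the reference; larger means more compression.
    return (1.0 - d_compression / reference) * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not patient data.
    print(traditional_mscc(6.0, 8.0, 8.0))
```

Because the reference diameters come from the same subject at neighbouring levels, this quantity depends on where along the cord the compression sits, which is the limitation the abstract raises.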
Safety Cases: How to Justify the Safety of Advanced AI Systems
Joshua Clymer
Nick Gabrieli
Thomas Larsen
As AI systems become more advanced, companies and regulators will make difficult decisions about whether it is safe to train and deploy them. To prepare for these decisions, we investigate how developers could make a 'safety case,' which is a structured rationale that AI systems are unlikely to cause a catastrophe. We propose a framework for organizing a safety case and discuss four categories of arguments to justify safety: total inability to cause a catastrophe, sufficiently strong control measures, trustworthiness despite capability to cause harm, and, if AI systems become much more powerful, deference to credible AI advisors. We evaluate concrete examples of arguments in each category and outline how arguments could be combined to justify that AI systems are safe to deploy.
Aleatoric and epistemic uncertainty extraction of patient-specific deep learning-based dose predictions in LDR prostate brachytherapy
Francisco Berumen
Samuel Ouellet
Luc Beaulieu
Analyzing Data Augmentation for Medical Images: A Case Study in Ultrasound Images
Adam Tupper
Data augmentation is one of the most effective techniques to improve the generalization performance of deep neural networks. Yet, despite often facing limited data availability in medical image analysis, it is frequently underutilized. This appears to be due to a gap in our collective understanding of the efficacy of different augmentation techniques across medical imaging tasks and modalities. One domain where this is especially true is breast ultrasound images. This work addresses this issue by analyzing the effectiveness of different augmentation techniques for the classification of breast lesions in ultrasound images. We assess the generalizability of our findings across several datasets, demonstrate that certain augmentations are far more effective than others, and show that their usage leads to significant performance gains.