Working Backwards: Learning to Place by Picking
Oliver Limoyo
Abhisek Konar
Trevor Ablett
Jonathan Kelly
Francois Hogan
We present placing via picking (PvP), a method to autonomously collect real-world demonstrations for a family of placing tasks in which objects must be manipulated to specific, contact-constrained locations. With PvP, we approach the collection of robotic object placement demonstrations by reversing the grasping process and exploiting the inherent symmetry of the pick and place problems. Specifically, we obtain placing demonstrations from a set of grasp sequences of objects initially located at their target placement locations. Our system can collect hundreds of demonstrations in contact-constrained environments without human intervention using two modules: compliant control for grasping and tactile regrasping. We train a policy directly from visual observations through behavioural cloning, using the autonomously-collected demonstrations. By doing so, the policy can generalize to object placement scenarios outside of the training environment without privileged information (e.g., placing a plate picked up from a table). We validate our approach in home robot scenarios that include dishwasher loading and table setting. Our approach yields robotic placing policies that outperform policies trained with kinesthetic teaching, both in terms of success rate and data efficiency, while requiring no human supervision.
Decision Diagrams in Space!
Isaac Rudich
Manuel López-Ibáñez
Michael Romer
Louis-Martin Rousseau
An Exact Framework for Solving the Space-Time Dependent TSP
Isaac Rudich
Manuel López-Ibáñez
Michael Romer
Louis-Martin Rousseau
Many real-world scenarios involve solving bi-level optimization problems in which there is an outer discrete optimization problem, and an inner problem involving expensive or black-box computation. This arises in space-time dependent variants of the Traveling Salesman Problem, such as when planning space missions that visit multiple astronomical objects. Planning these missions presents significant challenges due to the constant relative motion of the objects involved. There is an outer combinatorial problem of finding the optimal order to visit the objects and an inner optimization problem that requires finding the optimal departure time and trajectory to travel between each pair of objects. The constant motion of the objects complicates the inner problem, making it computationally expensive. This paper introduces a novel framework utilizing decision diagrams (DDs) and a DD-based branch-and-bound technique, Peel-and-Bound, to achieve exact solutions for such bi-level optimization problems, assuming sufficient inner problem optimizer quality. The framework leverages problem-specific knowledge to expedite search processes and minimize the number of expensive evaluations required. As a case study, we apply this framework to the Asteroid Routing Problem (ARP), a benchmark problem in global trajectory optimization. Experimental results demonstrate the framework's scalability and ability to generate robust heuristic solutions for ARP instances. Many of these solutions are exact, contingent on the assumed quality of the inner problem's optimizer.
Can We Learn Communication-Efficient Optimizers?
Charles-Étienne Joseph
Benjamin Thérien
Abhinav Moudgil
Boris Knyazev
Advancing Clinical Psychiatry: Integration of Clinical and Omics Data Using Machine Learning
Bill Qi
Automatic Head and Neck Tumor segmentation and outcome prediction relying on FDG-PET/CT images: Findings from the second edition of the HECKTOR challenge
Vincent Andrearczyk
Valentin Oreiller
Sarah Boughdad
Catherine Cheze Le Rest
Olena Tankyevych
Hesham M. Elhalawani
Mario Jreige
John O. Prior
Dimitris Visvikis
Mathieu Hatt
Adrien Depeursinge
Balaur: Language Model Pretraining with Lexical Semantic Relations
Andrei Mircea
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D Co-Reyes
Ankesh Anand
Piyush Patil
Xavier Garcia
Peter J. Liu
James Harrison
Jaehoon Lee
Kelvin Xu
Aaron T Parisi
Abhishek Kumar
A. Alemi
Alex Rizkowsky
Azade Nova
Ben Adlam
Bernd Bohnet
Hanie Sedghi
Gamaleldin Fathy Elsayed
Igor Mordatch
Isabelle Simpson
Izzeddin Gur
Jasper Snoek
Jeffrey Pennington
Jiri Hron
Kathleen Kenealy
Kevin Swersky
Kshiteej Mahajan
Laura Culp
Lechao Xiao
Maxwell Bileschi
Noah Constant
Roman Novak
Rosanne Liu
Tris Brian Warkentin
Yundi Qian
Ethan Dyer
Behnam Neyshabur
Jascha Sohl-Dickstein
Yamini Bansal
Noah Fiedel
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST
Brain decoding of the Human Connectome Project tasks in a dense individual fMRI dataset
Shima Rastegarnia
Marie St-Laurent
Elizabeth DuPre
Basile Pinsard
Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
Parishad BehnamGhader
Santiago Miret
Augmenting pretrained language models with retrievers to select the supporting documents has shown promise in effectively solving common NLP problems, including language modeling and question answering, in an interpretable way. In this paper, we first study the strengths and weaknesses of different retriever-augmented language models (REALM,