Portrait of Michael Przystupa

Michael Przystupa

Research Intern - Université de Montréal
Supervisor
Research Topics
Deep Learning
Reinforcement Learning

Publications

Efficient Morphology-Aware Policy Transfer to New Embodiments
Hongyao Tang
Mariano Phielipp
Santiago Miret
Martin Jägersand
Matthew E. Taylor
Morphology-aware policy learning is a means of enhancing policy sample efficiency by aggregating data from multiple agents. These types of p… (see more)olicies have previously been shown to help generalize over dynamic, kinematic, and limb configuration variations between agent morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on morphologies at deployment. This limitation has ramifications in practical applications such as robotics because further data collection to perform end-to-end finetuning can be computationally expensive. In this work, we investigate combining morphology-aware pretraining with \textit{parameter efficient finetuning} (PEFT) techniques to help reduce the learnable parameters necessary to specialize a morphology-aware policy to a target embodiment. We compare directly tuning sub-sets of model weights, input learnable adapters, and prefix tuning techniques for online finetuning. Our analysis reveals that PEFT techniques in conjunction with policy pre-training generally help reduce the number of samples to necessary to improve a policy compared to training models end-to-end from scratch. We further find that tuning as few as less than 1\% of total parameters will improve policy performance compared the zero-shot performance of the base pretrained a policy.
Efficient Morphology-Aware Policy Transfer to New Embodiments
Hongyao Tang
Mariano Phielipp
Santiago Miret
Martin Jägersand
Matthew E. Taylor
Morphology-aware policy learning is a means of enhancing policy sample efficiency by aggregating data from multiple agents. These types of p… (see more)olicies have previously been shown to help generalize over dynamic, kinematic, and limb configuration variations between agent morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on morphologies at deployment. This limitation has ramifications in practical applications such as robotics because further data collection to perform end-to-end finetuning can be computationally expensive. In this work, we investigate combining morphology-aware pretraining with \textit{parameter efficient finetuning} (PEFT) techniques to help reduce the learnable parameters necessary to specialize a morphology-aware policy to a target embodiment. We compare directly tuning sub-sets of model weights, input learnable adapters, and prefix tuning techniques for online finetuning. Our analysis reveals that PEFT techniques in conjunction with policy pre-training generally help reduce the number of samples to necessary to improve a policy compared to training models end-to-end from scratch. We further find that tuning as few as less than 1\% of total parameters will improve policy performance compared the zero-shot performance of the base pretrained a policy.
Investigating the Benefits of Nonlinear Action Maps in Data-Driven Teleoperation
Matthew E. Taylor
Martin Jägersand
Justus Piater
Samuele Tosatto
As robots become more common for both able-bodied individuals and those living with a disability, it is increasingly important that lay peop… (see more)le be able to drive multi-degree-of-freedom platforms with low-dimensional controllers. One approach is to use state-conditioned action mapping methods to learn mappings between low-dimensional controllers and high DOF manipulators -- prior research suggests these mappings can simplify the teleoperation experience for users. Recent works suggest that neural networks predicting a local linear function are superior to the typical end-to-end multi-layer perceptrons because they allow users to more easily undo actions, providing more control over the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced actions in training data. We observe that the benefit of these mappings is being an odd function concerning user actions, and propose end-to-end nonlinear action maps which achieve this property. Unfortunately, our experiments show that such modifications offer minimal advantages over previous solutions. We find that nonlinear odd functions behave linearly for most of the control space, suggesting architecture structure improvements are not the primary factor in data-driven teleoperation. Our results suggest other avenues, such as data augmentation techniques and analysis of human behavior, are necessary for action maps to become practical in real-world applications, such as in assistive robotics to improve the quality of life of people living with w disability.
Minimally Invasive Morphology Adaptation via Parameter Efficient Fine-Tuning
Hongyao Tang
Mariano Phielipp
Santiago Miret
Martin Jägersand
Learning reinforcement learning policies to control individual robots is often computationally non-economical because minor variations in ro… (see more)bot morphology (e.g. dynamics or number of limbs) can negatively impact policy performance. This limitation has motivated morphology agnostic policy learning, in which a monolithic deep learning policy learns to generalize between robotic morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on target morphologies. This limitation has ramifications in practical robotic applications, as online finetuning large neural networks can require immense computation. In this work, we investigate \textit{parameter efficient finetuning} techniques to specialize morphology-agnostic policies to a target robot that minimizes the number of learnable parameters adapted during online learning. We compare direct finetuning, which update subsets of the base model parameters, and input-learnable approaches, which add additional parameters to manipulate inputs passed to the base model. Our analysis concludes that tuning relatively few parameters (0.01\% of the base model) can measurably improve policy performance over zero shot. These results serve a prescriptive purpose for future research for which scenarios certain PEFT approaches are best suited for adapting policy's to new robotic morphologies.
Local Linearity is All You Need (in Data-Driven Teleoperation)
Matthew E. Taylor
Martin Jägersand
Justus Piater
Samuele Tosatto
One of the critical aspects of assistive robotics is to provide a control system of a high-dimensional robot from a low-dimensional user inp… (see more)ut (i.e. a 2D joystick). Data-driven teleoperation seeks to provide an intuitive user interface called an action map to map the low dimensional input to robot velocities from human demonstrations. Action maps are machine learning models trained on robotic demonstration data to map user input directly to desired movements as opposed to aspects of robot pose ("move to cup or pour content" vs. "move along x- or y-axis"). Many works have investigated nonlinear action maps with multi-layer perceptrons, but recent work suggests that local-linear neural approximations provide better control of the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced motions in training data. In this work, we hypothesize that local-linear neural networks are effective because they make the action map odd w.r.t. the user input, enhancing the intuitiveness of the controller. Based on this assumption, we propose two nonlinear means of encoding odd behavior that do not constrain the action map to a local linear function. However, our analysis reveals that these models effectively behave like local linear models for relevant mappings between user joysticks and robot movements. We support this claim in simulation, and show on a realworld use case that there is no statistical benefit of using non-linear maps, according to the users experience. These negative results suggest that further investigation into model architectures beyond local linear models may offer diminishing returns for improving user experience in data-driven teleoperation systems.
Local Linearity is All You Need (in Data-Driven Teleoperation)
Matthew E. Taylor
Martin Jägersand
Justus Piater
Samuele Tosatto
One of the critical aspects of assistive robotics is to provide a control system of a high-dimensional robot from a low-dimensional user inp… (see more)ut (i.e. a 2D joystick). Data-driven teleoperation seeks to provide an intuitive user interface called an action map to map the low dimensional input to robot velocities from human demonstrations. Action maps are machine learning models trained on robotic demonstration data to map user input directly to desired movements as opposed to aspects of robot pose ("move to cup or pour content" vs. "move along x- or y-axis"). Many works have investigated nonlinear action maps with multi-layer perceptrons, but recent work suggests that local-linear neural approximations provide better control of the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced motions in training data. In this work, we hypothesize that local-linear neural networks are effective because they make the action map odd w.r.t. the user input, enhancing the intuitiveness of the controller. Based on this assumption, we propose two nonlinear means of encoding odd behavior that do not constrain the action map to a local linear function. However, our analysis reveals that these models effectively behave like local linear models for relevant mappings between user joysticks and robot movements. We support this claim in simulation, and show on a realworld use case that there is no statistical benefit of using non-linear maps, according to the users experience. These negative results suggest that further investigation into model architectures beyond local linear models may offer diminishing returns for improving user experience in data-driven teleoperation systems.
Local Linearity is All You Need (in Data-Driven Teleoperation)
Matthew E. Taylor
Martin Jägersand
Justus Piater
Samuele Tosatto
One of the critical aspects of assistive robotics is to provide a control system of a high-dimensional robot from a low-dimensional user inp… (see more)ut (i.e. a 2D joystick). Data-driven teleoperation seeks to provide an intuitive user interface called an action map to map the low dimensional input to robot velocities from human demonstrations. Action maps are machine learning models trained on robotic demonstration data to map user input directly to desired movements as opposed to aspects of robot pose ("move to cup or pour content" vs. "move along x- or y-axis"). Many works have investigated nonlinear action maps with multi-layer perceptrons, but recent work suggests that local-linear neural approximations provide better control of the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced motions in training data. In this work, we hypothesize that local-linear neural networks are effective because they make the action map odd w.r.t. the user input, enhancing the intuitiveness of the controller. Based on this assumption, we propose two nonlinear means of encoding odd behavior that do not constrain the action map to a local linear function. However, our analysis reveals that these models effectively behave like local linear models for relevant mappings between user joysticks and robot movements. We support this claim in simulation, and show on a realworld use case that there is no statistical benefit of using non-linear maps, according to the users experience. These negative results suggest that further investigation into model architectures beyond local linear models may offer diminishing returns for improving user experience in data-driven teleoperation systems.
Efficient Design-and-Control Automation with Reinforcement Learning and Adaptive Exploration
Jiajun Fan
Hongyao Tang
Mariano Phielipp
Santiago Miret
Seeking good designs is a central goal of many important domains, such as robotics, integrated circuits (IC), medicine, and materials scienc… (see more)e. These design problems are expensive, time-consuming, and traditionally performed by human experts. Moreover, the barriers to domain knowledge make it challenging to propose a universal solution that generalizes to different design problems. In this paper, we propose a new method called Efficient Design and Stable Control (EDiSon) for automatic design and control in different design problems. The key ideas of our method are (1) interactive sequential modeling of the design and control process and (2) adaptive exploration and design replay. To decompose the difficulty of learning design and control as a whole, we leverage sequential modeling for both the design process and control process, with a design policy to generate step-by-step design proposals and a control policy to optimize the objective by operating the design. With deep reinforcement learning (RL), the policies learn to find good designs by maximizing a reward signal that evaluates the quality of designs. Furthermore, we propose an adaptive exploration and replay mechanism based on a design memory that maintains high-quality designs generated so far. By regulating between constructing a design from scratch or replaying a design from memory to refine it, EDiSon balances the trade-off between exploration and exploitation in the design space and stabilizes the learning of the control policy. In the experiments, we evaluate our method in robotic morphology design and Tetris-based design tasks. Our framework has the potential to significantly accelerate the discovery of optimized designs across diverse domains, including automated materials discovery, by improving the exploration in design space while ensuring efficiency.
Efficient Design-and-Control Automation with Reinforcement Learning and Adaptive Exploration
Jiajun Fan
Hongyao Tang
Mariano Phielipp
Santiago Miret
Seeking good designs is a central goal of many important domains, such as robotics, integrated circuits (IC), medicine, and materials scienc… (see more)e. These design problems are expensive, time-consuming, and traditionally performed by human experts. Moreover, the barriers to domain knowledge make it challenging to propose a universal solution that generalizes to different design problems. In this paper, we propose a new method called Efficient Design and Stable Control (EDiSon) for automatic design and control in different design problems. The key ideas of our method are (1) interactive sequential modeling of the design and control process and (2) adaptive exploration and design replay. To decompose the difficulty of learning design and control as a whole, we leverage sequential modeling for both the design process and control process, with a design policy to generate step-by-step design proposals and a control policy to optimize the objective by operating the design. With deep reinforcement learning (RL), the policies learn to find good designs by maximizing a reward signal that evaluates the quality of designs. Furthermore, we propose an adaptive exploration and replay mechanism based on a design memory that maintains high-quality designs generated so far. By regulating between constructing a design from scratch or replaying a design from memory to refine it, EDiSon balances the trade-off between exploration and exploitation in the design space and stabilizes the learning of the control policy. In the experiments, we evaluate our method in robotic morphology design and Tetris-based design tasks. Our framework has the potential to significantly accelerate the discovery of optimized designs across diverse domains, including automated materials discovery, by improving the exploration in design space while ensuring efficiency.