David Meger

Associate Academic Member
Associate Professor, McGill University, School of Computer Science
Research Topics
Computer Vision
Reinforcement Learning

Biography

David Meger is an associate professor at McGill University’s School of Computer Science.

He co-directs the Mobile Robotics Lab within the Centre for Intelligent Machines, one of Canada's largest and longest-running robotics research groups. He was the general chair of Canada’s first joint CS-CAN conference in 2023.

Meger's research contributions include visually guided robots powered by active vision and learning, deep reinforcement learning models that are widely cited and used by researchers and industry worldwide, and field robotics systems that enable autonomous deployment underwater and on land.

Current Students

Master's Research - McGill University
Collaborating Researcher - McGill University (principal supervisor)
PhD - McGill University
PhD - McGill University (co-supervisor)
PhD - McGill University (co-supervisor)
PhD - McGill University (principal supervisor)
Master's Research - McGill University (co-supervisor)
Master's Research - McGill University (co-supervisor)
PhD - McGill University (principal supervisor)
Postdoctorate - McGill University
PhD - McGill University (principal supervisor)
PhD - McGill University
Master's Research - McGill University
PhD - McGill University (co-supervisor)
Master's Research - McGill University

Publications

Hypernetworks for Zero-shot Transfer in Reinforcement Learning
Sahand Rezaei-Shoshtari
Charlotte Morissette
Francois Hogan
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based on viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy; we seek to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
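
As a rough illustration of the core idea (a minimal sketch, not the paper's implementation; the class name, dimensions, and architecture below are all made up), a hypernetwork here takes a task-context vector and emits the weights of a policy network, so an unseen task's policy is produced in a single forward pass:

```python
import torch
import torch.nn as nn

class PolicyHypernetwork(nn.Module):
    """Maps a task-context vector to the weights of a small policy MLP."""

    def __init__(self, context_dim, obs_dim, act_dim, hidden=64):
        super().__init__()
        # Shapes of the generated policy's parameters: obs -> hidden -> act.
        self.shapes = [(hidden, obs_dim), (hidden,), (act_dim, hidden), (act_dim,)]
        n_params = sum(torch.Size(s).numel() for s in self.shapes)
        self.generator = nn.Sequential(
            nn.Linear(context_dim, 256), nn.ReLU(), nn.Linear(256, n_params)
        )

    def forward(self, context, obs):
        flat = self.generator(context)           # all policy weights at once
        chunks, i = [], 0
        for shape in self.shapes:
            n = torch.Size(shape).numel()
            chunks.append(flat[i:i + n].view(shape))
            i += n
        w1, b1, w2, b2 = chunks
        h = torch.relu(obs @ w1.T + b1)          # run the generated policy
        return torch.tanh(h @ w2.T + b2)         # bounded continuous action

# Zero-shot use: an unseen task's context directly yields its policy.
hyper = PolicyHypernetwork(context_dim=4, obs_dim=8, act_dim=2)
action = hyper(torch.randn(4), torch.randn(8))
```

The zero-shot property comes from the fact that no gradient steps are taken at test time: the context alone determines the generated policy.
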
ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence
Dmitriy Rivkin
Nikhil Kakodkar
Oliver Limoyo
Francois Hogan
Our work examines the way in which large language models can be used for robotic planning and sampling in the context of automated photographic documentation. Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general-purpose language (LM) and vision-language (VLM) models. Given a high-level description of an event, we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.
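
The two-stage pipeline is simple to sketch. The snippet below is illustrative only, not the authors' code: `generate_shot_list` is a hypothetical stand-in for the LM call, and an off-the-shelf CLIP model plays the role of the VLM matcher:

```python
import torch
from transformers import CLIPModel, CLIPProcessor

def generate_shot_list(event_description):
    # Hypothetical LM step: prompt a language model for the photos one
    # would expect at the event. Hard-coded here so the sketch runs.
    return ["the keynote speaker at the podium",
            "attendees chatting during the coffee break"]

def pick_best_frames(frames, event_description):
    """For each desired shot, find the video frame the VLM matches best."""
    shots = generate_shot_list(event_description)
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    inputs = processor(text=shots, images=frames,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        # logits_per_text[i, j] scores shot description i against frame j.
        sims = model(**inputs).logits_per_text
    return {shot: int(sims[i].argmax()) for i, shot in enumerate(shots)}
```

Here `frames` would be a list of PIL images sampled from the robot's video stream; the returned indices mark the frames to keep in the portfolio.
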
Normalizing Flow Ensembles for Rich Aleatoric and Epistemic Uncertainty Modeling
Lucas Berry
Learning active tactile perception through belief-space control
Jean-François Tremblay
Johanna Hansen
Francois Hogan
A robot operating in an open world can encounter novel objects with unknown physical properties, such as mass, friction, or size. It is desirable to be able to sense those properties through contact-rich interaction before performing downstream tasks with the objects. We propose a method for autonomously learning active tactile perception policies, by learning a generative world model leveraging a differentiable Bayesian filtering algorithm and designing an information-gathering model predictive controller. We test the method on three simulated tasks: mass estimation, height estimation, and toppling height estimation. Our method is able to discover policies that gather information about the desired property in an intuitive manner.
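
A toy version of the information-gathering loop can clarify the idea (illustrative only: the forward model, noise level, and action set are placeholders, and the paper learns a generative world model rather than hand-coding one). The belief over the unknown property is a particle set, and the controller picks the action expected to shrink that belief the most:

```python
import numpy as np

rng = np.random.default_rng(0)

# Particle belief over one unknown property (here, mass in kg).
particles = rng.uniform(0.1, 5.0, size=500)
weights = np.full(500, 1.0 / 500)

def predicted_obs(mass, force):
    # Placeholder forward model: measured acceleration from a push.
    return force / mass

def expected_posterior_variance(force, n_samples=20, noise=0.05):
    """Score an action by the belief variance expected after observing it."""
    total = 0.0
    for m in rng.choice(particles, size=n_samples, p=weights):
        # Simulate an observation, then do a Bayes update of the particles.
        obs = predicted_obs(m, force) + rng.normal(0.0, noise)
        lik = np.exp(-0.5 * ((obs - predicted_obs(particles, force)) / noise) ** 2)
        w = weights * lik
        w /= w.sum()
        mean = np.average(particles, weights=w)
        total += np.average((particles - mean) ** 2, weights=w)
    return total / n_samples

# Greedy one-step MPC: push with the force that most shrinks the belief.
candidate_forces = np.linspace(0.5, 5.0, 10)
best_force = min(candidate_forces, key=expected_posterior_variance)
```

The proposed method replaces the hand-written forward model with a learned one and filters differentiably, but the control objective is the same in spirit: act to reduce uncertainty about the property of interest.
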