Finite Sample Complexity Analysis of Binary Segmentation
Binary segmentation is the classic greedy algorithm which recursively splits a sequential data set by optimizing some loss or likelihood function. Binary segmentation is widely used for changepoint detection in data sets measured over space or time, and as a sub-routine for decision tree learning. In theory it should be extremely fast for …
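To make the greedy recursion concrete, here is a minimal sketch of binary segmentation for the common square-loss case. The function names and the O(n) per-split scan are illustrative assumptions (efficient implementations evaluate segment losses in O(1) via cumulative sums); this is not the implementation analyzed in the paper.

import numpy as np

def squared_loss(seg):
    # Square loss of a segment modeled by its mean (0 for an empty slice).
    return float(np.sum((seg - seg.mean()) ** 2)) if len(seg) else 0.0

def binary_segmentation(x, max_splits):
    # Greedy recursion: repeatedly take the split, over all current
    # segments, that most decreases the total square loss.
    segments = [(0, len(x))]           # half-open [start, end) intervals
    changepoints = []
    for _ in range(max_splits):
        best = None                    # (loss decrease, segment index, split)
        for i, (lo, hi) in enumerate(segments):
            base = squared_loss(x[lo:hi])
            for t in range(lo + 1, hi):
                gain = base - squared_loss(x[lo:t]) - squared_loss(x[t:hi])
                if best is None or gain > best[0]:
                    best = (gain, i, t)
        if best is None:               # every segment has length 1
            break
        _, i, t = best
        lo, hi = segments.pop(i)
        segments.extend([(lo, t), (t, hi)])
        changepoints.append(t)
    return sorted(changepoints)

# Example: one split recovers the jump between two constant segments.
x = np.concatenate([np.zeros(50), np.ones(50)])
print(binary_segmentation(x, max_splits=1))  # [50]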
Inferring electric vehicle charging patterns from smart meter data for impact studies
Feng Li
Élodie Campeau
Ilhan Kocar
Innovative transfusion strategies for blood deserts in disaster settings
Shreenik Kundu
Ayla Gerk
Robert Glatter
Long-term outcomes of critically ill patients with hematological malignancies: what is the impact of the coronavirus disease 2019 pandemic? Author's reply
Laveena Munshi
Sangeeta Mehta
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
Claas Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
Building deep reinforcement learning (RL) agents that find a good policy with few samples has proven notoriously challenging. To achieve sample efficiency, recent work has explored updating neural networks with large numbers of gradient steps for every new sample. While such high update-to-data (UTD) ratios have shown strong empirical performance, they also introduce instability to the training process. Previous approaches rely on periodic neural network parameter resets to address this instability, but restarting the training process is infeasible in many real-world applications and requires tuning the resetting interval. In this paper, we focus on one of the core difficulties of stable training with limited samples: the inability of learned value functions to generalize to unobserved on-policy actions. We mitigate this issue directly by augmenting the off-policy RL training process with a small amount of data generated from a learned world model. Our method, Model-Augmented Data for Temporal Difference learning (MAD-TD), uses small amounts of generated data to stabilize high UTD training and achieve competitive performance on the most challenging tasks in the DeepMind control suite. Our experiments further highlight the importance of employing a good model to generate data, MAD-TD's ability to combat value overestimation, and its practical stability gains for continued learning.
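As an illustration of the data-mixing idea described above, here is a hypothetical helper that builds TD targets from a batch that is mostly real replay data plus a small fraction of model-generated transitions. The names mixed_td_targets, model_rollout, and value_fn are assumptions, and the 5% mixing fraction is an illustrative choice, not the authors' configuration.

import numpy as np

def mixed_td_targets(real_batch, model_rollout, value_fn,
                     model_fraction=0.05, gamma=0.99):
    # real_batch: list of (state, action, reward, next_state) tuples.
    # model_rollout: state -> (state, action, reward, next_state), a
    #   one-step rollout of the learned world model under the current policy.
    # value_fn: next_state -> float, the target value estimate.
    n_model = max(1, int(len(real_batch) * model_fraction))
    # A few synthetic on-policy transitions, started from real states,
    # cover actions the replay buffer never observed.
    synthetic = [model_rollout(s) for (s, _, _, _) in real_batch[:n_model]]
    batch = real_batch + synthetic
    targets = np.array([r + gamma * value_fn(ns) for (_, _, r, ns) in batch])
    return batch, targets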
MAP: Model Merging with Amortized Pareto Front Using Limited Computation
Lu Li
Tianyu Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics
Siddharth Viswanath
Dhananjay Bhaskar
David R. Johnson
João Felipe Rocha
Egbert Castro
Jackson Grady
Alex T. Grigas
Michael Perlmutter
Corey S. O'Hern
Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.
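A structural sketch of the pipeline described above, in PyTorch, assuming the geometric-scattering features are precomputed per residue. It shows only a single (residue-level) attention branch plus the regression head; all dimensions, layer counts, and class names are illustrative assumptions rather than the authors' architecture.

import torch
import torch.nn as nn

class ProtSCAPESketch(nn.Module):
    def __init__(self, scatter_dim=128, d_model=64, n_heads=4):
        super().__init__()
        self.proj = nn.Linear(scatter_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.residue_attn = nn.TransformerEncoder(layer, num_layers=2)
        self.time_head = nn.Linear(d_model, 1)   # temporal regression head

    def forward(self, scatter_feats):
        # scatter_feats: (batch, n_residues, scatter_dim), precomputed
        # geometric-scattering features of the protein graph.
        h = self.residue_attn(self.proj(scatter_feats))
        latent = h.mean(dim=1)                   # pool attention over residues
        return latent, self.time_head(latent)    # latent code + time estimate

model = ProtSCAPESketch()
latent, t_hat = model(torch.randn(2, 300, 128))  # 2 frames, 300 residues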
Robust Guided Diffusion for Offline Black-Box Optimization
Can Chen
Christopher Beckham
Zixuan Liu
Offline black-box optimization aims to maximize a black-box function using an offline dataset of designs and their measured properties. Two main approaches have emerged: the forward approach, which learns a mapping from input to its value, thereby acting as a proxy to guide optimization, and the inverse approach, which learns a mapping from value to input for conditional generation. (a) Although proxy-free (classifier-free) diffusion shows promise in robustly modeling the inverse mapping, it lacks explicit guidance from proxies, essential for generating high-performance samples beyond the training distribution. Therefore, we propose proxy-enhanced sampling, which utilizes the explicit guidance from a trained proxy to bolster proxy-free diffusion with enhanced sampling control. (b) Yet, the trained proxy is susceptible to out-of-distribution issues. To address this, we devise the module diffusion-based proxy refinement, which seamlessly integrates insights from proxy-free diffusion back into the proxy for refinement. To sum up, we propose Robust Guided Diffusion for Offline Black-box Optimization (RGD), combining the advantages of proxy (explicit guidance) and proxy-free diffusion (robustness) for effective conditional generation. RGD achieves state-of-the-art results on various design-bench tasks, underscoring its efficacy. Our code is at https://github.com/GGchen1997/RGD.
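A minimal sketch of the proxy-enhanced sampling idea: one reverse-diffusion update that adds the gradient of a trained proxy to a proxy-free (classifier-free) denoising step. The callables denoiser and proxy are assumptions, and the update omits the noise schedule for brevity, so this illustrates the guidance term rather than the paper's sampler.

import torch

def proxy_enhanced_step(x, t, denoiser, proxy, guidance_scale=1.0):
    # denoiser(x, t) predicts the noise of a proxy-free diffusion model;
    # proxy(x) scores a design (higher is better). Both are assumptions.
    x = x.detach().requires_grad_(True)
    eps = denoiser(x, t)
    score = proxy(x).sum()
    grad = torch.autograd.grad(score, x)[0]      # ascend the proxy's value
    # Nudge the denoised direction toward high-proxy regions (noise
    # schedule omitted for brevity).
    return (x - eps + guidance_scale * grad).detach()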
Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks
Alexandre St-Aubin
Amin Abyaneh
Mastering complex sequential tasks continues to pose a significant challenge in robotics. While there has been progress in learning long-horizon manipulation tasks, most existing approaches lack rigorous mathematical guarantees for ensuring reliable and successful execution. In this paper, we extend previous work on learning long-horizon tasks and stable policies, focusing on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that (1) segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals, and (2) learns globally stable dynamical system policies to guide the robot to each subgoal, even in the face of sensory noise and random disturbances. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms. Code is available at https://github.com/Alestaubin/stable-imitation-policy-with-waypoints
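To illustrate why a stable dynamical system policy gives per-subgoal convergence guarantees, consider the toy system x_dot = -A (x - g): when A is positive definite, the subgoal g is globally asymptotically stable, so trajectories converge to it from any start despite bounded disturbances. The linear form and all constants below are illustrative stand-ins for the learned policies in the paper.

import numpy as np

def stable_ds_policy(x, subgoal, A):
    # Velocity command of the linear system x_dot = -A (x - g); with A
    # positive definite, g is globally asymptotically stable.
    return -A @ (x - subgoal)

A = np.diag([2.0, 2.0])                      # positive definite -> stable
x = np.array([1.0, -1.0])
waypoints = [np.array([0.5, 0.0]), np.array([0.0, 0.0])]
for g in waypoints:                          # visit each subgoal in turn
    for _ in range(500):
        x = x + 0.01 * stable_ds_policy(x, g, A)  # Euler integration
print(np.round(x, 3))                        # ends near the final subgoal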
SOAK: Same/Other/All K-fold cross-validation for estimating similarity of patterns in data subsets
Gabrielle Thibault
C. S. Bodine
Paul Nelson Arellano
Alexander F Shenkin
Olivia J. Lindly
In many real-world applications of machine learning, we want to know whether it is possible to train on the data gathered so far and obtain accurate predictions on a new test data subset that is qualitatively different in some respect (time period, geographic region, etc.). Another question is whether data subsets are similar enough that it is beneficial to combine subsets during model training. We propose SOAK, Same/Other/All K-fold cross-validation, a new method which can be used to answer both questions. SOAK systematically compares models trained on different subsets of data, then used for prediction on a fixed test subset, to estimate the similarity of learnable/predictable patterns in data subsets. We show results of using SOAK on six new real data sets (with geographic/temporal subsets, to check whether predictions are accurate on new subsets), three image pair data sets (subsets are different image types, to check that we get smaller prediction error on similar images), and 11 benchmark data sets with predefined train/test splits (to check similarity of predefined splits).
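A minimal sketch of the Same/Other/All comparison under stated assumptions: for each subset and each fold, a model is trained on the Same, Other, or All training set and evaluated on the same held-out fold. The scikit-learn model and the accuracy metric are illustrative choices, not prescribed by the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

def soak_cv(X, y, subset, n_splits=3, seed=0):
    # For each test subset s and each fold, compare training on Same
    # (other folds of s), Other (all other subsets), and All (both).
    results = []
    for s in np.unique(subset):
        idx = np.where(subset == s)[0]
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
        for fold, (_, test_local) in enumerate(kf.split(idx)):
            test = idx[test_local]
            train_sets = {
                "same": np.setdiff1d(idx, test),
                "other": np.where(subset != s)[0],
                "all": np.setdiff1d(np.arange(len(y)), test),
            }
            for name, train in train_sets.items():
                model = LogisticRegression(max_iter=1000)
                model.fit(X[train], y[train])
                acc = accuracy_score(y[test], model.predict(X[test]))
                results.append((s, fold, name, acc))
    return results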