Publications

Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning
Maziar Gomrokchi
Susan Amin
Hossein Aboutalebi
Alexander Wong
While significant research advances have been made in the field of deep reinforcement learning, there have been no concrete adversarial atta… (see more)ck strategies in literature tailored for studying the vulnerability of deep reinforcement learning algorithms to membership inference attacks. In such attacking systems, the adversary targets the set of collected input data on which the deep reinforcement learning algorithm has been trained. To address this gap, we propose an adversarial attack framework designed for testing the vulnerability of a state-of-the-art deep reinforcement learning algorithm to a membership inference attack. In particular, we design a series of experiments to investigate the impact of temporal correlation, which naturally exists in reinforcement learning training data, on the probability of information leakage. Moreover, we compare the performance of collective and individual membership attacks against the deep reinforcement learning algorithm. Experimental results show that the proposed adversarial attack framework is surprisingly effective at inferring data with an accuracy exceeding 84% in individual and 97% in collective modes in three different continuous control Mujoco tasks, which raises serious privacy concerns in this regard. Finally, we show that the learning state of the reinforcement learning algorithm influences the level of privacy breaches significantly.
Meta Pseudo Labels for Anomaly Detection via Partially Observed Anomalies
Sinong Zhao
Zhaoyang Yu
Xiaofei Wang
T. Marbach
Gang Wang
X. Liu
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
Vikas Verma
Yingtian Zou
Sarthak Mittal
Wai Hoh Tang
Hieu Pham
Juho Kannala
Arno Solin
Kenji Kawaguchi
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpol… (see more)ating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
MixupE: Understanding and improving Mixup from directional derivative perspective
Vikas Verma
Yingtian Zou
Sarthak Mittal
Wai Hoh Tang
Hieu Pham
Juho Kannala
Arno Solin
Kenji Kawaguchi
Motor cortex latent dynamics 1 encode arm movement direction and 2 urgency independently 3
Andrea Colins Rodriguez
Lee Miller
Mark D. Humphries
10 The fluid movement of an arm is controlled by multiple parameters that can be set 11 independently. Recent studies argue that arm moveme… (see more)nts are generated by the collective 12 dynamics of neurons in motor cortex. But how these collective dynamics simultaneously encode 13 and control multiple parameters of movement is an open question. Using a task where monkeys 14 made sequential, varied arm movements, we show that the direction and urgency of arm 15 movements are simultaneously encoded in the low-dimensional trajectories of population 16 activity: each movement’s direction by a fixed, looped neural trajectory and its urgency by how 17 quickly that trajectory was traversed. Network models showed this latent coding is potentially 18 advantageous as it allows the direction and urgency of arm movement to be independently 19 controlled. Our results suggest how low-dimensional neural dynamics can define multiple 20 parameters of goal-directed movement simultaneously. 21
Motor cortex latent dynamics 1 encode arm movement direction and 2 urgency independently 3
Andrea Colins Rodriguez
Lee Miller
Mark D. Humphries
10 The fluid movement of an arm is controlled by multiple parameters that can be set 11 independently. Recent studies argue that arm moveme… (see more)nts are generated by the collective 12 dynamics of neurons in motor cortex. But how these collective dynamics simultaneously encode 13 and control multiple parameters of movement is an open question. Using a task where monkeys 14 made sequential, varied arm movements, we show that the direction and urgency of arm 15 movements are simultaneously encoded in the low-dimensional trajectories of population 16 activity: each movement’s direction by a fixed, looped neural trajectory and its urgency by how 17 quickly that trajectory was traversed. Network models showed this latent coding is potentially 18 advantageous as it allows the direction and urgency of arm movement to be independently 19 controlled. Our results suggest how low-dimensional neural dynamics can define multiple 20 parameters of goal-directed movement simultaneously. 21
Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads
Vincent Mai
Philippe Maisonneuve
Tianyu Zhang
Hadi Nekoei
To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale var… (see more)iations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Igor Mordatch
Ofir Nachum
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Igor Mordatch
Ofir Nachum
Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language an… (see more)d vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions - for example, videos of game-play are much more available than sequences of frames paired with the logged game controls. We propose to circumvent this challenge by combining large but sparsely-annotated datasets from a \emph{target} environment of interest with fully-annotated datasets from various other \emph{source} environments. Our method, Action Limited PreTraining (ALPT), leverages the generalization capabilities of inverse dynamics modelling (IDM) to label missing action data in the target environment. We show that utilizing even one additional environment dataset of labelled data during IDM pretraining gives rise to substantial improvements in generating action labels for unannotated sequences. We evaluate our method on benchmark game-playing environments and show that we can significantly improve game performance and generalization capability compared to other approaches, even when using annotated datasets equivalent to only 12 minutes of gameplay.
Multivariate Time-Series Anomaly Detection with Temporal Self-supervision and Graphs: Application to Vehicle Failure Prediction
Hadi Hojjati
Mohammadreza Sadeghi
Neighbor Auto-Grouping Graph Neural Networks for Handover Parameter Configuration in Cellular Network
Mehrtash Mehrabi
Walid Masoudimansour
Yingxue Zhang
Jie Chuai
Zhitang Chen
Jianye Hao
Yanhui. Geng
Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization
Chris Junchi Li
Angela Yuan
Quanquan Gu
Michael Jordan
We propose a new first-order optimization algorithm --- AcceleratedGradient-OptimisticGradient (AG-OG) Descent Ascent---for separable convex… (see more)-concave minimax optimization. The main idea of our algorithm is to carefully leverage the structure of the minimax problem, performing Nesterov acceleration on the individual component and optimistic gradient on the coupling component. Equipped with proper restarting, we show that AG-OG achieves the optimal convergence rate (up to a constant) for a variety of settings, including bilinearly coupled strongly convex-strongly concave minimax optimization (bi-SC-SC), bilinearly coupled convex-strongly concave minimax optimization (bi-C-SC), and bilinear games. We also extend our algorithm to the stochastic setting and achieve the optimal convergence rate in both bi-SC-SC and bi-C-SC settings. AG-OG is the first single-call algorithm with optimal convergence rates in both deterministic and stochastic settings for bilinearly coupled minimax optimization problems.