InfoBot: Structured Exploration in ReinforcementLearning Using Information Bottleneck
Anirudh Goyal
Riashat Islam
D. Strouse
Zafarali Ahmed
Matthew Botvinick
Sergey Levine
InfoBot: Transfer and Exploration via the Information Bottleneck
Anirudh Goyal
Riashat Islam
DJ Strouse
Zafarali Ahmed
Matthew Botvinick
Sergey Levine
A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postula… (voir plus)te that in the absence of useful reward signals, an effective exploration strategy should seek out {\it decision states}. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy
Alex Lamb
Vikas Verma
Juho Kannala
Adversarial robustness has become a central goal in deep learning, both in theory and practice. However, successful methods to improve adver… (voir plus)sarial robustness (such as adversarial training) greatly hurt generalization performance on the clean data. This could have a major impact on how adversarial robustness affects real world systems (i.e. many may opt to forego robustness if it can improve performance on the clean data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases clean test error from 5.8% to 16.7%, whereas with our Interpolated adversarial training we retain adversarial robustness while achieving a clean test error of only 6.5%. With our technique, the relative error increase for the robust model is reduced from 187.9% to just 12.1%.
Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy
Alex Lamb
Vikas Verma
Juho Kannala
Adversarial robustness has become a central goal in deep learning, both in theory and practice. However, successful methods to improve adver… (voir plus)sarial robustness (such as adversarial training) greatly hurt generalization performance on the clean data. This could have a major impact on how adversarial robustness affects real world systems (i.e. many may opt to forego robustness if it can improve performance on the clean data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases clean test error from 5.8% to 16.7%, whereas with our Interpolated adversarial training we retain adversarial robustness while achieving a clean test error of only 6.5%. With our technique, the relative error increase for the robust model is reduced from 187.9% to just 12.1%.
Lagrangian neurodynamics for real-time error-backpropagation across cortical areas
Dominik Dold
Akos F. Kungl
João Sacramento
Mihai A. Petrovici
Kaspar Schindler
Jonathan Binas
Walter Senn
.
Learning Brain Dynamics from Calcium Imaging with Coupled van der Pol and LSTM
Germán Abrevaya
Aleksandr Y. Aravkin
Guillermo Cecchi
James Kozloski
Pablo Polosecki
Peng Zheng
Silvina Ponce Dawson
Juliana Y. Rhee
David Daniel Cox
Many real-world data sets, especially in biology, are produced by complex nonlinear dynamical systems. In this paper, we focus on brain calc… (voir plus)ium imaging (CaI) of different organisms (zebrafish and rat), aiming to build a model of joint activation dynamics in large neuronal populations, including the whole brain of zebrafish. We propose a new approach for capturing dynamics of temporal SVD components that uses the coupled (multivariate) van der Pol (VDP) oscillator, a nonlinear ordinary differential equation (ODE) model describing neural activity, with a new parameter estimation technique that combines variable projection optimization and stochastic search. We show that the approach successfully handles nonlinearities and hidden state variables in the coupled VDP. The approach is accurate, achieving 0.82 to 0.94 correlation between the actual and model-generated components, and interpretable, as VDP’s coupling matrix reveals anatomically meaningful positive (excitatory) and negative (inhibitory) interactions across different brain subsystems corresponding to spatial SVD components. Moreover, VDP is comparable to (or sometimes better than) recurrent neural networks (LSTM) for (short-term) prediction of future brain activity; VDP needs less parameters to train, which was a plus on our small training data. Finally, the overall best predictive method, greatly outperforming both VDP and LSTM in shortand long-term predicitve settings on both datasets, was the new hybrid VDP-LSTM approach that used VDP to simulate large domain-specific dataset for LSTM pretraining; note that simple LSTM data-augmentation via noisy versions of training data was much less effective.
Learning deep representations by mutual information estimation and maximization
Alex Fedorov
Samuel Lavoie-Marchildon
Karan Grewal
Adam Trischler
Phil Bachman
This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep n… (voir plus)eural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation’s suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks in with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.
Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference
Matthew D Riemer
Ignacio Cases
Robert Ajemian
Miao Liu
Yuhai Tu
Gerald Tesauro
Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neura… (voir plus)l network learning to more human realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models
Yuanshuo Zhou
Bradley Gram-Hansen
Tobias Kohn
Tom Rainforth
Hongseok Yang
F. Wood
We develop a new Low-level, First-order Probabilistic Programming Language~(LF-PPL) suited for models containing a mix of continuous, discre… (voir plus)te, and/or piecewise-continuous variables. The key success of this language and its compilation scheme is in its ability to automatically distinguish parameters the density function is discontinuous with respect to, while further providing runtime checks for boundary crossings. This enables the introduction of new inference engines that are able to exploit gradient information, while remaining efficient for models which are not everywhere differentiable. We demonstrate this ability by incorporating a discontinuous Hamiltonian Monte Carlo (DHMC) inference engine that is able to deliver automated and efficient inference for non-differentiable models. Our system is backed up by a mathematical formalism that ensures that any model expressed in this language has a density with measure zero discontinuities to maintain the validity of the inference engine.
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
Thibault De Boissière
Lucas Gestin
Wei Zhen Teoh
Jose Sotelo
Alexandre De Brébisson
Neural Multisensory Scene Inference
Jae Hyun Lim
Pedro O. Pinheiro
Sungjin Ahn
For embodied agents to infer representations of the underlying 3D physical world they inhabit, they should efficiently combine multisensory … (voir plus)cues from numerous trials, e.g., by looking at and touching objects. Despite its importance, multisensory 3D scene representation learning has received less attention compared to the unimodal setting. In this paper, we propose the Generative Multisensory Network (GMN) for learning latent representations of 3D scenes which are partially observable through multiple sensory modalities. We also introduce a novel method, called the Amortized Product-of-Experts, to improve the computational efficiency and the robustness to unseen combinations of modalities at test time. Experimental results demonstrate that the proposed model can efficiently infer robust modality-invariant 3D-scene representations from arbitrary combinations of modalities and perform accurate cross-modal generation. To perform this exploration we have also developed a novel multi-sensory simulation environment for embodied agents.
Ordered Memory
Yikang Shen
Shawn Tan
Arian Hosseini
Zhouhan Lin
Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficult… (voir plus)y of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory. We also introduce a new Gated Recursive Cell to compose lower-level representations into higher-level representation. We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps (Nangia and Bowman, 2018) task. We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparatively with the state-of-the-art methods in the literature.