Precision public health: Dream or reality?
Maureen Dobbins
Accelerating Smooth Games by Manipulating Spectral Shapes
Deep Active Learning: Unified and Principled Method for Query and Training
Changjian Shui
Fan Zhou
Boyu Wang
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems
Sharan Vaswani
Abbas Mehrabian
Branislav Kveton
A principled approach for generating adversarial images under non-smooth dissimilarity metrics
Aram-Alexandre Pooladian
Chris Finlay
Tim Hoheisel
A Reduction from Reinforcement Learning to No-Regret Online Learning
Ching-An Cheng
Remi Tachet des Combes
Byron Boots
We present a reduction from reinforcement learning (RL) to no-regret online learning based on the saddle-point formulation of RL, by which "… (voir plus)any" online algorithm with sublinear regret can generate policies with provable performance guarantees. This new perspective decouples the RL problem into two parts: regret minimization and function approximation. The first part admits a standard online-learning analysis, and the second part can be quantified independently of the learning algorithm. Therefore, the proposed reduction can be used as a tool to systematically design new RL algorithms. We demonstrate this idea by devising a simple RL algorithm based on mirror descent and the generative-model oracle. For any
Stochastic Neural Network with Kronecker Flow
Chin-Wei Huang
Ahmed Touati
Alexandre Lacoste
Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to… (voir plus) scale to the high-dimensional setting of stochastic neural networks. This limitation motivates a need for scalable parameterizations of the noise generation process, in a manner that adequately captures the dependencies among the various parameters. In this work, we address this need and present the Kronecker Flow, a generalization of the Kronecker product to invertible mappings designed for stochastic neural networks. We apply our method to variational Bayesian neural networks on predictive tasks, PAC-Bayes generalization bound estimation, and approximate Thompson sampling in contextual bandits. In all setups, our methods prove to be competitive with existing methods and better than the baselines.
Value Preserving State-Action Abstractions
David Abel
Nathan Umbanhowar
Dilip Arumugam
Michael L. Littman
Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information… (voir plus), potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve the representation of near-optimal policies. We first define φ-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for φ-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, φ-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.ion can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve the representation of near-optimal policies. We first define φ-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for φ-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, φ-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
Value Preserving State-Action Abstractions
David Abel
Nathan Umbanhowar
Dilip Arumugam
Michael L. Littman
Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information… (voir plus), potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve the representation of near-optimal policies. We first define φ-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for φ-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, φ-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.ion can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve the representation of near-optimal policies. We first define φ-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for φ-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, φ-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
The need for privacy with public digital contact tracing during the COVID-19 pandemic
Richard Janda
Yun William Yu
Daphne Ippolito
Max Jarvie
Dan Pilat
Brooke Struck
Sekoul Krastev
Abhinav Sharma
Training End-to-End Analog Neural Networks with Equilibrium Propagation
Jack D. Kendall
Ross D. Pantone
Kalpana Manickavasagam
Benjamin Scellier
We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent. In these analog neural networks,… (voir plus) the weights to be adjusted are implemented by the conductances of programmable resistive devices such as memristors [Chua, 1971], and the nonlinear transfer functions (or `activation functions') are implemented by nonlinear components such as diodes. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits. This property enables us to train them using the Equilibrium Propagation framework [Scellier and Bengio, 2017]. Our update rule for each conductance, which is local and relies solely on the voltage drop across the corresponding resistor, is shown to compute the gradient of the loss function. Our numerical simulations, which use the SPICE-based Spectre simulation framework to simulate the dynamics of electrical circuits, demonstrate training on the MNIST classification task, performing comparably or better than equivalent-size software-based neural networks. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.