2020-12
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards.
2020-11
Gradient Starvation: A Learning Proclivity in Neural Networks
2020-10
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces.
2020-09
Keynote Lecture Building Knowledge For AI Agents With Reinforcement Learning
2020-08
Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks
Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks.
Fast reinforcement learning with generalized policy updates
Proceedings of the National Academy of Sciences of the United States of America
(2020-08-17)
syndication.highwire.org
2020-07
What can I do here? A Theory of Affordances in Reinforcement Learning
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay.
Invariant Causal Prediction for Block MDPs
Interference and Generalization in Temporal Difference Learning
SVRG for Policy Evaluation with Fewer Gradient Evaluations
2020-06
Learning to Prove from Synthetic Theorems.
A Brief Look at Generalization in Visual Meta-Reinforcement Learning
2020-04
Gifting in Multi-Agent Reinforcement Learning (Student Abstract).
Learning to cooperate: Emergent communication in multi-agent navigation.
2020-03
Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting
2020-02
Policy Evaluation Networks.
oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions.
Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Options of Interest: Temporal Abstraction with Interest Functions
Provably efficient reconstruction of policy networks.
Assessment of Extubation Readiness Using Spontaneous Breathing Trials in Extremely Preterm Neonates.
2020-01
Forethought and Hindsight in Credit Assignment
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Reward Propagation Using Graph Convolutional Networks
On Efficiency in Hierarchical Reinforcement Learning
Value-driven Hindsight Modelling
Meta-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation.
Autonomous Agents and Multi-Agent Systems
(2020-01-01)
dblp.uni-trier.de [LATEST on arXiv: Learning (2020-05-18)]
Exploring uncertainty measures in deep networks for Multiple sclerosis lesion detection and segmentation.
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms.
Gifting in Multi-Agent Reinforcement Learning.
Value Preserving State-Action Abstractions.
2019-12
Shaping representations through communication: community size effect in artificial learning systems
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning.
Marginalized State Distribution Entropy Regularization in Policy Optimization
Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
The Option Keyboard: Combining Skills in Reinforcement Learning
Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks
Hindsight Credit Assignment
2019-11
Option-critic in cooperative multi-agent systems
Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
AISTATS 2019
(2019-11-01)
proceedings.mlr.press [LATEST on arXiv preprint arXiv:1911.05010 (2019-11-12)]
2019-10
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Actor Critic with Differentially Private Critic.
Improving Pathological Structure Segmentation via Transfer Learning Across Diseases
Early Prediction of Alzheimer's Disease Progression Using Variational Autoencoders.
Singular value automata and approximate minimization
Augmenting learning using symmetry in a biologically-inspired domain
2019-09
Value-driven Hindsight Modelling.
Assessing Generalization in TD methods for Deep Reinforcement Learning
Avoidance Learning Using Observational Reinforcement Learning
Revisit Policy Optimization in Matrix Form.
2019-07
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation.
Learning Options with Interest Functions
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
2019-06
Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia
Off-Policy Deep Reinforcement Learning without Exploration
Per-Decision Option Discounting
2019-05
Singular value automata and approximate minimization
Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data
Recurrent Value Functions.
Building Knowledge for AI Agents with Reinforcement Learning
Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
2019-04
Faster and More Accurate Trace-based Policy Evaluation via Overall Target Error Meta-Optimization
Meta-Learning State-based λ for More Sample-Efficient Policy Evaluation
The Termination Critic
AISTATS 2019
(2019-04-11)
proceedings.mlr.press [Also on arXiv preprint arXiv:1902.09996 (2019-02-26)]
2019-03
Learning proposals for sequential importance samplers using reinforced variational inference.
2019-02
The Impact of Time Interval between Extubation and Reintubation on Death or Bronchopulmonary Dysplasia in Extremely Preterm Infants.
2019-01
Leveraging observations in bandits: Between risks and benefits
Combined Reinforcement Learning via Abstract Representations
Community size effect in artificial learning systems.
Learning Reliable Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
Temporally Extended Metrics for Markov Decision Processes.
Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
2018-12
Clustering-Oriented Representation Learning with Attractive-Repulsive Loss.
Prediction of Progression in Multiple Sclerosis Patients
International Conference on Medical Imaging with Deep Learning -- Full Paper Track
(2018-12-13)
openreview.net
Learning safe policies with expert guidance
Temporal Regularization for Markov Decision Process
2018-11
Environments for Lifelong Reinforcement Learning.
The Barbados 2018 List of Open Issues in Continual Learning.
Temporal Regularization in Markov Decision Process
2018-09
Where Off-Policy Deep Reinforcement Learning Fails
Shaping representations through communication
2018-08
A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants.
Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing
2018-07
Attend Before you Act: Leveraging human visual attention for continual learning.
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
Convergent Tree-Backup and Retrace with Function Approximation
Leveraging Observational Learning for Exploration in Bandits
Eligibility Traces for Options
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
AISTATS 2018
(2018-07-04)
proceedings.mlr.press [Also on arXiv preprint arXiv:1807.01406 (2018-07-04)]
Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants
2018-06
Diffusion-Based Approximate Value Functions
2018-05
Dyna Planning using a Feature Based Generative Model.
2018-03
Nonlinear Weighted Finite Automata
Constructing Temporal Abstractions Autonomously in Reinforcement Learning
2018-02
Disentangling the independently controllable factors of variation by interacting with the world
Learning with Options that Terminate Off-Policy
When Waiting is not an Option: Learning Options with a Deliberation Cost
Learning Predictive State Representations from Non-uniform Sampling
Learning Robust Options
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Deep Reinforcement Learning that Matters
2018-01
Patterns of reintubation in extremely preterm infants: a longitudinal cohort study.
Imitation Upper Confidence Bound for Bandits on a Graph.
Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.
Publications collected and formatted using Paperoni