2021-12
Gradient Starvation: A Learning Proclivity in Neural Networks
Flexible Option Learning
On the Expressivity of Markov Reward
NEURIPS 2021
(2021-12-06)
proceedings.neurips.cc (PDF) [Also on arXiv preprint arXiv:2111.00876 (2021-11-01)]
Temporally Abstract Partial Models
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
2021-11
Estimating treatment effect for individuals with progressive multiple sclerosis using deep learning
2021-10
Temporal Abstraction in Reinforcement Learning with the Successor Representation.
Reward is enough
2021-09
Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification
Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning.
A Survey of Exploration Methods in Reinforcement Learning.
2021-08
Policy Gradients Incorporating the Future
2021-07
Preferential Temporal Difference Learning
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Randomized Exploration in Reinforcement Learning with General Value Function Approximation
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
2021-06
Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning.
Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL.
2021-05
AndroidEnv: A Reinforcement Learning Platform for Android.
Practical Marginalized Importance Sampling with the Successor Representation
Correcting Momentum in Temporal Difference Learning
Offline Policy Optimization with Variance Regularization
Conditional Networks
2021-04
What is Going on Inside Recurrent Meta Reinforcement Learning Agents
2021-03
Training a First-Order Theorem Prover from Synthetic Data
2021-02
Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata
Variance Penalized On-Policy and Off-Policy Actor-Critic
2021-01
Self-Supervised Attention-Aware Reinforcement Learning.
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
Knowledge Engineering Review
(2021-01-01)
www.cambridge.org (PDF) [Also on arXiv preprint arXiv:1807.08060 (2018-07-21)]
2020-12
Towards Continual Reinforcement Learning: A Review and Perspectives.
Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites
Fast reinforcement learning with generalized policy updates
Proceedings of the National Academy of Sciences of the United States of America
(2020-12-01)
europepmc.org (PDF)
2020-11
Gradient Starvation: A Learning Proclivity in Neural Networks
Diversity-Enriched Option-Critic.
A Study of Policy Gradient on a Class of Exactly Solvable Models.
2020-10
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning.
A Fully Tensorized Recurrent Neural Network.
2020-09
Keynote Lecture: Building Knowledge for AI Agents with Reinforcement Learning
2020-08
Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks. (arXiv:2008.08838v1 [cs.LG])
arXiv Computer Science
(2020-08-21)
Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks.
Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks.
2020-07
What can I do here? A Theory of Affordances in Reinforcement Learning
Invariant Causal Prediction for Block MDPs
Interference and Generalization in Temporal Difference Learning
SVRG for Policy Evaluation with Fewer Gradient Evaluations
2020-06
Learning to Prove from Synthetic Theorems.
A Brief Look at Generalization in Visual Meta-Reinforcement Learning
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms.
AISTATS 2020
(2020-06-03)
proceedings.mlr.press (PDF) [Also on arXiv preprint arXiv:2003.12239 (2020-03-27)]
Value Preserving State-Action Abstractions
2020-05
META-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation
Option-Critic in Cooperative Multi-agent Systems
Gifting in Multi-Agent Reinforcement Learning
2020-04
Gifting in Multi-Agent Reinforcement Learning (Student Abstract)
Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction.
Options of Interest: Temporal Abstraction with Interest Functions
2020-03
Invariant Causal Prediction for Block MDPs.
Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting
2020-02
Policy Evaluation Networks.
oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions.
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces.
Provably efficient reconstruction of policy networks.
Assessment of Extubation Readiness Using Spontaneous Breathing Trials in Extremely Preterm Neonates.
2020-01
Exploring Bayesian Deep Learning Uncertainty Measures for Segmentation of New Lesions in Longitudinal MRIs
On Efficiency in Hierarchical Reinforcement Learning
Forethought and Hindsight in Credit Assignment
Reward Propagation Using Graph Convolutional Networks
Learning to cooperate: Emergent communication in multi-agent navigation.
Cognitive Science
(2020-01-01)
cogsci.mindmodeling.org (PDF) [LATEST on arXiv preprint arXiv:2004.01097 (2020-04-02)]
Exploring uncertainty measures in deep networks for Multiple sclerosis lesion detection and segmentation.
Value-driven Hindsight Modelling
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
2019-12
Shaping representations through communication: community size effect in artificial learning systems
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning.
Marginalized State Distribution Entropy Regularization in Policy Optimization
Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
Hindsight Credit Assignment
2019-11
Option-Critic in Cooperative Multi-agent Systems
Sidewalk Environment for Visual Navigation
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
AISTATS 2019
(2019-11-01)
proceedings.mlr.press [LATEST on arXiv preprint arXiv:1911.05010 (2019-11-12)]
2019-10
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Conference on Robot Learning
(2019-10-29)
proceedings.mlr.press (PDF) [Also on arXiv preprint arXiv:1910.13249 (2019-10-29)]
Actor Critic with Differentially Private Critic.
Improving Pathological Structure Segmentation via Transfer Learning Across Diseases
Early Prediction of Alzheimer's Disease Progression Using Variational Autoencoders.
Recurrent Value Functions. (arXiv:1905.09562v1 [cs.LG])
arXiv Computer Science
(2019-10-07)
Singular value automata and approximate minimization
Augmenting learning using symmetry in a biologically-inspired domain
2019-09
Assessing Generalization in TD methods for Deep Reinforcement Learning
Avoidance Learning Using Observational Reinforcement Learning
Revisit Policy Optimization in Matrix Form.
2019-07
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation.
Learning Options with Interest Functions
Leveraging Observations in Bandits: Between Risks and Benefits
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
2019-06
Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia
2019-05
Singular value automata and approximate minimization
Off-Policy Deep Reinforcement Learning without Exploration
Per-Decision Option Discounting
Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data
Building Knowledge for AI Agents with Reinforcement Learning
Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
2019-04
Faster and More Accurate Trace-based Policy Evaluation via Overall Target Error Meta-Optimization
META-Learning State-based λ for More Sample-Efficient Policy Evaluation
The Termination Critic
AISTATS 2019
(2019-04-11)
ui.adsabs.harvard.edu (PDF) [Also on arXiv preprint arXiv:1902.09996 (2019-02-26)]
2019-03
Learning proposals for sequential importance samplers using reinforced variational inference.
2019-02
The Impact of Time Interval between Extubation and Reintubation on Death or Bronchopulmonary Dysplasia in Extremely Preterm Infants.
2019-01
The Option Keyboard: Combining Skills in Reinforcement Learning
Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks
Community size effect in artificial learning systems.
Learning Reliable Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
Temporally Extended Metrics for Markov Decision Processes.
Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
2018-12
Clustering-Oriented Representation Learning with Attractive-Repulsive Loss.
Prediction of Progression in Multiple Sclerosis Patients
International Conference on Medical Imaging with Deep Learning -- Full Paper Track
(2018-12-13)
openreview.net (PDF)
2018-11
Environments for Lifelong Reinforcement Learning.
The Barbados 2018 List of Open Issues in Continual Learning.
Temporal Regularization in Markov Decision Process
2018-09
Where Off-Policy Deep Reinforcement Learning Fails
Shaping representations through communication
2018-08
A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants.
Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing
Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants
2018-07
Attend Before you Act: Leveraging human visual attention for continual learning.
Leveraging Observational Learning for Exploration in Bandits
Eligibility Traces for Options
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
AISTATS 2018
(2018-07-04)
proceedings.mlr.press (PDF) [Also on arXiv preprint arXiv:1807.01406 (2018-07-04)]
Convergent Tree Backup and Retrace with Function Approximation
2018-06
Diffusion-Based Approximate Value Functions
Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.
2018-05
Dyna Planning using a Feature Based Generative Model.
Learning Safe Policies with Expert Guidance
2018-03
Nonlinear Weighted Finite Automata
Constructing Temporal Abstractions Autonomously in Reinforcement Learning
2018-02
Disentangling the independently controllable factors of variation by interacting with the world
Learning with Options that Terminate Off-Policy
When Waiting is not an Option: Learning Options with a Deliberation Cost
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Deep Reinforcement Learning that Matters
2018-01
Patterns of reintubation in extremely preterm infants: a longitudinal cohort study
Learning Predictive State Representations From Non-Uniform Sampling.
Imitation Upper Confidence Bound for Bandits on a Graph.
Learning Robust Options.
Temporal Regularization for Markov Decision Process
Publications collected and formatted using Paperoni