Safe option-critic: learning safety in the option-critic architecture
Abstract Designing hierarchical reinforcement learning algorithms that exhibit safe behaviour is not only vital for practical applications but also facilitates a better understanding of an agent’s decisions. We tackle this problem in the options framework (Sutton, Precup & Singh, 1999), a particular way to specify temporally abstract actions which allow an agent to use sub-policies with start and end conditions. We consider a behaviour as safe that avoids regions of state space with high uncertainty in the outcomes of actions. We propose an optimization objective that learns safe options by encouraging the agent to visit states with higher behavioural consistency. The proposed objective results in a trade-off between maximizing the standard expected return and minimizing the effect of model uncertainty in the return. We propose a policy gradient algorithm to optimize the constrained objective function. We examine the quantitative and qualitative behaviours of the proposed approach in a tabular grid world, continuous-state puddle world, and three games from the Arcade Learning Environment: Ms. Pacman, Amidar, and Q*Bert. Our approach achieves a reduction in the variance of return, boosts performance in environments with intrinsic variability in the reward structure, and compares favourably both with primitive actions and with risk-neutral options.
Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark
Vincent Dumoulin
Neil Houlsby
Utku Evci
Xiaohua Zhai
Sylvain Gelly
Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we perform a cross-family study of the best transfer and meta learners on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. In performing this study, we reveal a number of discrepancies in evaluation norms and study some of these in light of the performance gap. We hope that this work facilitates sharing of insights from each community, and accelerates progress on few-shot learning.
Understanding Continual Learning Settings with Data Distribution Drift Analysis
Timothee LESORT
Massimo Caccia
Classical machine learning algorithms often assume that the data are drawn i.i.d. from a stationary probability distribution. Recently, continual learning emerged as a rapidly growing area of machine learning where this assumption is relaxed, i.e. where the data distribution is non-stationary and changes over time. This paper represents the state of data distribution by a context variable
Discourse-Aware Unsupervised Summarization for Long Scientific Documents
Yue Dong
Andrei Mircea
Touch-based Curiosity for Sparse-Reward Tasks
Sai Rajeswar
Cyril Ibrahim
Nitin Surya
Florian Golemo
David Vazquez
Pedro O. Pinheiro
All in This Together? A Preregistered Report on Deservingness of Government Aid During the COVID-19 Pandemic
Aengus Bridgman
Eric Roman Owen Merkley
Peter John Loewen
Taylor Reid Owen
Derek Ruths
Abstract The COVID-19 pandemic has placed unprecedented pressure on governments to engage in widespread cash transfers directly to citizens to help mitigate economic losses. Major and near-universal redistribution efforts have been deployed, but there is remarkably little understanding of where the mass public believes financial support is warranted. Using experimental evidence, we evaluate whether considerations related to deservingness, similarity, and prejudicial attitudes structure support for these transfers. A preregistered experiment found broad, generous, and nondiscriminatory support for direct cash transfers related to COVID-19 in Canada. The second study, accepted as a preregistered report, further probes these dynamics by comparing COVID-19-related outlays with nonemergency ones. We find that COVID-19-related spending was more universal as compared to a more generic cash allocation program. Given that the results were driven by the income of hypothetical recipients, we find broad support for disaster relief that is not means-tested or otherwise constrained by pre-disaster income.
Fasting alters the gut microbiome reducing blood pressure and body weight in metabolic syndrome patients
András Maifeld
Hendrik Bartolomaeus
Ulrike Löber
Ellen G. Avery
Nico Steckhan
Lajos Markó
Nicola Wilck
Ibrahim Hamad
Urša Šušnjar
Anja Mähler
Christoph Hohmann
Chia-Yu Chen
Holger Cramer
Gustav Dobos
Till Robin Lesker
Till Strowig
Ralf Dechend
Markus Kleinewietfeld
Andreas Michalsen
Dominik N. Müller
Sofia K. Forslund
Loneliness and Neurocognitive Aging
R. Nathan Spreng
Mapping gene transcription and neurocognition across human neocortex
Justine Y. Hansen
Ross D Markello
Jacob W. Vogel
Jakob Seidlitz
Bratislav Mišić
Evaluating the Integration of One Health in Surveillance Systems for Antimicrobial Use and Resistance: A Conceptual Framework
Cécile Aenishaenslin
Barbara Häsler
André Ravel
E. Jane Parmley
Sarah Mediouni
Houda Bennani
Katharina D. C. Stärk
It is now widely acknowledged that surveillance of antimicrobial resistance (AMR) must adopt a “One Health” (OH) approach to successfully address the significant threats this global public health issue poses to humans, animals, and the environment. While many protocols exist for the evaluation of surveillance, the specific aspect of the integration of an OH approach into surveillance systems for AMR and antimicrobial use (AMU) suffers from a lack of common and accepted guidelines and metrics for its monitoring and evaluation functions. This article presents a conceptual framework to evaluate the integration of OH in surveillance systems for AMR and AMU, named the Integrated Surveillance System Evaluation framework (ISSE framework). The ISSE framework aims to assist stakeholders and researchers who design an overall evaluation plan to select the relevant evaluation questions and tools. The framework was developed in partnership with the Canadian Integrated Program for Antimicrobial Resistance Surveillance (CIPARS). It consists of five evaluation components, which consider the capacity of the system to: [1] integrate an OH approach, [2] produce OH information and expertise, [3] generate actionable knowledge, [4] influence decision-making, and [5] positively impact outcomes. For each component, a set of evaluation questions is defined, and links to other available evaluation tools are shown. The ISSE framework helps evaluators to systematically assess the different OH aspects of a surveillance system, to gain comprehensive information on the performance and value of these integrated efforts, and to use the evaluation results to refine and improve the surveillance of AMR and AMU globally.
Beyond Correlation versus Causation: Multi-brain Neuroscience Needs Explanation
Quentin Moreau