Publications

Robust Policy Learning over Multiple Uncertainty Sets
Annie Xie
Shagun Sodhani
Chelsea Finn
Amy Zhang
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.
Robustness of Whittle Index Policy to Model Approximation
Amit Sinha
Scalable Operator Allocation for Multirobot Assistance: A Restless Bandit Approach
Abhinav Dahiya
Nima Akbarzadeh
Stephen L. Smith
In this article, we consider the problem of allocating human operators in a system with multiple semiautonomous robots. Each robot is required to perform an independent sequence of tasks, subject to a chance of failing and getting stuck in a fault state at every task. If and when required, a human operator can assist or teleoperate a robot. Conventional dynamic programming-based techniques used to solve such problems face scalability issues due to an exponential growth of state and action spaces with the number of robots and operators. In this article, we derive conditions under which the operator allocation problem satisfies a technical condition called indexability, thereby enabling the use of the Whittle index heuristic. The conditions are easy to check, and we show that they hold for a wide range of problems of interest. Our key insight is to leverage the structure of the value function of individual robots, resulting in conditions that can be verified separately for each state of each robot. We apply these conditions to two types of transitions commonly seen in remote robot supervision systems. Through numerical simulations, we demonstrate the efficacy of Whittle index policy as a near-optimal and scalable approach that outperforms existing scalable methods.
Scaling the Number of Tasks in Continual Learning
Timothee LESORT
Oleksiy Ostapenko
Diganta Misra
Md Rifat Arefin
Pau Rodriguez
Sociotechnical Harms: Scoping a Taxonomy for Harm Reduction
Renee Shelby
Shalaleh Rismani
Kathryn Henne
Paul Nicholas
N'mah Fodiatu Yilla
Jess Gallegos
Andrew J Smart
Emilio Garcia
Gurleen Virk
Source-summary Entity Aggregation in Abstractive Summarization
José-ángel González
Annie Priyadarshini Louis
A Synchro-Set-Aided Breadth-First Sphere Decoder for Polar-Coded MIMO Systems
Huayi Zhou
Xiangyun Deng
Yiqian Cai
Yifei Shen
Minhua Yang
Xiaohu You
Chuan Zhang
The joint optimization of multiple-input-multiple-output (MIMO) detection and polar decoding has become a research hotspot for future communication systems. The error-correction performance of the separate detection and decoding (SDD) is far from the Shannon capacity, which cannot meet the requirements of communication scenarios such as ultra-reliable and low latency communications (URLLC). The existing joint detection and decoding (JDD) using breadth-first sphere decoding (BFSD) improves the reliability over SDD but still has a huge performance loss on low-rate codes. In this paper, JDD using synchro-set-aided BFSD (SA-BFSD) is proposed to greatly improve the error-correction performance for polar-coded MIMO systems. We first propose a method to generate the symbol synchro sets through the concept of frozen symbols, then refine the symbol synchro sets based on the characteristics analysis of the channel matrix. We optimize the enumerating order of the symbols and reduce the enumerating levels. The frame error rate (FER) and the bit error rate of the proposed algorithms are significantly improved especially for the low-rate codes. The proposed SA-BFSD JDD achieves an up to 7.8 dB performance gain over BFSD at FER
TACTiS: Transformer-Attentional Copulas for Time Series
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile method, based on the transformer architecture, that estimates joint distributions using an attention-based decoder that provably learns to mimic the properties of non-parametric copulas. The resulting model has several desirable properties: it can scale to hundreds of time series, supports both forecasting and interpolation, can handle unaligned and non-uniformly sampled data, and can seamlessly adapt to missing data during training. We demonstrate these properties empirically and show that our model produces state-of-the-art predictions on multiple real-world datasets.
Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline
Massimo Caccia
Jonas Mueller
Taesup Kim
Rasool Fakoor
We study task-agnostic continual reinforcement learning (TACRL) in which standard RL challenges are compounded with partial observability stemming from task agnosticism, as well as additional difficulties of continual learning (CL), i.e., learning on a non-stationary sequence of tasks. Here we compare TACRL methods with their soft upper bounds prescribed by previous literature: multi-task learning (MTL) methods which do not have to deal with non-stationary data distributions, as well as task-aware methods, which are allowed to operate under full observability. We consider a previously unexplored and straightforward baseline for TACRL, replay-based recurrent RL (3RL), in which we augment an RL algorithm with recurrent mechanisms to address partial observability and experience replay mechanisms to address catastrophic forgetting in CL. Studying empirical performance in a sequence of RL tasks, we find surprising occurrences of 3RL matching and overcoming the MTL and task-aware soft upper bounds. We lay out hypotheses that could explain this inflection point of continual and task-agnostic learning research. Our hypotheses are empirically tested in continuous control tasks via a large-scale study of the popular multi-task and continual learning benchmark Meta-World. By analyzing different training statistics including gradient conflict, we find evidence that 3RL’s outperformance stems from its ability to quickly infer how new tasks relate with the previous ones, enabling forward transfer.
On the benefits of representation regularization in invariance based domain generalization
Changjian Shui
Boyu Wang
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Hugo Laurençon
Lucile Saulnier
Thomas Wang
Christopher Akiki
Albert Villanova del Moral
Teven Le Scao
Leandro Von Werra
Chenghao Mou
Eduardo González Ponferrada
Huu Nguyen
Jörg Frohberg
Mario Šaško
Quentin Lhoest
Angelina McMillan-Major
Gérard Dupont
Stella Biderman
Anna Rogers
Loubna Ben allal
Francesco De Toni
Giada Pistilli
Olivier Nguyen
Somaieh Nikpoor
Maraim Masoud
Pierre Colombo
Javier de la Rosa
Paulo Villegas
Tristan Thrush
Shayne Longpre
Sebastian Nagel
Leon Weber
Manuel Romero Muñoz
Jian Zhu
Daniel Van Strien
Zaid Alyafeai
Khalid Almubarak
Vu Minh Chien
Itziar Gonzalez-Dios
Aitor Soroa
Kyle Lo
Manan Dey
Pedro Ortiz Suarez
Aaron Gokaslan
Shamik Bose
Long Phan
Hieu Tran
Ian Yu
Suhas Pai
Jenny Chim
Violette Lepercq
Suzana Ilic
Margaret Mitchell
Sasha Luccioni
Yacine Jernite
As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the foreground. This paper documents the data creation and curation efforts undertaken by BigScience to assemble the Responsible Open-science Open-collaboration Text Sources (ROOTS) corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual (BLOOM) language model. We further release a large initial subset of the corpus and analyses thereof, and hope to empower large-scale monolingual and multilingual modeling projects with both the data and the processing tools, as well as stimulate research around this large multilingual corpus.