Publications

Summarizing Societies: Agent Abstraction in Multi-Agent Reinforcement Learning

Matthew D Riemer

Agents cannot make sense of many-agent societies through direct consideration of small-scale, low-level agent identities, but instead must recognize emergent collective identities. Here, we take a first step towards a framework for recognizing this structure in large groups of low-level agents so that they can be modeled as a much smaller number of high-level agents, a process that we call agent abstraction. We illustrate this process by extending bisimulation metrics for state abstraction in reinforcement learning to the setting of multi-agent reinforcement learning and analyze a straightforward, if crude, abstraction based on experienced joint actions. It addresses non-stationarity due to other learning agents by improving minimax regret by an intuitive factor. To test if this compression factor provides signal for higher-level agency, we applied it to a large dataset of human play of the popular social dilemma game Diplomacy. We find that it correlates strongly with the degree of ground-truth abstraction of low-level units into the human players.

2021-12-31

(published)

openreview.net

A Synchro-Set-Aided Breadth-First Sphere Decoder for Polar-Coded MIMO Systems

Huayi Zhou

Xiangyun Deng

Yiqian Cai

Yifei Shen

Minhua Yang

Warren J. Gross

Xiaohu You

Chuan Zhang

The joint optimization of multiple-input-multiple-output (MIMO) detection and polar decoding has become a research hotspot for future communication systems. The error-correction performance of the separate detection and decoding (SDD) is far from the Shannon capacity, which cannot meet the requirements of communication scenarios such as ultra-reliable and low-latency communications (URLLC). The existing joint detection and decoding (JDD) using breadth-first sphere decoding (BFSD) improves the reliability over SDD but still has a huge performance loss on low-rate codes. In this paper, JDD using synchro-set-aided BFSD (SA-BFSD) is proposed to greatly improve the error-correction performance for polar-coded MIMO systems. We first propose a method to generate the symbol synchro sets through the concept of frozen symbols, then refine the symbol synchro sets based on the characteristics analysis of the channel matrix. We optimize the enumerating order of the symbols and reduce the enumerating levels. The frame error rate (FER) and the bit error rate of the proposed algorithms are significantly improved, especially for the low-rate codes. The proposed SA-BFSD JDD achieves an up to 7.8 dB performance gain over BFSD at FER

2021-12-31

IEEE Transactions on Signal Processing (published)

doi.org

TACTiS: Transformer-Attentional Copulas for Time Series

Alexandre Drouin

Étienne Marcotte

Nicolas Chapados

The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile method, based on the transformer architecture, that estimates joint distributions using an attention-based decoder that provably learns to mimic the properties of non-parametric copulas. The resulting model has several desirable properties: it can scale to hundreds of time series, supports both forecasting and interpolation, can handle unaligned and non-uniformly sampled data, and can seamlessly adapt to missing data during training. We demonstrate these properties empirically and show that our model produces state-of-the-art predictions on multiple real-world datasets.

2021-12-31

ICML (published)

proceedings.mlr.press

TaHiD: Tackling Data Hiding in Fake News Detection with News Propagation Networks

Adrien Benamira

Benjamin Devillers

Etienne Lesot

Ayush K. Ray

Manal Saadi

Fragkiskos D

Steven Bird

Ewan Klein

Edward Loper

Carlos Castillo

Marcelo Mendoza

Barbara Poblete

Daryna Dementieva

Alexander Panchenko

Jacob Devlin

Ming-Wei Chang

Kenton Lee

Ashish Vaswani

Noam M. Shazeer

Niki Parmar

Adriana Romero

Pietro Lio’

Yoshua Bengio

Yaqing Wang

Fenglong Ma

Zhiwei Jin

Ye Yuan

Fake news with detrimental societal effects has attracted extensive attention and research. Despite early success, the state-of-the-art methods fall short of considering the propagation of news. News propagates at different times through different mediums, including users, comments, and sources, which form the news propagation network. Moreover, the serious problem of data hiding arises, which means that fake news publishers disguise fake news as real to confuse users by deleting comments that refute the rumor or deleting the news itself when it has been spread widely. Existing methods do not consider the propagation of news and fail to identify what matters in the process, which leads to fake news hiding in the propagation network and escaping from detection. Inspired by the propagation of news, we propose a novel fake news detection framework named TaHiD, which models the propagation as a heterogeneous dynamic graph and contains the propagation attention module to measure the influence of different propagation. Experiments demonstrate that TaHiD extracts useful information from the news propagation network and outperforms state-of-the-art methods on several benchmark datasets for fake news detection. Additional studies also show that TaHiD is capable of identifying fake news in the case of data hiding.

2021-12-31

(published)

www.semanticscholar.org

Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline

Massimo Caccia

Jonas Mueller

Taesup Kim

Laurent Charlin

Rasool Fakoor

We study task-agnostic continual reinforcement learning (TACRL) in which standard RL challenges are compounded with partial observability stemming from task agnosticism, as well as additional difficulties of continual learning (CL), i.e., learning on a non-stationary sequence of tasks. Here we compare TACRL methods with their soft upper bounds prescribed by previous literature: multi-task learning (MTL) methods which do not have to deal with non-stationary data distributions, as well as task-aware methods, which are allowed to operate under full observability. We consider a previously unexplored and straightforward baseline for TACRL, replay-based recurrent RL (3RL), in which we augment an RL algorithm with recurrent mechanisms to address partial observability and experience replay mechanisms to address catastrophic forgetting in CL. Studying empirical performance in a sequence of RL tasks, we find surprising occurrences of 3RL matching and overcoming the MTL and task-aware soft upper bounds. We lay out hypotheses that could explain this inflection point of continual and task-agnostic learning research. Our hypotheses are empirically tested in continuous control tasks via a large-scale study of the popular multi-task and continual learning benchmark Meta-World. By analyzing different training statistics including gradient conflict, we find evidence that 3RL’s outperformance stems from its ability to quickly infer how new tasks relate with the previous ones, enabling forward transfer.

2021-12-31

arXiv.org (preprint)

doi.org

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

Aniket Didolkar

Kshitij Gupta

Anirudh Goyal

Nitesh B. Gundavarapu

Alex Lamb

Nan Rosemary Ke

Yoshua Bengio

Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias towards learning temporally compressed representations, as they allow for attention over all previously computed elements in a sequence. Having a more compressed representation of a sequence may be beneficial for generalization, as a high-level representation may be more easily re-used and re-purposed and will contain fewer irrelevant details. At the same time, excessive compression of representations comes at the cost of expressiveness. We propose a solution which divides computation into two streams. A slow stream that is recurrent in nature aims to learn a specialized and compressed representation, by forcing chunks of

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

On the benefits of representation regularization in invariance based domain generalization

Changjian Shui

Boyu Wang

Christian Gagné

A crucial aspect of reliable machine learning is to design a deployable system for generalizing to new related but unobserved environments. Domain generalization aims to alleviate such a prediction gap between the observed and unseen environments. Previous approaches commonly incorporated learning the invariant representation for achieving good empirical performance. In this paper, we reveal that merely learning the invariant representation is vulnerable to the related unseen environment. To this end, we derive a novel theoretical analysis to control the unseen test environment error in the representation learning, which highlights the importance of controlling the smoothness of representation. In practice, our analysis further inspires an efficient regularization method to improve the robustness in domain generalization. The proposed regularization is orthogonal to and can be straightforwardly adopted in existing domain generalization algorithms that ensure invariant representation learning. Empirical results show that our algorithm outperforms the base versions in various datasets and invariance criteria.

2021-12-31

Machine Learning (published)

doi.org

arxiv.org

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Hugo Laurençon

Lucile Saulnier

Thomas Wang

Christopher Akiki

Albert Villanova del Moral

Teven Le Scao

Leandro Von Werra

Chenghao Mou

Eduardo González Ponferrada

Huu Nguyen

Jörg Frohberg

Mario Šaško

Quentin Lhoest

Angelina McMillan-Major

Gérard Dupont

Stella Biderman

Anna Rogers

Loubna Ben allal

Francesco De Toni

Giada Pistilli

Olivier Nguyen

Somaieh Nikpoor

Maraim Masoud

Pierre Colombo

Javier de la Rosa

Paulo Villegas

Tristan Thrush

Shayne Longpre

Sebastian Nagel

Leon Weber

Manuel Romero Muñoz

Jian Zhu

Daniel Van Strien

Zaid Alyafeai

Khalid Almubarak

Vu Minh Chien

Itziar Gonzalez-Dios

Aitor Soroa

Kyle Lo

Manan Dey

Pedro Ortiz Suarez

Aaron Gokaslan

Shamik Bose

David Ifeoluwa Adelani

Long Phan

Hieu Tran

Ian Yu

Suhas Pai

Jenny Chim

Violette Lepercq

Suzana Ilic

Margaret Mitchell

Sasha Luccioni

Yacine Jernite

As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the foreground. This paper documents the data creation and curation efforts undertaken by BigScience to assemble the Responsible Open-science Open-collaboration Text Sources (ROOTS) corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual (BLOOM) language model. We further release a large initial subset of the corpus and analyses thereof, and hope to empower large-scale monolingual and multilingual modeling projects with both the data and the processing tools, as well as stimulate research around this large multilingual corpus.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging

Chris Junchi Li

Yaodong Yu

Nicolas Loizou

Gauthier Gidel

Yitong Ma

Nicolas Roux

Michael I. Jordan

We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence. In sharp contrast with the basic SEG method whose last iterate only contracts to a fixed neighborhood of the Nash equilibrium, SEG augmented with iteration averaging provably converges to the Nash equilibrium under the same standard settings, and such a rate is further improved by incorporating a scheduled restarting procedure. In the interpolation setting where noise vanishes at the Nash equilibrium, we achieve an optimal convergence rate up to tight constants. We present numerical experiments that validate our theoretical findings and demonstrate the effectiveness of the SEG method when equipped with iteration averaging and restarting.

2021-12-31

AISTATS (published)

arxiv.org

The Curious Case of Absolute Position Embeddings

Koustuv Sinha

Amirhossein Kazemnejad

Siva Reddy

Joelle Pineau

Dieuwke Hupkes

Adina Williams

Transformer language models encode the notion of word order using positional information. Most commonly, this positional information is represented by absolute position embeddings (APEs) that are learned from the pretraining data. However, in natural language, it is not absolute position that matters, but relative position, and the extent to which APEs can capture this type of information has not been investigated. In this work, we observe that models trained with APE over-rely on positional information to the point that they break down when subjected to sentences with shifted position information. Specifically, when models are subjected to sentences starting from a non-zero position (excluding the effect of priming), they exhibit noticeably degraded performance on zero to full-shot tasks, across a range of model families and model sizes. Our findings raise questions about the efficacy of APEs to model the relativity of position information, and invite further introspection on the sentence and word order processing strategies employed by these models.

2021-12-31

EMNLP (Findings) (published)

doi.org

arxiv.org

On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?

Nouha Dziri

Sivan Milton

Mo Yu

Osmar Zaiane

Siva Reddy

Knowledge-grounded conversational models are known to suffer from producing factually invalid statements, a phenomenon commonly called hallucination. In this work, we investigate the underlying causes of this phenomenon: is hallucination due to the training data, or to the models? We conduct a comprehensive human study on both existing knowledge-grounded conversational benchmarks and several state-of-the-art models. Our study reveals that the standard benchmarks consist of >60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations. Our findings raise important questions on the quality of existing datasets and models trained using them. We make our annotations publicly available for future research.

2021-12-31

arXiv (preprint)

doi.org

arxiv.org

The Role of Robotics in Achieving the United Nations Sustainable Development Goals - The Experts' Meeting at the 2021 IEEE/RSJ IROS Workshop [Industry Activities].

Vincent Mai

Bram Vanderborght

Tamás Haidegger

Alaa M. Khamis

Niraj Bhargava

Dominik B. O. Boesl

Katleen Gabriels

An Jacobs

AJung Moon

Robin R. Murphy

Yasushi Nakauchi

Edson Prestes

Bram Vanderborght

Ricardo Vinuesa

Carl-Maria Mörch

2021-12-31

IEEE Robotics Autom. Mag. (published)

doi.org
