Publications

Diversity-Enriched Option-Critic

Anand Kamat

Temporal abstraction allows reinforcement learning agents to represent knowledge and develop strategies over different temporal scales. The option-critic framework has been demonstrated to learn temporally extended actions, represented as options, end-to-end in a model-free setting. However, feasibility of option-critic remains limited due to two major challenges: multiple options adopting very similar behavior, or a shrinking set of task-relevant options. These occurrences not only void the need for temporal abstraction, they also affect performance. In this paper, we tackle these problems by learning a diverse set of options. We introduce an information-theoretic intrinsic reward, which augments the task reward, as well as a novel termination objective, in order to encourage behavioral diversity in the option set. We show empirically that our proposed method is capable of learning options end-to-end on several discrete and continuous control tasks and outperforms option-critic by a wide margin. Furthermore, we show that our approach sustainably generates robust, reusable, reliable and interpretable options, in contrast to option-critic.

2020-11-04

ArXiv (preprint)

arxiv.org

Effectiveness of quarantine and testing to prevent COVID-19 transmission from arriving travelers

Russell Wa

David Buckeridge

2020-11-04

medRxiv (preprint)

doi.org

Explainability and Interpretability: Keys to Deep Medicine

Arash Shaban-Nejad

Martin Michalowski

David Buckeridge

2020-11-03

Explainable AI in Healthcare and Medicine (published)

doi.org

A Study of Policy Gradient on a Class of Exactly Solvable Models

Gavin McCracken

Colin Daniels

Rosie Zhao

Anna M. Brandenberger

Prakash Panangaden

Doina Precup

Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return. In this paper, we explore the evolution of the policy parameters, for a special class of exactly solvable POMDPs, as a continuous-state Markov chain, whose transition probabilities are determined by the gradient of the distribution of the policy's value. Our approach relies heavily on random walk theory, specifically on affine Weyl groups. We construct a class of novel partially observable environments with controllable exploration difficulty, in which the value distribution, and hence the policy parameter evolution, can be derived analytically. Using these environments, we analyze the probabilistic convergence of policy gradient to different local maxima of the value function. To our knowledge, this is the first approach developed to analytically compute the landscape of policy gradient in POMDPs for a class of such environments, leading to interesting insights into the difficulty of this problem.

2020-11-03

ArXiv (preprint)

arxiv.org

Bisimulation metrics and norms for real-weighted automata

Borja Balle

Pascale Gourdeau

Prakash Panangaden

2020-11-01

Information and Computation (published)

doi.org

ComplexDataLab at W-NUT 2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents

Kellin Pelrine

Jacob Danovitch

Albert Orozco Camacho

Reihaneh Rabbany

Given the global scale of COVID-19 and the flood of social media content related to it, how can we find informative discussions? We present Gapformer, which effectively classifies content as informative or not. It reformulates the problem as graph classification, drawing on not only the tweet but also connected webpages and entities. We leverage a pre-trained language model as well as the connections between nodes to learn a pooled representation for each document network. We show it outperforms several competitive baselines and present ablation studies supporting the benefit of the linked information. Code is available on GitHub.

2020-11-01

WNUT (published)

doi.org

Deconstructing Word Embedding Algorithms

Kian Kenyon-Dean

Edward Daniel Newell

Jackie Cheung

2020-11-01

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Experience Grounds Language

Yonatan Bisk

Ari Holtzman

Jesse D. Thomason

Jacob Andreas

Yoshua Bengio

Joyce Yue Chai

Mirella Lapata

Angeliki Lazaridou

Jonathan May

Aleksandr Nisnevich

Nicolas Pinto

Joseph Turian

2020-11-01

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

Jonathan Pilault

Raymond Li

Sandeep Subramanian

Chris Pal

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. We provide extensive comparisons with strong baseline methods, prior state-of-the-art work as well as multiple variants of our approach including those using only transformers, only extractive techniques and combinations of the two. We examine these models using four different summarization tasks and datasets: arXiv papers, PubMed papers, the Newsroom and BigPatent datasets. We find that transformer-based methods produce summaries with fewer n-gram copies, leading to n-gram copying statistics that are more similar to human-generated abstracts. We include a human evaluation, finding that transformers are ranked highly for coherence and fluency, but purely extractive methods score higher for informativeness and relevance. We hope that these architectures and experiments may serve as strong points of comparison for future work. Note: The abstract above was collaboratively written by the authors and one of the models presented in this paper based on an earlier draft of this paper.

2020-11-01

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Factual Error Correction for Abstractive Summarization Models

Meng Cao

Yue Dong

Jiapeng Wu

Jackie Cheung

Neural abstractive summarization systems have achieved promising progress, thanks to the availability of large-scale datasets and models pre-trained with self-supervised methods. However, ensuring the factual consistency of the generated summaries for abstractive summarization systems is a challenge. We propose a post-editing corrector module to address this issue by identifying and correcting factual errors in generated summaries. The neural corrector model is pre-trained on artificial examples that are created by applying a series of heuristic transformations on reference summaries. These transformations are inspired by an error analysis of state-of-the-art summarization model outputs. Experimental results show that our model is able to correct factual errors in summaries generated by other neural summarization models and outperforms previous models on factual consistency evaluation on the CNN/DailyMail dataset. We also find that transferring from artificial error correction to downstream settings is still very challenging.

2020-11-01

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining

Zhi Wen

Xing Han Lu

Siva Reddy

2020-11-01

Proceedings of the 3rd Clinical Natural Language Processing Workshop (published)

doi.org

arxiv.org