Explainability and Interpretability: Keys to Deep Medicine
Arash Shaban-Nejad
Martin Michalowski
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return. In this paper, we explore the e… (see more)volution of the policy parameters, for a special class of exactly solvable POMDPs, as a continuous-state Markov chain, whose transition probabilities are determined by the gradient of the distribution of the policy's value. Our approach relies heavily on random walk theory, specifically on affine Weyl groups. We construct a class of novel partially observable environments with controllable exploration difficulty, in which the value distribution, and hence the policy parameter evolution, can be derived analytically. Using these environments, we analyze the probabilistic convergence of policy gradient to different local maxima of the value function. To our knowledge, this is the first approach developed to analytically compute the landscape of policy gradient in POMDPs for a class of such environments, leading to interesting insights into the difficulty of this problem.
Urban Night Scenery Reconstruction by Day-night Registration and Synthesis
Andi Dai
Bisimulation metrics and norms for real-weighted automata
Borja Balle
Pascale Gourdeau
ComplexDataLab at W-NUT 2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents
Kellin Pelrine
Jacob Danovitch
Albert Orozco Camacho
Given the global scale of COVID-19 and the flood of social media content related to it, how can we find informative discussions? We present … (see more)Gapformer, which effectively classifies content as informative or not. It reformulates the problem as graph classification, drawing on not only the tweet but connected webpages and entities. We leverage a pre-trained language model as well as the connections between nodes to learn a pooled representation for each document network. We show it outperforms several competitive baselines and present ablation studies supporting the benefit of the linked information. Code is available on Github.
Deconstructing Word Embedding Algorithms
Kian Kenyon-Dean
Edward Daniel Newell
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse D. Thomason
Jacob Andreas
Joyce Yue Chai
Mirella Lapata
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph Turian
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
Jonathan Pilault
Raymond Li
Sandeep Subramanian
We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarizati… (see more)on. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. We provide extensive comparisons with strong baseline methods, prior state of the art work as well as multiple variants of our approach including those using only transformers, only extractive techniques and combinations of the two. We examine these models using four different summarization tasks and datasets: arXiv papers, PubMed papers, the Newsroom and BigPatent datasets. We find that transformer based methods produce summaries with fewer n-gram copies, leading to n-gram copying statistics that are more similar to human generated abstracts. We include a human evaluation, finding that transformers are ranked highly for coherence and fluency, but purely extractive methods score higher for informativeness and relevance. We hope that these architectures and experiments may serve as strong points of comparison for future work. Note: The abstract above was collaboratively written by the authors and one of the models presented in this paper based on an earlier draft of this paper.
Multi-Fact Correction in Abstractive Text Summarization
Yue Dong
Shuohang Wang
Zhe Gan
Yu Cheng
Jingjing Liu
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles
Yao Lu
Yue Dong
Multi-document summarization is a challenging task for which there exists little large-scale datasets. We propose Multi-XScience, a large-sc… (see more)ale multi-document summarization dataset created from scientific articles. Multi-XScience introduces a challenging multi-document summarization task: writing the related-work section of a paper based on its abstract and the articles it references. Our work is inspired by extreme summarization, a dataset construction protocol that favours abstractive modeling approaches. Descriptive statistics and empirical results—using several state-of-the-art models trained on the Multi-XScience dataset—reveal that Multi-XScience is well suited for abstractive models.
Recursive Top-Down Production for Sentence Generation with Latent Trees
Supervised Seeded Iterated Learning for Interactive Language Learning
Yuchen Lu
Soumye Singhal
Florian Strub
Olivier Pietquin