Publications

On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

Raymond Li

Christopher Pal

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. Note: The abstract above was not written by the authors; it was generated by one of the models presented in this paper.

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Factual Error Correction for Abstractive Summarization Models

Meng Cao

Yue Dong

Jiapeng Wu

Jackie Chi Kit Cheung

Neural abstractive summarization systems have achieved promising progress, thanks to the availability of large-scale datasets and models pre-trained with self-supervised methods. However, ensuring the factual consistency of the generated summaries for abstractive summarization systems is a challenge. We propose a post-editing corrector module to address this issue by identifying and correcting factual errors in generated summaries. The neural corrector model is pre-trained on artificial examples that are created by applying a series of heuristic transformations on reference summaries. These transformations are inspired by an error analysis of state-of-the-art summarization model outputs. Experimental results show that our model is able to correct factual errors in summaries generated by other neural summarization models and outperforms previous models on factual consistency evaluation on the CNN/DailyMail dataset. We also find that transferring from artificial error correction to downstream settings is still very challenging.

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Inference and Prediction Diverge in Biomedicine

Danilo Bzdok

Denis Engemann

Bertrand Thirion

In the 20th century, many advances in biological knowledge and evidence-based medicine were supported by p values and accompanying methods. In the early 21st century, ambitions toward precision medicine place a premium on detailed predictions for single individuals. The shift causes tension between traditional regression methods used to infer statistically significant group differences and burgeoning predictive analysis tools suited to forecast an individual's future. Our comparison applies linear models for identifying significant contributing variables and for finding the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how variables identified as significantly relevant and variables identified as predictively relevant can agree or diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships, but not vice versa. More complete understanding of different ways to define “important” associations is a prerequisite for reproducible research and advances toward personalizing medical care.

2020-10-31

Patterns (published)

doi.org

Multi-Fact Correction in Abstractive Text Summarization

Yue Dong

Shuohang Wang

Zhe Gan

Yu Cheng

Jackie CK Cheung

Jingjing Liu

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles

Yao Lu

Yue Dong

Laurent Charlin

Multi-document summarization is a challenging task for which few large-scale datasets exist. We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Multi-XScience introduces a challenging multi-document summarization task: writing the related-work section of a paper based on its abstract and the articles it references. Our work is inspired by extreme summarization, a dataset construction protocol that favours abstractive modeling approaches. Descriptive statistics and empirical results, using several state-of-the-art models trained on the Multi-XScience dataset, reveal that Multi-XScience is well suited for abstractive models.

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

Recursive Top-Down Production for Sentence Generation with Latent Trees

Timothy J. O'Donnell

We model the recursive production property of context-free grammars for natural and synthetic languages. To this end, we present a dynamic programming algorithm that marginalises over latent binary tree structures with

2020-10-31

Findings of the Association for Computational Linguistics: EMNLP 2020 (published)

doi.org

arxiv.org

Supervised Seeded Iterated Learning for Interactive Language Learning

Yuchen Lu

Soumye Singhal

Florian Strub

Olivier Pietquin

Aaron Courville

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion

Jiapeng Wu

Meng Cao

Jackie Chi Kit Cheung

William L. Hamilton

Inferring missing facts in temporal knowledge graphs (TKGs) is a fundamental and challenging task. Previous works have approached this problem by augmenting methods for static knowledge graphs to leverage time-dependent representations. However, these methods do not explicitly leverage multi-hop structural information and temporal facts from recent time steps to enhance their predictions. Additionally, prior work does not explicitly address the temporal sparsity and variability of entity distributions in TKGs. We propose the Temporal Message Passing (TeMP) framework to address these challenges by combining graph neural networks, temporal dynamics models, data imputation and frequency-based gating techniques. Experiments on standard TKG tasks show that our approach provides substantial gains compared to the previous state of the art, achieving a 10.7% average relative improvement in Hits@10 across three standard benchmarks. Our analysis also reveals important sources of variability both within and across TKG datasets, and we introduce several simple but strong baselines that outperform the prior state of the art in certain settings.

2020-10-31

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)

doi.org

arxiv.org

TESA: A Task in Entity Semantic Aggregation for Abstractive Summarization

Clement Jumel

Annie Priyadarshini Louis

Jackie CK Cheung

Human-written texts contain frequent generalizations and semantic aggregation of content. In a document, they may refer to a pair of named entities such as ‘London’ and ‘Paris’ with different expressions: “the major cities”, “the capital cities” and “two European cities”. Yet generation systems, especially abstractive summarization systems, have so far focused heavily on paraphrasing and simplifying the source content, to the exclusion of such semantic abstraction capabilities. In this paper, we present a new dataset and task aimed at the semantic aggregation of entities. TESA contains a dataset of 5.3K crowd-sourced entity aggregations of Person, Organization, and Location named entities. The aggregations are document-appropriate, meaning that they are produced by annotators to match the situational context of a given news article from the New York Times. We then build baseline models for generating aggregations given a tuple of entities and document context. We finetune an encoder-decoder language model on TESA and compare it with simpler classification methods based on linguistically informed features. Our quantitative and qualitative evaluations show reasonable performance in making a choice from a given list of expressions, but free-form expressions are understandably harder to generate and evaluate.

2020-10-31

Conference on Empirical Methods in Natural Language Processing (published)

doi.org

Neuroimaging: into the Multiverse

Jessica Dafflon

Pedro F. da Costa

František Váša

Ricardo Pio Monti

Danilo Bzdok

Peter J. Hellyer

Federico Turkheimer

Jonathan Smallwood

Emily J. H. Jones

Robert Leech

For most neuroimaging questions the huge range of possible analytic choices leads to the possibility that conclusions from any single analytic approach may be misleading. Examples of possible choices include the motion regression approach used and smoothing and threshold factors applied during the processing pipeline. Although it is possible to perform a multiverse analysis that evaluates all possible analytic choices, this can be computationally challenging and repeated sequential analyses on the same data can compromise inferential and predictive power. Here, we establish how active learning on a low-dimensional space that captures the inter-relationships between analysis approaches can be used to efficiently approximate the whole multiverse of analyses. This approach balances the benefits of a multiverse analysis without the accompanying cost to statistical power, computational power and the integrity of inferences. We illustrate this approach with a functional MRI dataset of functional connectivity across adolescence, demonstrating how a multiverse of graph theoretic and simple pre-processing steps can be efficiently navigated using active learning. Our study shows how this approach can identify the subset of analysis techniques (i.e., pipelines) which are best able to predict participants’ ages, as well as allowing the performance of different approaches to be quantified.

2020-10-28

bioRxiv (preprint)

doi.org

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach

Iulian Vlad Serban

Chinnadhurai Sankar

Michael Pieper

Joelle Pineau

Yoshua Bengio

Deep reinforcement learning has recently shown many impressive successes. However, one major obstacle towards applying such methods to real-world problems is their lack of data-efficiency. To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples. The learned transition model employs an abstract, discrete (bottleneck) state, which increases sample efficiency by reducing the number of model parameters and by exploiting structural properties of the environment. We provide a mathematical analysis of the Bottleneck Simulator in terms of fixed points of the learned policy, which reveals how performance is affected by four distinct sources of error: an error related to the abstract space structure, an error related to the transition model estimation variance, an error related to the transition model estimation bias, and an error related to the transition model class bias. Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task. On both tasks, the Bottleneck Simulator yields excellent performance, beating competing approaches.

2020-10-26

Journal of Artificial Intelligence Research (published)

doi.org

arxiv.org

Association between extreme precipitation, drinking water and acute gastrointestinal illness in the Great Lakes

R. Graydon

M. Mezzacapo

J. Boehme

David L Buckeridge

S. Foldy

T. Edge

J. Brubacher

L. Chan

M. Dellinger

E. Faustman

J. Rose

T. Takaro

2020-10-25

ISEE Conference Abstracts (published)

doi.org
