Jackie Cheung

Ines Arous

Collaborating Alumni - McGill University

PhD - McGill University

Collaborating researcher

Nadhem Benhadjali

Collaborating researcher

Meng (Caden) Cao

Collaborating Alumni - McGill University

Aishik Chakraborty

PhD - McGill University

Khaoula Chehbouni

PhD - McGill University

Principal supervisor :

Master's Research - McGill University

Nelson Filipe Costa

Collaborating researcher - Concordia University

Maxime Darrin

PhD - McGill University

Co-supervisor :

PhD - McGill University

Aylin Erman

PhD - McGill University

Co-supervisor :

Dan Poenaru

Ori Ernst

Postdoctorate - McGill University

Master's Research - McGill University

Zichao Li

PhD - McGill University

Principal supervisor :

Siva Reddy

Caleb Moses

PhD - McGill University

Ian Porada

PhD - McGill University

PhD - McGill University

Sina Salmannia

Undergraduate - McGill University

Cesare Spinoso-Di Piano

PhD - McGill University

Sihui Wei

Undergraduate - McGill University

Xiyuan Zou

Master's Research - McGill University

Publications

Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text

Ian Porada

Kaheer Suleman

2019-11-01

Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing (published)

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

Matt Grenander

Yue Dong

Annie Priyadarshini Louis

Sentence position is a strong feature for news summarization, since the lead often (but not always) summarizes the key points of the article. In this paper, we show that recent neural systems excessively exploit this trend, which although powerful for many inputs, is also detrimental when summarizing documents where important content should be extracted from later parts of the article. We propose two techniques to make systems sensitive to the importance of content in different parts of the article. The first technique employs ‘unbiased’ data; i.e., randomly shuffled sentences of the source document, to pretrain the model. The second technique uses an auxiliary ROUGE-based loss that encourages the model to distribute importance scores throughout a document by mimicking sentence-level ROUGE scores on the training data. We show that these techniques significantly improve the performance of a competitive reinforcement learning based extractive system, with the auxiliary loss being more powerful than pretraining.

2019-11-01

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Paul Trichelair

Ali Emami

Adam Trischler

Kaheer Suleman

Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challenge (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty.

2019-11-01

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

Referring Expression Generation Using Entity Profiles

Meng Cao

2019-11-01

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

Contextualized Non-local Neural Networks for Sequence Learning

Pengfei Liu

Shuaichen Chang

Xuanjing Huang

Jian Tang

Recently, a large number of neural mechanisms and models have been proposed for sequence learning, of which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted much attention. In this paper, we propose an approach that combines and draws on the complementary strengths of these two methods. Specifically, we propose contextualized non-local neural networks (CN3), which can both dynamically construct a task-specific structure of a sentence and leverage rich local dependencies within a particular neighbourhood. Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labelling show that our proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Generating Character Descriptions for Automatic Summarization of Fiction

Weiwei Zhang

J. Oren

Summaries of fictional stories allow readers to quickly decide whether or not a story catches their interest. A major challenge in automatic summarization of fiction is the lack of standardized evaluation methodology or high-quality datasets for experimentation. In this work, we take a bottom-up approach to this problem by assuming that story authors are uniquely qualified to inform such decisions. We collect a dataset of one million fiction stories with accompanying author-written summaries from Wattpad, an online story sharing platform. We identify commonly occurring summary components, of which a description of the main characters is the most frequent, and elicit descriptions of main characters directly from the authors for a sample of the stories. We propose two approaches to generate character descriptions, one based on ranking attributes found in the story text, the other based on classifying into a list of pre-defined attributes. We find that the classification-based approach performs the best in predicting character descriptions.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Learning Multi-Task Communication with Message Passing for Sequence Learning

Pengfei Liu

Jie Fu

Yue Dong

Xipeng Qiu

We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks, and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

A Cross-Domain Transferable Neural Coherence Model

Peng Xu

H. Saghir

Jin Sung Kang

Teng Long

Joey Bose

Yanshuai Cao

Coherence is an important aspect of text quality and is crucial for ensuring its readability. One important limitation of existing coherence models is that training on one domain does not easily generalize to unseen categories of text. Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent sentence orderings to discriminate against during training is prohibitively large. In this work, we propose a local discriminative neural model with a much smaller negative sampling space that can efficiently learn against incorrect orderings. The proposed coherence model is simple in structure, yet it significantly outperforms previous state-of-art methods on a standard benchmark dataset on the Wall Street Journal corpus, as well as in multiple new challenging settings of transfer to unseen categories of discourse on Wikipedia articles.

2019-07-01

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (published)

EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing

Yue Dong

Zichao Li

Mehdi Rezagholizadeh

We present the first sentence simplification model that learns explicit edit operations (ADD, DELETE, and KEEP) via a neural programmer-interpreter approach. Most current neural sentence simplification systems are variants of sequence-to-sequence models adopted from machine translation. These methods learn to simplify sentences as a byproduct of the fact that they are trained on complex-simple sentence pairs. By contrast, our neural programmer-interpreter is directly trained to predict explicit edit operations on targeted parts of the input sentence, resembling the way that humans perform simplification and revision. Our model outperforms previous state-of-the-art neural sentence simplification models (without external knowledge) by large margins on three benchmark text simplification corpora in terms of SARI (+0.95 WikiLarge, +1.89 WikiSmall, +1.41 Newsela), and is judged by humans to produce overall better and simpler output sentences.

2019-07-01

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (published)

Understanding the Behaviour of Neural Abstractive Summarizers using Contrastive Examples

Krtin Kumar

Neural abstractive summarizers generate summary texts using a language model conditioned on the input source text, and have recently achieved high ROUGE scores on benchmark summarization datasets. We investigate how they achieve this performance with respect to human-written gold-standard abstracts, and whether the systems are able to understand deeper syntactic and semantic structures. We generate a set of contrastive summaries which are perturbed, deficient versions of human-written summaries, and test whether existing neural summarizers score them more highly than the human-written summaries. We analyze their performance on different datasets and find that these systems fail to understand the source text in a majority of the cases.

2019-06-01

North American Chapter of the Association for Computational Linguistics (published)

Unsupervised Controllable Text Generation with Global Variation Discovery and Disentanglement

Peng Xu

Yanshuai Cao

Existing controllable text generation systems rely on annotated attributes, which greatly limits their capabilities and applications. In this work, we make the first successful attempt to use VAEs to achieve controllable text generation without supervision. We do so by decomposing the latent space of the VAE into two parts: one incorporates structural constraints to capture dominant global variations implicitly present in the data, e.g., sentiment or topic; the other is unstructured and is used for the reconstruction of the source sentences. With the enforced structural constraint, the underlying global variations will be discovered and disentangled during the training of the VAE. The structural constraint also provides a natural recipe for mitigating posterior collapse for the structured part, which cannot be fully resolved by the existing techniques. On the task of text style transfer, our unsupervised approach achieves significantly better performance than previous supervised approaches. By showcasing generation with finer-grained control including Cards-Against-Humanity-style topic transitions within a sentence, we demonstrate that our model can perform controlled text generation in a more flexible way than existing methods.

2019-05-28

ArXiv (preprint)

What comes next? Extractive summarization by next-sentence prediction

Jingyun Liu

Annie Priyadarshini Louis

Existing approaches to automatic summarization assume that a length limit for the summary is given, and view content selection as an optimization problem to maximize informativeness and minimize redundancy within this budget. This framework ignores the fact that human-written summaries have rich internal structure which can be exploited to train a summarization system. We present NEXTSUM, a novel approach to summarization based on a model that predicts the next sentence to include in the summary using not only the source article, but also the summary produced so far. We show that such a model successfully captures summary-specific discourse moves, and leads to better content selection performance, in addition to automatically predicting how long the target summary should be. We perform experiments on the New York Times Annotated Corpus of summaries, where NEXTSUM outperforms lead and content-model summarization baselines by significant margins. We also show that the lengths of summaries produced by our system correlate with the lengths of the human-written gold standards.

2019-01-12

ArXiv (preprint)