Jackie Cheung

Nadhem Benhadjali

Collaborating researcher

Meng (Caden) Cao

PhD - McGill University

Aishik Chakraborty

PhD - McGill University

Khaoula Chehbouni

PhD - McGill University

Principal supervisor :

Master's Research - McGill University

Maxime Darrin

PhD - McGill University

Co-supervisor :

PhD - McGill University

Aylin Erman

PhD - McGill University

Co-supervisor :

Dan Poenaru

Ori Ernst

Postdoctorate - McGill University

Master's Research - McGill University

Jules Gagnon-Marchand

Master's Research - McGill University

Sienna Hsu

Research Intern - McGill University

Zichao Li

PhD - McGill University

Principal supervisor :

Siva Reddy

Caleb Moses

PhD - McGill University

Ian Porada

PhD - McGill University

PhD - McGill University

Sina Salmannia

Undergraduate - McGill University

Cesare Spinoso-Di Piano

PhD - McGill University

Sihui Wei

Undergraduate - McGill University

Xiyuan Zou

Master's Research - McGill University

Publications

BanditSum: Extractive Summarization as a Contextual Bandit

Yue Dong

Yikang Shen

Eric Crawford

Herke van Hoof

In this work, we propose a novel method for training neural networks to perform single-document extractive summarization without heuristically-generated extractive labels. We call our approach BanditSum as it treats extractive summarization as a contextual bandit (CB) problem, where the model receives a document to summarize (the context), and chooses a sequence of sentences to include in the summary (the action). A policy gradient reinforcement learning algorithm is used to train the model to select sequences of sentences that maximize ROUGE score. We perform a series of experiments demonstrating that BanditSum is able to achieve ROUGE scores that are better than or comparable to the state-of-the-art for extractive summarization, and converges using significantly fewer update steps than competing approaches. In addition, we show empirically that BanditSum performs significantly better than competing approaches when good summary sentences appear late in the source document.

2018-10-01

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (published)

A Knowledge Hunting Framework for Common Sense Reasoning

Ali Emami

Noelia De La Cruz

Adam Trischler

Kaheer Suleman

We introduce an automatic system that achieves state-of-the-art results on the Winograd Schema Challenge (WSC), a common sense reasoning task that requires diverse, complex forms of inference and knowledge. Our method uses a knowledge hunting module to gather text from the web, which serves as evidence for candidate problem resolutions. Given an input problem, our system generates relevant queries to send to a search engine, then extracts and classifies knowledge from the returned results and weighs them to make a resolution. Our approach improves F1 performance on the full WSC by 0.21 over the previous best and represents the first system to exceed 0.5 F1. We further demonstrate that the approach is competitive on the Choice of Plausible Alternatives (COPA) task, which suggests that it is generally applicable.

2018-10-01

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (published)

Let’s do it “again”: A First Computational Approach to Detecting Adverbial Presupposition Triggers

Andre Cianflone

Yulan Feng

Jad Kabbara

We introduce the novel task of predicting adverbial presupposition triggers, which is useful for natural language generation tasks such as summarization and dialogue systems. We introduce two new corpora, derived from the Penn Treebank and the Annotated English Gigaword dataset, and investigate the use of a novel attention mechanism tailored to this task. Our attention mechanism augments a baseline recurrent neural network without the need for additional trainable parameters, minimizing the added computational cost of our mechanism. We demonstrate that this model statistically outperforms our baselines.

2018-07-01

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (published)

Coarse Lexical Frame Acquisition at the Syntax–Semantics Interface Using a Latent-Variable PCFG Model

Laura Kallmeyer

Behrang QasemiZadeh

We present a method for unsupervised lexical frame acquisition at the syntax–semantics interface. Given a set of input strings derived from dependency parses, our method generates a set of clusters that resemble lexical frame structures. Our work is motivated not only by its practical applications (e.g., to build, or expand the coverage of lexical frame databases), but also to gain linguistic insight into frame structures with respect to lexical distributions in relation to grammatical structures. We model our task using a hierarchical Bayesian network and employ tools and methods from latent variable probabilistic context free grammars (L-PCFGs) for statistical inference and parameter fitting, for which we propose a new split and merge procedure. We show that our model outperforms several baselines on a portion of the Wall Street Journal sentences that we have newly annotated for evaluation purposes.

2018-06-01

Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics (published)

Commonsense mining as knowledge base completion? A study on the impact of novelty

Stanisław Jastrzębski

Dzmitry Bahdanau

Seyedarian Hosseini

Michael Noukhovitch

Yoshua Bengio

Commonsense knowledge bases such as ConceptNet represent knowledge in the form of relational triples. Inspired by recent work by Li et al., we analyse if knowledge base completion models can be used to mine commonsense knowledge from raw text. We propose novelty of predicted triples with respect to the training set as an important factor in interpreting results. We critically analyse the difficulty of mining novel commonsense knowledge, and show that a simple baseline method outperforms the previous state of the art on predicting more novel triples.

2018-06-01

Proceedings of the Workshop on Generalization in the Age of Deep Learning (published)

A Generalized Knowledge Hunting Framework for the Winograd Schema Challenge

Ali Emami

Adam Trischler

Kaheer Suleman

We introduce an automatic system that performs well on two common-sense reasoning tasks, the Winograd Schema Challenge (WSC) and the Choice of Plausible Alternatives (COPA). Problem instances from these tasks require diverse, complex forms of inference and knowledge to solve. Our method uses a knowledge-hunting module to gather text from the web, which serves as evidence for candidate problem resolutions. Given an input problem, our system generates relevant queries to send to a search engine. It extracts and classifies knowledge from the returned results and weighs it to make a resolution. Our approach improves F1 performance on the WSC by 0.16 over the previous best and is competitive with the state-of-the-art on COPA, demonstrating its general applicability.

2018-06-01

North American Chapter of the Association for Computational Linguistics (published)

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization

Kian Kenyon-Dean

Doina Precup

We present an approach to event coreference resolution by developing a general framework for clustering that uses supervised representation learning. We propose a neural network architecture with novel Clustering-Oriented Regularization (CORE) terms in the objective function. These terms encourage the model to create embeddings of event mentions that are amenable to clustering. We then use agglomerative clustering on these embeddings to build event coreference chains. For both within- and cross-document coreference on the ECB+ corpus, our model obtains better results than models that require significantly more pre-annotated information. This work provides insight and motivating results for a new general approach to solving coreference and clustering problems with representation learning.

2018-06-01

Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics (published)

Advances in Artificial Intelligence

Ebrahim Bagheri

2018-01-01

Lecture Notes in Computer Science (published)

A Hierarchical Neural Attention-based Text Classifier

Koustuv Sinha

Yue Dong

Derek Ruths

Deep neural networks have been displaying superior performance over traditional supervised classifiers in text classification. They learn to extract useful features automatically when a sufficient amount of data is presented. However, along with the growth in the number of documents comes the increase in the number of categories, which often results in poor performance of the multiclass classifiers. In this work, we use external knowledge in the form of topic category taxonomies to aid the classification by introducing a deep hierarchical neural attention-based classifier. Our model performs better than or comparable to state-of-the-art hierarchical models at significantly lower computational cost while maintaining high interpretability.

2018-01-01

Conference on Empirical Methods in Natural Language Processing (published)

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions

Teng Long

Emmanuel Bengio

Ryan Lowe

Doina Precup

Humans interpret texts with respect to some background information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. In this paper, we introduce a task and several models to drive progress towards this goal. In particular, we propose the task of rare entity prediction: given a web document with several entities removed, models are tasked with predicting the correct missing entities conditioned on the document context and the lexical resources. This task is challenging due to the diversity of language styles and the extremely large number of rare entities. We propose two recurrent neural network architectures which make use of external knowledge in the form of entity descriptions. Our experiments show that our hierarchical LSTM model performs significantly better at the rare entity prediction task than those that do not make use of external resources.

2017-09-01

Conference on Empirical Methods in Natural Language Processing (published)

Predicting Success in Goal-Driven Human-Human Dialogues

Michael Noseworthy

Joelle Pineau

In goal-driven dialogue systems, success is often defined based on a structured definition of the goal. This requires that the dialogue system be constrained to handle a specific class of goals and that there be a mechanism to measure success with respect to that goal. However, in many human-human dialogues the diversity of goals makes it infeasible to define success in such a way. To address this scenario, we consider the task of automatically predicting success in goal-driven human-human dialogues using only the information communicated between participants in the form of text. We build a dataset from stackoverflow.com which consists of exchanges between two users in the technical domain where ground-truth success labels are available. We then propose a turn-based hierarchical neural network model that can be used to predict success without requiring a structured goal definition. We show this model outperforms rule-based heuristics and other baselines as it is able to detect patterns over the course of a dialogue and capture notions such as gratitude.

2017-08-01

SIGDIAL Conference (published)