Alessandro Sordoni

What Makes Machine Reading Comprehension Questions Difficult? Investigating Variation in Passage Sources and Question Types

Susan Bartlett

Grzegorz Kondrak

Max Bartolo

Alastair Roberts

Johannes Welbl

Steven Bird

Ewan Klein

Edward Loper

Samuel R. Bowman

George Dahl. 2021

Chao Pang

Junyuan Shang

Jiaxiang Liu

Xuyi Chen

Yanbin Zhao

Yuxiang Lu

Weixin Liu

Zhihua Wu

Weibao Gong … (see 21 more)

Jianzhong Liang

Zhizhou Shang

Peng Sun

Ouyang Xuan

Dianhai

Hao Tian

Hua Wu

Haifeng Wang

Adam Trischler

Tong Wang

Xingdi Yuan

Justin Har-

Philip Bachman

Adina Williams

Nikita Nangia

Zhilin Yang

Peng Qi

Saizheng Zhang

Yoshua Bengio

For a natural language understanding benchmark to be useful in research, it has to consist of examples that are diverse and difficult enough to discriminate among current and near-future state-of-the-art systems. However, we do not yet know how best to select passages to collect a variety of challenging examples. In this study, we crowdsource multiple-choice reading comprehension questions for passages taken from seven qualitatively distinct sources, analyzing what attributes of passages contribute to the difficulty and question types of the collected examples. To our surprise, we find that passage source, length, and readability measures do not significantly affect question difficulty. Through our manual annotation of seven reasoning types, we observe several trends between passage sources and reasoning types, e.g., logical reasoning is more often required in questions written for technical passages. These results suggest that when creating a new benchmark dataset, selecting a diverse set of passages can help ensure a diverse range of question types, but that passage difficulty need not be a priority.

Recursive Top-Down Production for Sentence Generation with Latent Trees

Shawn Tan

Yikang Shen

Timothy O'Donnell

2020-11-01

Findings of the Association for Computational Linguistics: EMNLP 2020 (published)

doi.org

Explicitly Modeling Syntax in Language Model improves Generalization

Yikang Shen

Shawn Tan

Siva Reddy

Syntax is fundamental to our thinking about language. Although neural networks are very successful in many tasks, they do not explicitly model syntactic structure. Failing to capture the structure of inputs could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with a one-step look-ahead parser and maintains the conditional probability setting of the standard language model. Experiments show that SOM can achieve strong results in language modeling and syntactic generalization tests, while using fewer parameters than other models.

2020-10-21

arXiv.org (preprint)

dblp.uni-trier.de

Ordered Memory

Yikang Shen

Shawn Tan

Seyedarian Hosseini

Zhouhan Lin

2019-10-29

ArXiv (preprint)

Ordered Memory

Yikang Shen

Shawn Tan

Seyedarian Hosseini

Zhouhan Lin

Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory. We also introduce a new Gated Recursive Cell to compose lower-level representations into higher-level representation. We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps (Nangia and Bowman, 2018) task. We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparably with the state-of-the-art methods in the literature.

2019-10-29

ArXiv (preprint)

Brief Report: Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Yikang Shen

Shawn Tan

An Empirical Study of Example Forgetting during Deep Neural Network Learning

Mariya Toneva*

Remi Tachet des Combes

Adam Trischler

Yoshua Bengio

Geoff Gordon

Inspired by the phenomenon of catastrophic forgetting, we investigate the learning dynamics of neural networks as they train on single classification tasks. Our goal is to understand whether a related phenomenon occurs when data does not undergo a clear distributional shift. We define a “forgetting event” to have occurred when an individual training example transitions from being classified correctly to incorrectly over the course of learning. Across several benchmark data sets, we find that: (i) certain examples are forgotten with high frequency, and some not at all; (ii) a data set’s (un)forgettable examples generalize across neural architectures; and (iii) based on forgetting dynamics, a significant fraction of examples can be omitted from the training data set while still maintaining state-of-the-art generalization performance.

2019-01-01

ICLR.cc/2019/Conference (poster)

openreview.net
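
The "forgetting event" defined in the example-forgetting abstract above can be illustrated with a small sketch. It assumes each training example comes with a per-check record of whether it was classified correctly; the function names, the dictionary layout, and the three-way grouping (including the separate "never_learned" bucket) are hypothetical choices made for illustration, not the authors' code.

```python
from typing import Dict, List, Sequence


def count_forgetting_events(correct_history: Sequence[bool]) -> int:
    """Count transitions from classified correctly (True) to incorrectly (False)."""
    return sum(
        1
        for prev, curr in zip(correct_history, correct_history[1:])
        if prev and not curr
    )


def split_by_forgetting(histories: Dict[str, Sequence[bool]]) -> Dict[str, List[str]]:
    """Group example ids by their forgetting behaviour.

    Assumption for this sketch: an example is "unforgettable" if it was
    classified correctly at least once and never forgotten afterwards.
    """
    groups: Dict[str, List[str]] = {
        "forgotten": [],
        "unforgettable": [],
        "never_learned": [],
    }
    for example_id, history in histories.items():
        if count_forgetting_events(history) > 0:
            groups["forgotten"].append(example_id)
        elif any(history):
            groups["unforgettable"].append(example_id)
        else:
            groups["never_learned"].append(example_id)
    return groups


if __name__ == "__main__":
    # Hypothetical correctness records, one entry per evaluation during training.
    demo = {
        "a": [False, True, False, True, True, False],
        "b": [False, False, True, True, True, True],
        "c": [False, False, False, False, False, False],
    }
    print(split_by_forgetting(demo))
```

Under these assumptions, examples in the "unforgettable" group would be the natural candidates to omit from the training set, in the spirit of finding (iii).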

Ordered Memory

Yikang Shen

Shawn Tan

Arian Hosseini

Zhouhan Lin

Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory. We also introduce a new Gated Recursive Cell to compose lower-level representations into higher-level representation. We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps (Nangia and Bowman, 2018) task. We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparably with the state-of-the-art methods in the literature.

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Yikang Shen

Shawn Tan

Natural language is hierarchically structured: smaller units (e.g., phrases) are nested within larger units (e.g., clauses). When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed. While the standard LSTM architecture allows different neurons to track information at different time scales, it does not have an explicit bias towards modeling a hierarchy of constituents. This paper proposes to add such an inductive bias by ordering the neurons; a vector of master input and forget gates ensures that when a given neuron is updated, all the neurons that follow it in the ordering are also updated. Our novel recurrent architecture, ordered neurons LSTM (ON-LSTM), achieves good performance on four different tasks: language modeling, unsupervised parsing, targeted syntactic evaluation, and logical inference.

2019-01-01

ICLR.cc/2019/Conference (oral)

openreview.net
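
The ordering constraint described in the ON-LSTM abstract above can be pictured with a small sketch. It assumes the master gates are built from a cumulative sum over a softmax (an activation usually called cumax), which the abstract itself does not spell out; the function names and the example logits are hypothetical, and the rest of the LSTM cell is omitted.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cumax(logits: List[float]) -> List[float]:
    """Cumulative sum of a softmax: values rise monotonically from near 0 to 1."""
    running = 0.0
    out = []
    for p in softmax(logits):
        running += p
        out.append(min(running, 1.0))  # guard against floating-point overshoot
    return out


def master_gates(
    forget_logits: List[float], input_logits: List[float]
) -> Tuple[List[float], List[float]]:
    """Return (master_forget, master_input) gate vectors, one value per neuron.

    Assumption for this sketch: the master forget gate is cumax(...) and the
    master input gate is 1 - cumax(...), so one rises and the other falls
    along the neuron ordering.
    """
    master_forget = cumax(forget_logits)
    master_input = [1.0 - g for g in cumax(input_logits)]
    return master_forget, master_input


if __name__ == "__main__":
    forget, inp = master_gates([0.1, 2.0, -1.0, 0.5], [1.0, 0.0, 3.0, -2.0])
    print("master forget gate:", [round(v, 3) for v in forget])
    print("master input gate: ", [round(v, 3) for v in inp])
```

Because each gate vector is monotone along the ordering, the neurons that get overwritten at a given step form a contiguous block at one end of the ordering, which is one way to read the claim that updating a neuron also updates the neurons that follow it.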

Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data

Amjad Almahairi

Sai Rajeswar

Philip Bachman

Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data. CycleGAN was recently proposed for this problem, but critically assumes the underlying inter-domain mapping is approximately deterministic and one-to-one. This assumption renders the model ineffective for tasks requiring flexible, many-to-many mappings. We propose a new model, called Augmented CycleGAN, which learns many-to-many mappings between domains. We examine Augmented CycleGAN qualitatively and quantitatively on several image datasets.

2018-07-03

Proceedings of the 35th International Conference on Machine Learning (published)

proceedings.mlr.press