
Koustuv Sinha

Alumni

Publications

A Simple and Effective Model for Multi-Hop Question Generation
Previous research on automated question generation has almost exclusively focused on generating factoid questions whose answers can be extracted from a single document. However, there is an increasing interest in developing systems that are capable of more complex multi-hop question generation (QG), where answering the question requires reasoning over multiple documents. In this work, we propose a simple and effective approach based on the transformer model for multi-hop QG. Our approach consists of specialized input representations, a supporting sentence classification objective, and training data weighting. Prior work on multi-hop QG considers the simplified setting of shorter documents and also advocates the use of entity-based graph structures as essential ingredients in model design. On the contrary, we showcase that our model can scale to the challenging setting of longer documents as input, does not rely on graph structures, and substantially outperforms the state-of-the-art approaches as measured by automated metrics and human evaluation.
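The abstract names three ingredients, one of which is an auxiliary supporting sentence classification objective trained jointly with generation. The paper's code is not reproduced here, so the following is a minimal sketch of how such a joint objective could look; the module layout, tensor shapes, per-sentence marker scheme, and the loss weight `aux_weight` are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch: a transformer trained jointly for question generation
# and supporting-sentence classification, in the spirit of the abstract.
# Shapes, module layout, and `aux_weight` are assumptions for illustration.

class MultiHopQG(nn.Module):
    def __init__(self, vocab_size, d_model=512, aux_weight=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)   # next-token logits
        self.support_head = nn.Linear(d_model, 2)       # supporting vs. not
        self.aux_weight = aux_weight

    def forward(self, src_ids, tgt_ids, sent_token_idx, sent_labels):
        memory = self.transformer.encoder(self.embed(src_ids))
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        dec = self.transformer.decoder(self.embed(tgt_ids), memory,
                                       tgt_mask=tgt_mask)

        # Generation loss: teacher-forced next-token prediction.
        gen_logits = self.lm_head(dec)
        gen_loss = nn.functional.cross_entropy(
            gen_logits[:, :-1].reshape(-1, gen_logits.size(-1)),
            tgt_ids[:, 1:].reshape(-1))

        # Auxiliary loss: classify each input sentence as supporting or not,
        # using the encoder state at an assumed per-sentence marker token.
        batch = torch.arange(src_ids.size(0)).unsqueeze(1)
        sent_states = memory[batch, sent_token_idx]     # (B, n_sents, d_model)
        sup_loss = nn.functional.cross_entropy(
            self.support_head(sent_states).reshape(-1, 2),
            sent_labels.reshape(-1))

        return gen_loss + self.aux_weight * sup_loss
```

Note that, consistent with the abstract's claim, nothing here requires an entity graph: the supporting-sentence supervision is applied directly to encoder states.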
Measuring Systematic Generalization in Neural Proof Generation with Transformers
We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluating on sequences longer than those seen in training. However, we observe that TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward-chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.
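The abstract mentions testing generated proofs for logical consistency, i.e., checking that every step actually follows from what is already known. As a toy illustration only, a checker over a triple-based proof format might look like the sketch below; the fact representation and the single composition rule are invented here and do not reproduce the paper's actual proof encoding.

```python
# Toy sketch of a logical-consistency check on a generated proof.
# Facts are (relation, head, tail) triples; the only rule assumed here
# is the composition rule rel(a, b) & rel(b, c) -> rel(a, c).

from typing import List, Set, Tuple

Fact = Tuple[str, str, str]          # (relation, head entity, tail entity)
Step = Tuple[Fact, Fact, Fact]       # (premise 1, premise 2, conclusion)

def consistent(facts: Set[Fact], proof: List[Step]) -> bool:
    """Check that every step derives its conclusion from known facts."""
    known = set(facts)
    for p1, p2, conclusion in proof:
        r1, a, b = p1
        r2, b2, c = p2
        if not (p1 in known and p2 in known
                and r1 == r2 and b == b2
                and conclusion == (r1, a, c)):
            return False             # step does not follow: inconsistent
        known.add(conclusion)        # forward-chain the derived fact
    return True

facts = {("ancestor", "Ann", "Bob"), ("ancestor", "Bob", "Carl")}
proof = [(("ancestor", "Ann", "Bob"),
          ("ancestor", "Bob", "Carl"),
          ("ancestor", "Ann", "Carl"))]
print(consistent(facts, proof))      # True: the single step is sound
```

A forward-chaining proof would list such steps from the facts toward the goal; a backward-chaining proof would start from the goal and decompose it into subgoals, which is the ordering the abstract reports TLMs generalize from better.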
A Hierarchical Neural Attention-based Text Classifier
Deep neural networks have been displaying superior performance over traditional supervised classifiers in text classification. They learn to extract useful features automatically when a sufficient amount of data is presented. However, along with the growth in the number of documents comes an increase in the number of categories, which often results in poor performance of multiclass classifiers. In this work, we use external knowledge in the form of topic category taxonomies to aid classification by introducing a deep hierarchical neural attention-based classifier. Our model performs better than or comparably to state-of-the-art hierarchical models at significantly lower computational cost while maintaining high interpretability.
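As a rough illustration of the idea of using a category taxonomy to split one large multiclass decision into smaller ones, a two-level sketch follows; the taxonomy depth, attention scheme, and module shapes are assumptions made for illustration and do not reproduce the paper's architecture.

```python
import torch
import torch.nn as nn

# Illustrative two-level hierarchical classifier with word-level attention.
# The taxonomy shape and modules are assumptions, not the paper's design.

class HierarchicalAttnClassifier(nn.Module):
    def __init__(self, vocab_size, n_parents, n_children_per_parent, d=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.attn = nn.Linear(d, 1)                  # word-level attention scores
        self.parent_head = nn.Linear(d, n_parents)   # coarse category logits
        # One small head per coarse category: the fine-grained decision is
        # made only within the predicted branch of the taxonomy.
        self.child_heads = nn.ModuleList(
            nn.Linear(d, n_children_per_parent) for _ in range(n_parents))

    def forward(self, token_ids):
        x = self.embed(token_ids)                    # (B, T, d)
        weights = torch.softmax(self.attn(x), dim=1) # attend over tokens
        doc = (weights * x).sum(dim=1)               # (B, d) document vector
        parent_logits = self.parent_head(doc)
        parents = parent_logits.argmax(dim=-1)
        child_logits = torch.stack(
            [self.child_heads[int(p)](doc[i]) for i, p in enumerate(parents)])
        return parent_logits, child_logits
```

Restricting the fine-grained decision to the children of the predicted coarse category keeps each classifier's label space small, which is one plausible source of the lower computational cost the abstract reports.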