Kushal Arora

The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation

Kushal Arora

Timothy O'Donnell

Doina Precup

Jason Aaron Edward Weston

Jackie C.K. Cheung

State-of-the-art language generation models can degenerate when applied to open-ended generation problems such as text completion, story generation, or dialog modeling. This degeneration usually shows up in the form of incoherence, lack of vocabulary diversity, and self-repetition or copying from the context. In this paper, we postulate that "human-like" generations usually lie in a narrow and nearly flat entropy band, and violation of these entropy bounds correlates with degenerate behavior. Our experiments show that this stable narrow entropy zone exists across models, tasks, and domains and confirm the hypothesis that violations of this zone correlate with degeneration. We then use this insight to propose an entropy-aware decoding algorithm that respects these entropy bounds, resulting in less degenerate, more contextual, and "human-like" language generation in open-ended text generation settings.

2023-02-14

arXiv (preprint)

doi.org

arxiv.org

Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation

Kushal Arora

Layla El Asri

Hareesh Bahuleyan

Jackie Cheung

Current language generation models suffer from issues such as repetition, incoherence, and hallucinations. An often-repeated hypothesis for this brittleness of generation models is that it is caused by a mismatch between the training and generation procedures, also referred to as exposure bias. In this paper, we verify this hypothesis by analyzing exposure bias from an imitation learning perspective. We show that exposure bias leads to an accumulation of errors during generation, analyze why perplexity fails to capture this accumulation of errors, and empirically show that this accumulation results in poor generation quality.

2022-04-03

arXiv (preprint)

doi.org

arxiv.org

Learning Lexical Subspaces in a Distributional Vector Space

Kushal Arora

Aishik Chakraborty

Jackie Cheung

In this paper, we propose LexSub, a novel approach towards unifying lexical and distributional semantics. We inject knowledge about lexical-semantic relations into distributional word embeddings by defining subspaces of the distributional vector space in which a lexical relation should hold. Our framework can handle symmetric attract and repel relations (e.g., synonymy and antonymy, respectively), as well as asymmetric relations (e.g., hypernymy and meronymy). In a suite of intrinsic benchmarks, we show that our model outperforms previous approaches on relatedness tasks and on hypernymy classification and detection, while being competitive on word similarity tasks. It also outperforms previous systems on extrinsic classification tasks that benefit from exploiting lexical relational cues. We perform a series of analyses to understand the behaviors of our model. Code is available at https://github.com/aishikchakraborty/LexSub.

2020-12-01

Transactions of the Association for Computational Linguistics (published)

doi.org
