Publications

Mitigating Calibration Bias Without Fixed Attribute Grouping for Improved Fairness in Medical Imaging Analysis

Changjian Shui

Justin Szeto

Raghav Mehta

Douglas Arnold

Tal Arbel

2023-10-08

OpenReview.net/Archive (published)

doi.org

openreview.net

Better Quality Pre-training Data and T5 Models for African Languages

Akintunde Oladipo

Mofetoluwa Adeyemi

Orevaoghene Ahia

Abraham Toluwase Owodunni

Odunayo Ogundepo

David Ifeoluwa Adelani

Jimmy Lin

In this study, we highlight the importance of enhancing the quality of pretraining data in multilingual language models. Existing web crawls have demonstrated quality issues, particularly in the context of low-resource languages. Consequently, we introduce a new multilingual pretraining corpus for

2023-10-07

EMNLP/2023/Conference (accepted)

doi.org

openreview.net

Crystal-GFN: sampling crystals with desirable properties and constraints

Alex Hernandez-Garcia

Alexandre AGM Duval

Alexandra Volokhova

Yoshua Bengio

Divya Sharma

Pierre Luc Carrier

Michał Koziarski

Victor Schmidt

Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials, namely the space group, composition and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and structural hard constraints, as well as the use of any available predictive model of a desired physicochemical property as an objective function. To design stable materials, one must target the candidates with the lowest formation energy. Here, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench. The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.

2023-10-07

ArXiv (preprint)

doi.org

arxiv.org

Efficient Classification of Long Documents via State-Space Models

Peng Lu

Suyuchen Wang

Mehdi Rezagholizadeh

Bang Liu

Ivan Kobyzev

2023-10-07

EMNLP/2023/Conference (accepted)

openreview.net

EpiK-Eval: Evaluation for Language Models as Epistemic Models

Gabriele Prato

Jerry Huang

Prasanna Parthasarathi

Shagun Sodhani

Sarath Chandar Anbil Parthipan

In the age of artificial intelligence, the role of large language models (LLMs) is becoming increasingly central. Despite their growing prevalence, their capacity to consolidate knowledge from different training documents, a crucial ability in numerous applications, remains unexplored. This paper presents the first study examining the capability of LLMs to effectively combine such information within their parameter space. We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. Evaluations across various LLMs reveal significant weaknesses in this domain. We contend that these shortcomings stem from the intrinsic nature of prevailing training objectives. Consequently, we advocate for refining the approach towards knowledge consolidation, as it harbors the potential to dramatically improve their overall effectiveness and performance. The findings from this study offer insights for developing more robust and reliable LLMs. Our code and benchmark are available at https://github.com/chandar-lab/EpiK-Eval

2023-10-07

EMNLP/2023/Conference (accepted)

doi.org

openreview.net

HoneyBee: Progressive Instruction Finetuning of Large Language Models for Materials Science

Yu Song

Santiago Miret

Huan Zhang

Bang Liu

2023-10-07

EMNLP/2023/Conference (published)

openreview.net

Investigating the Effect of Pre-finetuning BERT Models on NLI Involving Presuppositions

Jad Kabbara

Jackie Cheung

2023-10-07

EMNLP/2023/Conference (published)

openreview.net

MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

Yuyan Chen

Zhihao Wen

Ge Fan

Zhengyu Chen

Wei Wu

Dayiheng Liu

Zhixu Li

Bang Liu

Yanghua Xiao

2023-10-07

EMNLP/2023/Conference (published)

openreview.net

MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model

Le Zhang

Yihong Wu

Fengran Mo

Jian-Yun Nie

Aishwarya Agrawal

Multi-modal open-domain question answering typically requires evidence retrieval from databases across diverse modalities, such as images, tables, passages, etc. Even Large Language Models (LLMs) like GPT-4 fall short in this task. To enable LLMs to tackle the task in a zero-shot manner, we introduce MoqaGPT, a straightforward and flexible framework. Using a divide-and-conquer strategy that bypasses intricate multi-modality ranking, our framework can accommodate new modalities and seamlessly transition to new models for the task. Built upon LLMs, MoqaGPT retrieves and extracts answers from each modality separately, then fuses this multi-modal information using LLMs to produce a final answer. Our methodology boosts performance on the MMCoQA dataset, improving F1 by +37.91 points and EM by +34.07 points over the supervised baseline. On the MultiModalQA dataset, MoqaGPT surpasses the zero-shot baseline, improving F1 by 9.5 points and EM by 10.1 points, and significantly closes the gap with supervised methods. Our codebase is available at https://github.com/lezhang7/MOQAGPT.

2023-10-07

EMNLP/2023/Conference (published)

doi.org

openreview.net

RainProof: An Umbrella to Shield Text Generator from Out-Of-Distribution Data

Maxime Darrin

Pablo Piantanida

Pierre Colombo

Implementing effective control mechanisms to ensure the proper functioning and security of deployed NLP models, from translation to chatbots, is essential. A key ingredient to ensure safe system behaviour is Out-Of-Distribution (OOD) detection, which aims to detect whether an input sample is statistically far from the training distribution. Although OOD detection is a widely covered topic in classification tasks, most methods rely on hidden features output by the encoder. In this work, we focus on leveraging soft-probabilities in a black-box framework, i.e. we can access the soft-predictions but not the internal states of the model. Our contributions include: (i) RAINPROOF, a Relative informAItioN Projection OOD detection framework; and (ii) a more operational evaluation setting for OOD detection. Surprisingly, we find that OOD detection is not necessarily aligned with task-specific measures. The OOD detector may filter out samples well processed by the model and keep samples that are not, leading to weaker performance. Our results show that RAINPROOF provides OOD detection methods more aligned with task-specific performance metrics than traditional OOD detectors.

2023-10-07

EMNLP/2023/Conference (accepted)

openreview.net

Responsible AI Considerations in Text Summarization Research: A Review of Current Practices

Yu Lu Liu

Meng Cao

Su Lin Blodgett

Jackie Cheung

Alexandra Olteanu

Adam Trischler

AI and NLP publication venues have increasingly encouraged researchers to reflect on possible ethical considerations, adverse impacts, and other responsible AI issues their work might engender. However, for specific NLP tasks our understanding of how prevalent such issues are, or when and why these issues are likely to arise, remains limited. Focusing on text summarization, a common NLP task largely overlooked by the responsible AI community, we examine research and reporting practices in the current literature. We conduct a multi-round qualitative analysis of 333 summarization papers from the ACL Anthology published between 2020 and 2022. We focus on how, which, and when responsible AI issues are covered, which relevant stakeholders are considered, and mismatches between stated and realized research goals. We also discuss current evaluation practices and consider how authors discuss the limitations of both prior work and their own work. Overall, we find that relatively few papers engage with possible stakeholders or contexts of use, which limits their consideration of potential downstream adverse impacts or other responsible AI issues. Based on our findings, we make recommendations on concrete practices and research directions.

2023-10-07

EMNLP/2023/Conference (published)

openreview.net

Sparse Universal Transformer

Shawn Tan

Yikang Shen

Zhenfang Chen

Aaron Courville

Chuang Gan

The Universal Transformer (UT) is a variant of the Transformer that shares parameters across its layers and is Turing-complete under certain assumptions. Empirical evidence also shows that UTs have better compositional generalization than Vanilla Transformers (VTs) in formal language tasks. The parameter-sharing also affords it better parameter efficiency than VTs. Despite its many advantages, most state-of-the-art NLP systems use VTs as their backbone model instead of UTs. This is mainly because scaling UT parameters is more compute and memory intensive than scaling up a VT. This paper proposes the Sparse Universal Transformer (SUT), which leverages Sparse Mixture of Experts (SMoE) to reduce UT's computation complexity while retaining its parameter efficiency and generalization ability. Experiments show that SUT combines the best of both worlds, achieving strong generalization results on formal language tasks (Logical inference and CFQ) and impressive parameter and computation efficiency on standard natural language benchmarks like WMT'14.

2023-10-07

EMNLP/2023/Conference (accepted)

doi.org

openreview.net
