Publications

Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
A. Chandar
The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has proven effective after pre-training and during fine-tuning, applying quantization in Transformers during pre-training has remained largely unexplored at scale for language modeling. This study explores the impact of quantization on efficient pre-training of Transformers, with a focus on linear layer components. By systematically applying straightforward linear quantization to weights, activations, gradients, and optimizer states, we assess its effects on model efficiency, stability, and performance during training. By offering a comprehensive recipe of effective quantization strategies to apply during the pre-training of Transformers, we promote high training efficiency from scratch while retaining language modeling ability. Code is available at https://github.com/chandar-lab/EfficientLLMs.
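The "straightforward linear quantization" the abstract refers to can be illustrated with a symmetric per-tensor scheme. This is a generic sketch under common conventions, not the authors' released implementation (that lives in the linked repository); the function name and bit-width defaults here are assumptions.

```python
import numpy as np

def linear_quantize(x, num_bits=8):
    """Symmetric linear (uniform) quantization of a tensor (illustrative).

    Maps float values to integers in [-(2^(b-1)-1), 2^(b-1)-1] using a
    single per-tensor scale, then dequantizes back to floats so the
    quantization error can be inspected directly.
    """
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    if scale == 0:
        return x.copy()  # all-zero tensor: nothing to quantize
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

w = np.array([0.5, -1.0, 0.25, 0.75])
w_q = linear_quantize(w, num_bits=8)  # error shrinks as num_bits grows
```

The same scheme can be applied to weights, activations, gradients, or optimizer states; the paper's contribution is evaluating which of these tolerate it during pre-training.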
Exploring the digital divide: results of a survey informing mobile application development
Maira Corinne Claudio
Zachary Rehany
Katerina Stachtari
Elena Guadagno
Esli Osmanlliu
Introduction: Mobile health apps risk widening health disparities if they overlook digital inclusion. The digital divide, encompassing access, familiarity, and readiness, poses a significant barrier to medical interventions, and the existing literature lacks exploration of its contributing factors. Data are therefore needed to understand the challenges in developing inclusive health apps. Methods: We created a survey to gauge internet and smartphone access, smartphone familiarity, and readiness for using mobile health apps among caregivers of pediatric patients in tertiary care. Open-ended questions solicited feedback and suggestions on mobile health applications; responses were categorized by similarity and compared. Developed with patient partners, the survey underwent cognitive testing and piloting for accuracy. Results: Data from 209 respondents showed that 23% were affected by the digital divide, mainly due to unfamiliarity with digital skills. Among 49 short text responses about health app concerns, 31 mentioned security and confidentiality, and 7 mentioned the impersonal nature of such apps. Desired features included messaging healthcare providers, scheduling, task reminders, and simplicity. Conclusions: This study underscores a digital divide among caregivers of pediatric patients, with nearly a quarter affected, primarily due to a lack of digital comfort. Respondents emphasized user-friendliness and online security; future apps should prioritize digital inclusion by addressing these significant barriers and carefully considering patient and family concerns.
Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery
As online music platforms grow, music recommender systems play a vital role in helping users navigate and discover content within their vast musical databases. At odds with this larger goal is the presence of popularity bias, which causes algorithmic systems to favor mainstream content over potentially more relevant but niche items. In this work we explore the intrinsic relationship between music discovery and popularity bias. To mitigate this issue we propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNN)-based recommender systems. Our approach uses individual fairness to reflect a ground-truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations. In doing so, we facilitate meaningful music discovery that is robust to popularity bias and grounded in the music domain. We apply our BOOST methodology to two discovery-based tasks, performing recommendations at both the playlist level and user level. Then, we ground our evaluation in the cold-start setting, showing that our approach outperforms existing fairness benchmarks in both performance and recommendation of lesser-known content. Finally, our analysis explains why our proposed methodology is a novel and promising approach to mitigating popularity bias and improving the discovery of new and niche content in music recommender systems.
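The individual-fairness idea, that two songs which sound similar should have similar representations, is commonly operationalized as a penalty matching embedding similarity to a domain-derived similarity. The sketch below is a generic illustration of that idea and an assumption on our part; the paper's BOOST methodology defines its own loss.

```python
import numpy as np

def individual_fairness_penalty(Z, S):
    """Generic individual-fairness penalty for item embeddings (illustrative).

    Z: (n, d) learned item embeddings.
    S: (n, n) ground-truth similarity in [0, 1], e.g. derived from audio
       features, so the target is popularity-agnostic.
    Penalizes pairs whose embedding (cosine) similarity deviates from the
    domain-derived similarity.
    """
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    emb_sim = Zn @ Zn.T  # cosine similarity matrix
    return np.mean((emb_sim - S) ** 2)
```

Because the target similarity comes from the audio domain rather than from interaction counts, minimizing such a term pulls sonically similar niche and mainstream items together in the representation space.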
Fairness Under Demographic Scarce Regime
Patrik Joslin Kenfack
S. Ebrahimi Kahou
Ulrich Matchi Aïvodji
Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is only partially available, because a record was not maintained throughout data collection or for privacy reasons. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples whose demographic information is inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to both fairness and accuracy. Our experiments on two datasets show that the proposed framework yields models with significantly better fairness-accuracy trade-offs than classic attribute classifiers. Surprisingly, our framework even outperforms models trained with constraints on the true sensitive attributes.
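The selection step, enforcing fairness only on samples whose inferred sensitive attribute carries low uncertainty, can be sketched with a simple confidence threshold over the attribute classifier's predicted probabilities. This is a hedged illustration: `select_confident` is a hypothetical helper, and the paper's actual uncertainty measure and threshold may differ.

```python
import numpy as np

def select_confident(proba, threshold=0.9):
    """Indices of samples with low-uncertainty inferred sensitive attributes.

    proba: (n, k) predicted attribute probabilities from the attribute
    classifier. Returns indices whose max class probability meets the
    threshold; fairness constraints would then be enforced only on these.
    """
    confidence = proba.max(axis=1)
    return np.where(confidence >= threshold)[0]

proba = np.array([[0.95, 0.05],   # confident -> keep
                  [0.55, 0.45],   # uncertain -> exclude from constraint
                  [0.10, 0.90]])  # confident -> keep
idx = select_confident(proba, threshold=0.9)
```

The fairness regularizer is then computed over `idx` only, which is the mechanism the abstract credits for the improved trade-off.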
Findings of the 2nd Shared Task on Multi-lingual Multi-task Information Retrieval at MRL 2024
Francesco Tinner
Raghav Mantri
Mammad Hajili
Chiamaka Ijeoma Chukwuneke
Dylan Massey
Benjamin A. Ajibade
Bilge Kocak
Abolade Dawud
Jonathan Atala
Hale Sirin
Kayode Olaleye
Anar Rzayev
Duygu Ataman
Large language models (LLMs) demonstrate exceptional proficiency in both the comprehension and generation of textual data, particularly in English, a language for which extensive public benchmarks have been established across a wide range of natural language processing (NLP) tasks. Nonetheless, their performance in multilingual contexts and specialized domains remains less rigorously validated, raising questions about their reliability and generalizability across linguistically diverse and domain-specific settings. The second edition of the Shared Task on Multilingual Multitask Information Retrieval aims to provide a comprehensive and inclusive multilingual evaluation benchmark that aids in assessing the ability of multilingual LLMs to capture logical, factual, or causal relationships within lengthy text contexts and to generate language under sparse settings, particularly in scenarios with under-resourced languages. The shared task consists of two subtasks crucial to information retrieval, named entity recognition (NER) and reading comprehension (RC), in seven data-scarce languages, including Azerbaijani, Swiss German, and Turkish, which previously lacked annotated resources for information retrieval tasks. This year's edition focuses specifically on the multiple-choice question answering evaluation setting, which provides a more objective basis for comparing different methods across languages.
Findings of the Association for Computational Linguistics: NAACL 2024, Mexico City, Mexico, June 16-21, 2024
[Program committee and reviewer listing from the volume's front matter (several hundred names).]
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wołczyk
Bartłomiej Cupiał
Mateusz Ostaszewski
Michal Bortkiewicz
Michał Zając
Łukasz Kuciński
Piotr Miłoś
Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models. However, fine-tuning reinforcement learning (RL) models remains a challenge. This work conceptualizes one specific cause of poor transfer, accentuated in the RL setting by the interplay between actions and observations: forgetting of pre-trained capabilities. Namely, a model deteriorates on the state subspace of the downstream task that is not visited in the initial phase of fine-tuning, on which the model behaved well due to pre-training. In this way, we lose the anticipated transfer benefits. We identify conditions under which this problem occurs, showing that it is common and, in many cases, catastrophic. Through a detailed empirical analysis of the challenging NetHack and Montezuma's Revenge environments, we show that standard knowledge retention techniques mitigate the problem and thus allow us to take full advantage of the pre-trained capabilities. In particular, in NetHack, we achieve a new state-of-the-art for neural models, improving the previous best score from
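One standard knowledge-retention technique of the kind the abstract refers to is regularizing the fine-tuned policy toward the pre-trained one, for example with a per-state KL term added to the RL loss. The sketch below is a generic illustration of that family of techniques, not necessarily the specific method the paper evaluates.

```python
import numpy as np

def kl_retention_loss(logits_ft, logits_pre):
    """Knowledge-retention term for RL fine-tuning (illustrative).

    Computes the mean per-state KL(pre-trained policy || fine-tuned policy)
    from action logits of shape (batch, num_actions). Adding this term to
    the fine-tuning objective discourages the policy from drifting on
    states the pre-trained model already handled well.
    """
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(logits_pre)   # reference (frozen) policy
    q = softmax(logits_ft)    # policy being fine-tuned
    return np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))
```

In practice this would be evaluated on states sampled from a replay of pre-training behavior, precisely the subspace the abstract identifies as vulnerable to forgetting.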
Fisher Flow Matching for Generative Modeling over Discrete Data
Oscar Davis
Samuel Kessler
Mircea Petrache
İsmail İlkan Ceylan
Michael M. Bronstein
Avishek Bose
Generative modeling over discrete data has recently seen numerous success stories, with applications spanning language modeling, biological sequence design, and graph-structured molecular data. The predominant generative modeling paradigm for discrete data is still autoregressive, with more recent alternatives based on diffusion or flow-matching falling short of their impressive performance in continuous data settings, such as image or video generation. In this work, we introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data as points residing on a statistical manifold equipped with its natural Riemannian metric: the
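The natural Riemannian metric in question is the Fisher-Rao metric, under which the probability simplex maps isometrically (up to a constant factor) onto the positive orthant of a sphere via the square-root map, so geodesics between distributions become great-circle arcs. The sketch below illustrates these standard information-geometry facts only; it is not the paper's training objective.

```python
import numpy as np

def simplex_to_sphere(p):
    """Square-root map from a categorical distribution to the sphere.

    Since the components of p sum to 1, sqrt(p) has unit Euclidean norm,
    i.e. it lies on the positive orthant of the unit sphere.
    """
    return np.sqrt(p)

def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between categorical distributions.

    Equals twice the great-circle distance between sqrt(p) and sqrt(q)
    on the unit sphere.
    """
    inner = np.clip(np.sum(np.sqrt(p * q)), -1.0, 1.0)
    return 2.0 * np.arccos(inner)
```

Flow-matching on this sphere orthant is what lets the model use closed-form geodesics instead of working directly on the flat simplex.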
A Foundation Model for Zero-shot Logical Query Reasoning
Jincheng Zhou
Bruno Ribeiro
Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on, which requires substantial training time before deployment on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG after fine-tuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than the best available baselines and sets a new state of the art on 15 of them.
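Vocabulary-independent logical operations of the kind UltraQuery needs can be realized with parameter-free fuzzy logic over per-entity score vectors. The product t-norm below is one standard choice and an assumption on our part, not necessarily the paper's exact operators; its key property is that it contains no learned, vocabulary-bound parameters, so it transfers to any KG unchanged.

```python
import numpy as np

def fuzzy_and(a, b):
    """Conjunction of two fuzzy answer-score vectors (product t-norm)."""
    return a * b

def fuzzy_or(a, b):
    """Disjunction (probabilistic-sum t-conorm)."""
    return a + b - a * b

def fuzzy_not(a):
    """Negation of a fuzzy answer-score vector (scores in [0, 1])."""
    return 1.0 - a
```

Each vector holds one score per entity of the current KG, so the same three functions apply regardless of the graph's vocabulary; only the projection operator needs a (vocabulary-independent) learned model.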
Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Usman Anwar
Abulhair Saparov
Javier Rando
Daniel Paleka
Miles Turpin
Peter Hase
Ekdeep Singh Lubana
Erik Jenner
Stephen Casper
Oliver Sourbut
Benjamin L. Edelman
Zhaowei Zhang
Mario Günther
Anton Korinek
Jose Hernandez-Orallo
Lewis Hammond
Eric Bigelow
Alexander Pan
Lauro Langosco
Tomasz Korbak
Heidi Zhang
Ruiqi Zhong
Seán Ó hÉigeartaigh
Gabriel Recchia
Giulio Corsi
Markus Anderljung
Lilian Edwards
Aleksandar Petrov
Christian Schroeder de Witt
Sumeet Ramesh Motwani
Samuel Albanie
Danqi Chen
Philip H.S. Torr
Jakob Foerster
Florian Tramèr
He He
Atoosa Kasirzadeh
Yejin Choi
David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models
Weijia Xu
Nebojsa Jojic
Nicolas Roux
A framework for fair decision-making over time with time-invariant utilities
Andrea Lodi
Sriram Sankaranarayanan
Guanyi Wang