Publications

Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
A. Chandar
The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has proven effective after pre-training and during fine-tuning, applying quantization in Transformers during pre-training has remained largely unexplored at scale for language modeling. This study explores the impact of quantization on efficient pre-training of Transformers, with a focus on linear layer components. By systematically applying straightforward linear quantization to weights, activations, gradients, and optimizer states, we assess its effects on model efficiency, stability, and performance during training. By offering a comprehensive recipe of effective quantization strategies to apply during the pre-training of Transformers, we promote high training efficiency from scratch while retaining language modeling ability. Code is available at https://github.com/chandar-lab/EfficientLLMs.
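The "straightforward linear quantization" the abstract refers to can be illustrated with a symmetric per-tensor scheme. This is a generic sketch under common conventions, not the authors' released implementation (that lives in the linked repository); the function name and bit-width defaults here are assumptions.

```python
import numpy as np

def linear_quantize(x, num_bits=8):
    """Symmetric linear (uniform) quantization of a tensor (illustrative).

    Maps float values to integers in [-(2^(b-1)-1), 2^(b-1)-1] using a
    single per-tensor scale, then dequantizes back to floats so the
    quantization error can be inspected directly.
    """
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    if scale == 0:
        return x.copy()  # all-zero tensor: nothing to quantize
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

w = np.array([0.5, -1.0, 0.25, 0.75])
w_q = linear_quantize(w, num_bits=8)  # error shrinks as num_bits grows
```

The same scheme can be applied to weights, activations, gradients, or optimizer states; the paper's contribution is evaluating which of these tolerate it during pre-training.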
Exploring the digital divide: results of a survey informing mobile application development
Maira Corinne Claudio
Zachary Rehany
Katerina Stachtari
Elena Guadagno
Esli Osmanlliu
Introduction: Mobile health apps risk widening health disparities if they overlook digital inclusion. The digital divide, encompassing access, familiarity, and readiness, poses a significant barrier to medical interventions, and the existing literature lacks exploration of its contributing factors. Data are therefore needed to understand the challenges in developing inclusive health apps. Methods: We created a survey to gauge internet and smartphone access, smartphone familiarity, and readiness for using mobile health apps among caregivers of pediatric patients in tertiary care. Open-ended questions solicited feedback and suggestions on mobile health applications; responses were categorized by similarity and compared. Developed with patient partners, the survey underwent cognitive testing and piloting for accuracy. Results: Data from 209 respondents showed that 23% were affected by the digital divide, mainly due to unfamiliarity with digital skills. Among 49 short text responses about health app concerns, 31 mentioned security and confidentiality, and 7 mentioned the impersonal nature of such apps. Desired features included messaging healthcare providers, scheduling, task reminders, and simplicity. Conclusions: This study underscores a digital divide among caregivers of pediatric patients, with nearly a quarter affected, primarily due to a lack of digital comfort. Respondents emphasized user-friendliness and online security; future apps should prioritize digital inclusion by addressing these significant barriers and carefully considering patient and family concerns.
Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery
As online music platforms grow, music recommender systems play a vital role in helping users navigate and discover content within their vast musical databases. At odds with this larger goal is the presence of popularity bias, which causes algorithmic systems to favor mainstream content over potentially more relevant but niche items. In this work we explore the intrinsic relationship between music discovery and popularity bias. To mitigate this issue we propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNN)-based recommender systems. Our approach uses individual fairness to reflect a ground-truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations. In doing so, we facilitate meaningful music discovery that is robust to popularity bias and grounded in the music domain. We apply our BOOST methodology to two discovery-based tasks, performing recommendations at both the playlist level and user level. Then, we ground our evaluation in the cold-start setting, showing that our approach outperforms existing fairness benchmarks in both performance and recommendation of lesser-known content. Finally, our analysis explains why our proposed methodology is a novel and promising approach to mitigating popularity bias and improving the discovery of new and niche content in music recommender systems.
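The individual-fairness idea, that two songs which sound similar should have similar representations, is commonly operationalized as a penalty matching embedding similarity to a domain-derived similarity. The sketch below is a generic illustration of that idea and an assumption on our part; the paper's BOOST methodology defines its own loss.

```python
import numpy as np

def individual_fairness_penalty(Z, S):
    """Generic individual-fairness penalty for item embeddings (illustrative).

    Z: (n, d) learned item embeddings.
    S: (n, n) ground-truth similarity in [0, 1], e.g. derived from audio
       features, so the target is popularity-agnostic.
    Penalizes pairs whose embedding (cosine) similarity deviates from the
    domain-derived similarity.
    """
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    emb_sim = Zn @ Zn.T  # cosine similarity matrix
    return np.mean((emb_sim - S) ** 2)
```

Because the target similarity comes from the audio domain rather than from interaction counts, minimizing such a term pulls sonically similar niche and mainstream items together in the representation space.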
Fairness Under Demographic Scarce Regime
Patrik Joslin Kenfack
S. Ebrahimi Kahou
Ulrich Matchi Aïvodji
Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is only partially available, because a record was not maintained throughout data collection or for privacy reasons. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples whose demographic information is inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to both fairness and accuracy. Our experiments on two datasets show that the proposed framework yields models with significantly better fairness-accuracy trade-offs than classic attribute classifiers. Surprisingly, our framework even outperforms models trained with constraints on the true sensitive attributes.
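The selection step, enforcing fairness only on samples whose inferred sensitive attribute carries low uncertainty, can be sketched with a simple confidence threshold over the attribute classifier's predicted probabilities. This is a hedged illustration: `select_confident` is a hypothetical helper, and the paper's actual uncertainty measure and threshold may differ.

```python
import numpy as np

def select_confident(proba, threshold=0.9):
    """Indices of samples with low-uncertainty inferred sensitive attributes.

    proba: (n, k) predicted attribute probabilities from the attribute
    classifier. Returns indices whose max class probability meets the
    threshold; fairness constraints would then be enforced only on these.
    """
    confidence = proba.max(axis=1)
    return np.where(confidence >= threshold)[0]

proba = np.array([[0.95, 0.05],   # confident -> keep
                  [0.55, 0.45],   # uncertain -> exclude from constraint
                  [0.10, 0.90]])  # confident -> keep
idx = select_confident(proba, threshold=0.9)
```

The fairness regularizer is then computed over `idx` only, which is the mechanism the abstract credits for the improved trade-off.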
Findings of the 2nd Shared Task on Multi-lingual Multi-task Information Retrieval at MRL 2024
Francesco Tinner
Raghav Mantri
Mammad Hajili
Chiamaka Ijeoma Chukwuneke
Dylan Massey
Benjamin A. Ajibade
Bilge Kocak
Abolade Dawud
Jonathan Atala
Hale Sirin
Kayode Olaleye
Anar Rzayev
Duygu Ataman
Large language models (LLMs) demonstrate exceptional proficiency in both the comprehension and generation of textual data, particularly in English, a language for which extensive public benchmarks have been established across a wide range of natural language processing (NLP) tasks. Nonetheless, their performance in multilingual contexts and specialized domains remains less rigorously validated, raising questions about their reliability and generalizability across linguistically diverse and domain-specific settings. The second edition of the Shared Task on Multilingual Multitask Information Retrieval aims to provide a comprehensive and inclusive multilingual evaluation benchmark that aids in assessing the ability of multilingual LLMs to capture logical, factual, or causal relationships within lengthy text contexts and to generate language under sparse settings, particularly in scenarios with under-resourced languages. The shared task consists of two subtasks crucial to information retrieval, named entity recognition (NER) and reading comprehension (RC), in seven data-scarce languages, including Azerbaijani, Swiss German, and Turkish, which previously lacked annotated resources for information retrieval tasks. This year's edition focuses specifically on the multiple-choice question answering evaluation setting, which provides a more objective basis for comparing different methods across languages.
Findings of the Association for Computational Linguistics: NAACL 2024, Mexico City, Mexico, June 16-21, 2024
[Program committee and reviewer listing from the volume's front matter (several hundred names).]
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wołczyk
Bartłomiej Cupiał
Mateusz Ostaszewski
Michal Bortkiewicz
Michał Zając
Łukasz Kuciński
Piotr Miłoś
Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models. However, fine-tuning reinforcement learning (RL) models remains a challenge. This work conceptualizes one specific cause of poor transfer, accentuated in the RL setting by the interplay between actions and observations: forgetting of pre-trained capabilities. Namely, a model deteriorates on the state subspace of the downstream task that is not visited in the initial phase of fine-tuning, on which the model behaved well due to pre-training. In this way, we lose the anticipated transfer benefits. We identify conditions under which this problem occurs, showing that it is common and, in many cases, catastrophic. Through a detailed empirical analysis of the challenging NetHack and Montezuma's Revenge environments, we show that standard knowledge retention techniques mitigate the problem and thus allow us to take full advantage of the pre-trained capabilities. In particular, in NetHack, we achieve a new state-of-the-art for neural models, improving the previous best score from
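One standard knowledge-retention technique of the kind the abstract refers to is regularizing the fine-tuned policy toward the pre-trained one, for example with a per-state KL term added to the RL loss. The sketch below is a generic illustration of that family of techniques, not necessarily the specific method the paper evaluates.

```python
import numpy as np

def kl_retention_loss(logits_ft, logits_pre):
    """Knowledge-retention term for RL fine-tuning (illustrative).

    Computes the mean per-state KL(pre-trained policy || fine-tuned policy)
    from action logits of shape (batch, num_actions). Adding this term to
    the fine-tuning objective discourages the policy from drifting on
    states the pre-trained model already handled well.
    """
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(logits_pre)   # reference (frozen) policy
    q = softmax(logits_ft)    # policy being fine-tuned
    return np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))
```

In practice this would be evaluated on states sampled from a replay of pre-training behavior, precisely the subspace the abstract identifies as vulnerable to forgetting.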
Fisher Flow Matching for Generative Modeling over Discrete Data
Oscar Davis
Samuel Kessler
Mircea Petrache
İsmail İlkan Ceylan
Michael M. Bronstein
Avishek Bose
Generative modeling over discrete data has recently seen numerous success stories, with applications spanning language modeling, biological sequence design, and graph-structured molecular data. The predominant generative modeling paradigm for discrete data is still autoregressive, with more recent alternatives based on diffusion or flow-matching falling short of their impressive performance in continuous data settings, such as image or video generation. In this work, we introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data as points residing on a statistical manifold equipped with its natural Riemannian metric: the
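The natural Riemannian metric in question is the Fisher-Rao metric, under which the probability simplex maps isometrically (up to a constant factor) onto the positive orthant of a sphere via the square-root map, so geodesics between distributions become great-circle arcs. The sketch below illustrates these standard information-geometry facts only; it is not the paper's training objective.

```python
import numpy as np

def simplex_to_sphere(p):
    """Square-root map from a categorical distribution to the sphere.

    Since the components of p sum to 1, sqrt(p) has unit Euclidean norm,
    i.e. it lies on the positive orthant of the unit sphere.
    """
    return np.sqrt(p)

def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between categorical distributions.

    Equals twice the great-circle distance between sqrt(p) and sqrt(q)
    on the unit sphere.
    """
    inner = np.clip(np.sum(np.sqrt(p * q)), -1.0, 1.0)
    return 2.0 * np.arccos(inner)
```

Flow-matching on this sphere orthant is what lets the model use closed-form geodesics instead of working directly on the flat simplex.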
A Foundation Model for Zero-shot Logical Query Reasoning
Jincheng Zhou
Bruno Ribeiro
Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on, which requires substantial training time before deployment on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG after fine-tuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than the best available baselines and sets a new state of the art on 15 of them.
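Vocabulary-independent logical operations of the kind UltraQuery needs can be realized with parameter-free fuzzy logic over per-entity score vectors. The product t-norm below is one standard choice and an assumption on our part, not necessarily the paper's exact operators; its key property is that it contains no learned, vocabulary-bound parameters, so it transfers to any KG unchanged.

```python
import numpy as np

def fuzzy_and(a, b):
    """Conjunction of two fuzzy answer-score vectors (product t-norm)."""
    return a * b

def fuzzy_or(a, b):
    """Disjunction (probabilistic-sum t-conorm)."""
    return a + b - a * b

def fuzzy_not(a):
    """Negation of a fuzzy answer-score vector (scores in [0, 1])."""
    return 1.0 - a
```

Each vector holds one score per entity of the current KG, so the same three functions apply regardless of the graph's vocabulary; only the projection operator needs a (vocabulary-independent) learned model.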
Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Usman Anwar
Abulhair Saparov
Javier Rando
Daniel Paleka
Miles Turpin
Peter Hase
Ekdeep Singh Lubana
Erik Jenner
Stephen Casper
Oliver Sourbut
Benjamin L. Edelman
Zhaowei Zhang
Mario Günther
Anton Korinek
Jose Hernandez-Orallo
Lewis Hammond
Eric Bigelow
Alexander Pan
Lauro Langosco
Tomasz Korbak
Heidi Zhang
Ruiqi Zhong
Seán Ó hÉigeartaigh
Gabriel Recchia
Giulio Corsi
Markus Anderljung
Lilian Edwards
Aleksandar Petrov
Christian Schroeder de Witt
Sumeet Ramesh Motwani
Samuel Albanie
Danqi Chen
Philip H.S. Torr
Jakob Foerster
Florian Tramèr
He He
Atoosa Kasirzadeh
Yejin Choi
David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models
Weijia Xu
Nebojsa Jojic
Nicolas Roux
A framework for fair decision-making over time with time-invariant utilities
Andrea Lodi
Sriram Sankaranarayanan
Guanyi Wang