David Ifeoluwa Adelani

Biography

David Adelani is an assistant professor in computer science and fighting inequities at McGill University, and a core academic member at Mila – Quebec Artificial Intelligence Institute. His research focuses on multilingual natural language processing, with a particular emphasis on under-resourced languages.

Current Students

Jonah Dauvet

Research Intern - McGill

Senyu Li

PhD - McGill

Sifan Liu

Research Intern - McGill

Jessica Ojo

Research Master's - McGill

Fabian Schmidt

Alumni Collaborator - McGill

Vivek Verma

Professional Master's - UdeM

Tianyi Xu

Research Intern - McGill

Peter Yu

Research Master's - McGill

Publications

Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages

Edward Bayes

Israel Abebe Azime

Jesujoba Oluwadara Alabi

Jonas Kgomo

Tyna Eloundou

Elizabeth Proehl

Kai Chen

Imaan Khadir

Naome Etori

Shamsuddeen Hassan Muhammad

Choice Mpanza

Igneciah Pocia Thete

Dietrich Klakow

Evaluations of Large Language Models (LLMs) on knowledge-intensive tasks and factual accuracy often focus on high-resource languages, primarily because datasets for low-resource languages (LRLs) are scarce. In this paper, we present Uhura -- a new benchmark that focuses on two tasks in six typologically-diverse African languages, created via human translation of existing English benchmarks. The first dataset, Uhura-ARC-Easy, is composed of multiple-choice science questions. The second, Uhura-TruthfulQA, is a safety benchmark testing the truthfulness of models on topics including health, law, finance, and politics. We highlight the challenges of creating benchmarks with highly technical content for LRLs and outline mitigation strategies. Our evaluation reveals a significant performance gap between proprietary models such as GPT-4o and o1-preview, and Claude models, and open-source models like Meta's LLaMA and Google's Gemma. Additionally, all models perform better in English than in African languages. These results indicate that LMs struggle with answering scientific questions and are more prone to generating false claims in low-resource African languages. Our findings underscore the necessity for continuous improvement of multilingual LM capabilities in LRL settings to ensure safe and reliable use in real-world contexts. We open-source the Uhura Benchmark and Uhura Platform to foster further research and development in NLP for LRLs.

2024-12-01

ArXiv (preprint)

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

Genta Indra Winata

Frederikus Hudi

Patrick Amadeus Irawan

David Anugraha

Rifki Afina Putri

Yutong Wang

Adam Nohejl

Ubaidillah Ariq Prathama

Nedjma OUSIDHOUM

Afifa Amriani

Anar Rzayev

Anirban Das

Ashmari Pramodya

Aulia Adila

Bryan Wilie

Candy Olivia Mawalim

Ching Lam Cheng

Daud Abolade

Emmanuele Chersoni

Enrico Santus

Fariz Ikhwantri

Garry Kuwanto

Hanyang Zhao

Haryo Akbarianto Wibowo

Holy Lovenia

Jan Christian Blaise Cruz

Jan Wira Gotama Putra

Junho Myung

Lucky Susanto

Maria Angelica Riera Machin

Marina Zhukova

Michael Anugraha

Muhammad Farid Adilazuarda

Natasha Santosa

Peerat Limkonchotiwat

Raj Dabre

Rio Alexander Audino

Samuel Cahyawijaya

Shi-Xiong Zhang

Stephanie Yulia Salim

Yi Zhou

Yinxuan Gui

En-Shiun Annie Lee

Shogo Okada

Ayu Purwarianti

Alham Fikri Aji

Taro Watanabe

Derry Tanti Wijaya

Alice Oh

Chong-Wah Ngo

2024-10-16

ArXiv (preprint)

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

David LE MEUR

David Orlando Romero Mogrovejo

Chenyang Lyu

Haryo Akbarianto Wibowo

Teresa Lynn

Injy Hamed

Aditya Nanda Kishore Khandavally

Aishik Mandal

Alina Dragonetti

Artem Abzaliev

Atnafu Lambebo Tonja

Bontu Fufa Balcha

Chenxi Whitehouse

Christian Salamea-Palacios

Dan John Velasco

D. Meur

Emilio Villa Cueva

Fajri Koto

Fauzan Farooqui

Frederico Belcavello

Ganzorig Batnasan

Gisela Vallejo

Gráinne Caulfield

Guido Ivetta

Haiyue Song

Henok Biadglign Ademtew

Hernán Maina

Holy Lovenia

Israel Abebe Azime

Jan Christian Blaise Cruz

Jay Gala

Jiahui Geng

Jesus-German Ortiz-Barajas

Jinheon Baek

Jocelyn Dunstan

Laura Alonso Alemany

Teresa Clifford

Kumaranage Ravindu Yasas Nagasinghe

Luciana Benotti

Luis Fernando D'Haro

Marcelo Viridiano

Marcos Estecha-Garitagoitia

Maria Camila Buitrago Cabrera

Mario Rodríguez-Cantelar

Mélanie Jouitteau

Mihail Minkov Mihaylov

Mohamed Fazli Mohamed Imam

Muhammad Farid Adilazuarda

Munkhjargal Gochoo

Munkh-Erdene Otgonbold

Naome Etori

Olivier NIYOMUGISHA

Paula Mónica Silva

Pranjal A Chitale

Raj Dabre

Rendi Chevi

Ruochen Zhang

Ryandito Diandaru

Samuel Cahyawijaya

Santiago Góngora

Soyeong Jeong

Sukannya Purkayastha

Tatsuki Kuribayashi

Thanmay Jayakumar

Tiago Timponi Torrent

Toqeer Ehsan

Vladimir Araujo

Yova Kementchedjhieva

Zara Burzo

Zheng Wei Lim

Zheng Xin Yong

Oana Ignat

Joan Nwatu

Rada Mihalcea

Thamar Solorio

Alham Fikri Aji

2024-09-26

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (oral presentation)

openreview.net

Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models

Kenza Benkirane

Laura Gongas

Shahar Pelles

Naomi Fuchs

Joshua Darmon

Pontus Stenetorp

Eduardo Sánchez

Meta

Recent advancements in massively multilingual machine translation systems have significantly enhanced translation accuracy; however, even the best performing systems still generate hallucinations, severely impacting user trust. Detecting hallucinations in Machine Translation (MT) remains a critical challenge, particularly since existing methods excel with High-Resource Languages (HRLs) but exhibit substantial limitations when applied to Low-Resource Languages (LRLs). This paper evaluates sentence-level hallucination detection approaches using Large Language Models (LLMs) and semantic similarity within massively multilingual embeddings. Our study spans 16 language directions, covering HRLs and LRLs with diverse scripts. We find that the choice of model is essential for performance. On average, for HRLs, Llama3-70B outperforms the previous state of the art by as much as 0.16 MCC (Matthews Correlation Coefficient). However, for LRLs we observe that Claude Sonnet outperforms other LLMs on average by 0.03 MCC. The key takeaway from our study is that LLMs can achieve performance comparable to or even better than previously proposed models, despite not being explicitly trained for any machine translation task. However, their advantage is less significant for LRLs.

2024-07-23

ArXiv (preprint)

Mitigating Translationese in Low-resource Languages: The Storyboard Approach

Garry Kuwanto

Eno-Abasi Urua

Priscilla A. Amuok

Shamsuddeen Hassan Muhammad

Aremu Anuoluwapo

Verrah Akinyi Otiende

Loice Emma Nanyanga

T. Nyoike

A. D. Akpan

Nsima Ab Udouboh

Idongesit Udeme Archibong

Idara Effiong Moses

Ifeoluwatayo A. Ige

Benjamin A. Ajibade

Olumide Benjamin Awokoya

Idris Abdulmumin

Saminu Mohammad Aliyu

Ruqayya Nasir Iro

Ibrahim Ahmad

Deontae Smith

Praise-EL Michaels

Derry Tanti Wijaya

Anietie U Andy

Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method demonstrates worse accuracy but better fluency in the target language.

2024-07-14

ArXiv (preprint)

Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects

Orevaoghene Ahia

Aremu Anuoluwapo

Diana Abagyan

Hila Gonen

Daud Abolade

Noah A. Smith

Yulia Tsvetkov

Yoruba, an African language with roughly 47 million speakers, encompasses a continuum with several dialects. Recent efforts to develop NLP technologies for African languages have focused on their standard dialects, resulting in disparities for dialects and varieties for which there are little to no resources or tools. We take steps towards bridging this gap by introducing a new high-quality parallel text and speech corpus, YORULECT, across three domains and four regional Yoruba dialects. To develop this corpus, we engaged native speakers, traveling to communities where these dialects are spoken, to collect text and speech data. Using our newly created corpus, we conducted extensive experiments on (text) machine translation, automatic speech recognition, and speech-to-text translation. Our results reveal substantial performance disparities between standard Yoruba and the other dialects across all tasks. However, we also show that with dialect-adaptive finetuning, we are able to narrow this gap. We believe our dataset and experimental analysis will contribute greatly to developing NLP tools for Yoruba and its dialects, and potentially for other African languages, by improving our understanding of existing challenges and offering a high-quality dataset for further development. We will release the YORULECT dataset and models publicly under an open license.

2024-06-27

ArXiv (preprint)

MINERS: Multilingual Language Models as Semantic Retrievers

Genta Indra Winata

Ruochen Zhang

Words have been represented in a high-dimensional vector space that encodes their semantic similarities, enabling downstream applications such as retrieving synonyms, antonyms, and relevant contexts. However, despite recent advances in multilingual language models (LMs), the effectiveness of these models' representations in semantic retrieval contexts has not been comprehensively explored. To fill this gap, this paper introduces MINERS, a benchmark designed to evaluate the ability of multilingual LMs in semantic retrieval tasks, including bitext mining and classification via retrieval-augmented contexts. We create a comprehensive framework to assess the robustness of LMs in retrieving samples across over 200 diverse languages, including extremely low-resource languages in challenging cross-lingual and code-switching settings. Our results demonstrate that solely retrieving semantically similar embeddings yields performance competitive with state-of-the-art approaches, without requiring any fine-tuning.

2024-06-11

ArXiv (preprint)

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Jessica Ojo

Israel Abebe Azime

Zhuang Yun Jian

Jesujoba Oluwadara Alabi

Xuanli He

Millicent Ochieng

Sara Hooker

Andiswa Bukula

En-Shiun Annie Lee

Chiamaka Ijeoma Chukwuneke

Happy Buzaaba

Blessing Kudzaishe Sibanda

Godson Kalipe

Jonathan Mukiibi

Salomon Kabongo

Foutse Yuehgoh

M. Setaka

Lolwethu Ndolela

Nkiruka Bridget Odu

Rooweither Mabuya

Shamsuddeen Hassan Muhammad

Salomey Osei

Sokhar Samb

Tadesse Kebede Guge

Pontus Stenetorp

Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench -- a human-translated benchmark dataset for 16 typologically-diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based QA (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and four proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We observe a significant performance gap between open and proprietary models, with the highest performing open model, Aya-101, reaching only 58% of the performance of the best-performing proprietary model, GPT-4o. Machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, like LLaMa 3 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.

2024-06-05

ArXiv (preprint)

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Jessica Ojo

Israel Abebe Azime

Zhuang Yun Jian

Jesujoba Oluwadara Alabi

Xuanli He

Millicent Ochieng

Sara Hooker

Andiswa Bukula

En-Shiun Annie Lee

Chiamaka Ijeoma Chukwuneke

Happy Buzaaba

Blessing Kudzaishe Sibanda

Godson Kalipe

Jonathan Mukiibi

Salomon Kabongo

Foutse Yuehgoh

M. Setaka

Lolwethu Ndolela

Nkiruka Bridget Odu

Rooweither Mabuya

Shamsuddeen Hassan Muhammad

Salomey Osei

Sokhar Samb

Tadesse Kebede Guge

Pontus Stenetorp

Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench -- a human-translated benchmark dataset for 17 typologically-diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based question answering (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and six proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We observe a significant performance gap between open and proprietary models, with the highest performing open model, Gemma 2 27B, reaching only 63% of the performance of the best-performing proprietary model, GPT-4o. In addition, machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, such as Gemma 2 27B and LLaMa 3.1 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.

2024-06-05

ArXiv (preprint)

Meta's AI translation model embraces overlooked languages.

2024-06-05

Nature (published)
