Publications

Meta's AI translation model embraces overlooked languages.
Noisy Data Visualization using Functional Data Analysis
Haozhe Chen
Andres Felipe Duque Correa
Kevin R. Moon
Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high dimensional dynamical processes while theoretically eliminating all noise. However, implementing EIG in practice requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.
A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians
Piotr Mirowski
Juliette Love
Shakir Mohamed
Temporal trends in disparities in COVID-19 seropositivity among Canadian blood donors
Yuan Yu
Matthew J Knight
Diana Gibson
Sheila F O’Brien
W Alton Russell
Background: In Canada’s largest COVID-19 serological study, SARS-CoV-2 antibodies in blood donors have been monitored since 2020. No study has analysed changes in the association between anti-N seropositivity (a marker of recent infection) and geographic and sociodemographic characteristics over the pandemic. Methods: Using Bayesian multi-level models with spatial effects at the census division level, we analysed changes in correlates of SARS-CoV-2 anti-N seropositivity across three periods in which different variants predominated (pre-Delta, Delta and Omicron). We analysed disparities by geographic area, individual traits (age, sex, race) and neighbourhood factors (urbanicity, material deprivation and social deprivation). Data were from 420 319 blood donations across four regions (Ontario, British Columbia [BC], the Prairies and the Atlantic region) from December 2020 to November 2022. Results: Seropositivity was higher for racialized minorities, males and individuals in more materially deprived neighbourhoods in the pre-Delta and Delta waves. These subgroup differences dissipated in the Omicron wave as large swaths of the population became infected. Across all waves, seropositivity was higher in younger individuals and those with lower neighbourhood social deprivation. Rural residents had high seropositivity in the Prairies, but not other regions. Compared to generalized linear models, multi-level models with spatial effects had better fit and lower error when predicting SARS-CoV-2 anti-N seropositivity by geographic region. Conclusions: Correlates of recent COVID-19 infection have evolved over the pandemic. Many disparities lessened during the Omicron wave, but public health intervention may be warranted to address persistently higher burden among young people and those with less social deprivation.
Towards Geographic Inclusion in the Evaluation of Text-to-Image Models
Melissa Hall
Samuel J. Bell
Candace Ross
Adina Williams
Michal Drozdzal
Rapid progress in text-to-image generative models coupled with their deployment for visual content creation has magnified the importance of thoroughly evaluating their performance and identifying potential biases. In pursuit of models that generate images that are realistic, diverse, visually appealing, and consistent with the given prompt, researchers and practitioners often turn to automated metrics to facilitate scalable and cost-effective performance profiling. However, commonly used metrics often fail to account for the full diversity of human preference; often even in-depth human evaluations face challenges with subjectivity, especially as interpretations of evaluation criteria vary across regions and cultures. In this work, we conduct a large, cross-cultural study to examine how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images from state-of-the-art public APIs. We collect over 65,000 image annotations and 20 survey responses. We contrast human annotations with common automated metrics, finding that human preferences vary notably across geographic location and that current metrics do not fully account for this diversity. For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative. In addition, the utility of automatic evaluations is dependent on assumptions about their set-up, such as the alignment of feature extractors with human perception of object similarity or the definition of "appeal" captured in reference datasets used to ground evaluations. We recommend steps for improved automatic and human evaluations.
Efficient Leverage Score Sampling for Tensor Train Decomposition
Vivek Bharadwaj
Beheshteh T. Rakhshan
Osman Asif Malik
Milnor-Myerson Games and The Principles of Artificial Principal-Agent Problems
Manfred Diaz
Joel Z Leibo
In this paper, we introduce Milnor-Myerson games, a multiplayer interaction structure at the core of machine learning (ML), to shed light on the fundamental principles and implications the artificial principal-agent problem has had in landmark ML results like AlphaGo and large language models (LLMs).
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning
Zhaohan Daniel Guo
Bernardo Avila Pires
Yunhao Tang
Clare Lyle
Mark Rowland
Nicolas Heess
Diana Borsa
Arthur Guez
Will Dabney
MOSEAC: Streamlined Variable Time Step Reinforcement Learning
Dong Wang
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Jiayi Wang
Sweta Agrawal
Marek Masiak
Ricardo Rei
Eleftheria Briakou
Marine Carpuat
Xuanli He
Sofia Bourhim
Andiswa Bukula
Muhidin A. Mohamed
Temitayo Olatoye
Tosin Adewumi
Hamam Mokayed
Christine Mwase
Wangui Kimotho
Foutse Yuehgoh
Aremu Anuoluwapo
Jessica Ojo
Shamsuddeen Hassan Muhammad
Salomey Osei
Abdul-Hakeem Omotayo
Chiamaka Ijeoma Chukwuneke
Perez Ogayo
Oumaima Hourrane
Salma El Anigri
Lolwethu Ndolela
Thabiso Mangwana
Shafie Abdi Mohamed
Hassan Ayinde
Ayinde Hassan
Oluwabusayo Olufunke Awoyomi
Lama Alkhaled
Sana Sabah al-Azzawi
Naome Etori
Millicent Ochieng
Clemencia Siro
Njoroge Kiragu
Samuel Njoroge
Eric Muchiri
Wangari Kimotho
Lyse Naomi Wamba
Daud Abolade
Simbiat Ajao
Iyanuoluwa Shode
Ricky Macharm
Ruqayya Nasir Iro
Saheed Salahudeen Abdullahi
Stephen Moore
Bernard Opoku
Zainab Akinjobi
Abeeb Afolabi
Nnaemeka Casmir Obiefuna
Onyekachi Ogbu
Sam Brian
Sam Ochieng’
Verrah Akinyi Otiende
Chinedu Emmanuel Mbonu
Toadoum Sari Sakayo
Yao Lu
Pontus Stenetorp
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
Better entity matching with transformers through ensembles
Jwen Fai Low
Pulei Xiong
Caffeine induces age-dependent increases in brain complexity and criticality during sleep
Philipp Thölke
Maxine Arcand-Lavigne
Tarek Lajnef
Sonia Frenette
Julie Carrier