Publications

Meta’s AI translation model embraces overlooked languages

David I. Adelani

2024-06-04

Nature (unknown)

doi.org

Noisy Data Visualization using Functional Data Analysis

Haozhe Chen

Andres Felipe Duque Correa

Guy Wolf

Kevin R. Moon

Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high dimensional dynamical processes while theoretically eliminating all noise. However, implementing EIG in practice requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.

2024-06-04

ArXiv (preprint)

doi.org

arxiv.org

A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians

Piotr Mirowski

J Christopher Love

Juliette Love

Kory Mathewson

Shakir Mohamed

2024-06-04

The 2024 ACM Conference on Fairness, Accountability, and Transparency (published)

doi.org

arxiv.org

Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

Melissa Hall

Samuel J. Bell

Candace Ross

Adina Williams

Michal Drozdzal

Adriana Romero

Rapid progress in text-to-image generative models coupled with their deployment for visual content creation has magnified the importance of thoroughly evaluating their performance and identifying potential biases. In pursuit of models that generate images that are realistic, diverse, visually appealing, and consistent with the given prompt, researchers and practitioners often turn to automated metrics to facilitate scalable and cost-effective performance profiling. However, commonly-used metrics often fail to account for the full diversity of human preference; often even in-depth human evaluations face challenges with subjectivity, especially as interpretations of evaluation criteria vary across regions and cultures. In this work, we conduct a large, cross-cultural study to study how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images from state-of-the-art public APIs. We collect over 65,000 image annotations and 20 survey responses. We contrast human annotations with common automated metrics, finding that human preferences vary notably across geographic location and that current metrics do not fully account for this diversity. For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative. In addition, the utility of automatic evaluations is dependent on assumptions about their set-up, such as the alignment of feature extractors with human perception of object similarity or the definition of "appeal" captured in reference datasets used to ground evaluations. We recommend steps for improved automatic and human evaluations.

2024-06-04

The 2024 ACM Conference on Fairness, Accountability, and Transparency (published)

doi.org

arxiv.org

Visibility into AI Agents

Alan Chan

Carson Ezell

Max Kaufmann

Kevin Wei

Lewis Hammond

Herbie Bradley

Emma Bluemke

Nitarshan Rajkumar

David Krueger

Noam Kolt

Lennart Heim

Markus Anderljung

Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ensuring accountability of key stakeholders. Information about where, why, how, and by whom certain AI agents are used, which we refer to as visibility, is critical to these objectives. In this paper, we assess three categories of measures to increase visibility into AI agents: agent identifiers, real-time monitoring, and activity logging. For each, we outline potential implementations that vary in intrusiveness and informativeness. We analyze how the measures apply across a spectrum of centralized through decentralized deployment contexts, accounting for various actors in the supply chain including hardware and software service providers. Finally, we discuss the implications of our measures for privacy and concentration of power. Further work into understanding the measures and mitigating their negative impacts can help to build a foundation for the governance of AI agents.

2024-06-04

The 2024 ACM Conference on Fairness, Accountability, and Transparency (published)

doi.org

arxiv.org

Milnor-Myerson Games and The Principles of Artificial Principal-Agent Problems

Manfred Diaz

Joel Z Leibo

Liam Paull

In this paper, we introduce Milnor-Myerson games, a multiplayer interaction structure at the core of machine learning (ML), to shed light on the fundamental principles and implications the artificial principal-agent problem has had in landmark ML results like AlphaGo and large language models (LLMs).

2024-06-03

rl-conference.cc/RLC/2024/Workshop/Finding_the_Frame (poster)

openreview.net

From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation

Geraldin Nanfack

Michael Eickenberg

Eugene Belilovsky

Understanding the inner working functionality of large-scale deep neural networks is challenging yet crucial in several high-stakes applications. Mechanistic interpretability is an emergent field that tackles this challenge, often by identifying human-understandable subgraphs in deep neural networks known as circuits. In vision-pretrained models, these subgraphs are usually interpreted by visualizing their node features through a popular technique called feature visualization. Recent works have analyzed the stability of different feature visualization types under the adversarial model manipulation framework. This paper starts by addressing limitations in existing works by proposing a novel attack called ProxPulse that simultaneously manipulates the two types of feature visualizations. Surprisingly, when analyzing these attacks under the umbrella of visual circuits, we find that visual circuits show some robustness to ProxPulse. We, therefore, introduce a new attack based on ProxPulse that unveils the manipulability of visual circuits, shedding light on their lack of robustness. The effectiveness of these attacks is validated using pre-trained AlexNet and ResNet-50 models on ImageNet.

2024-06-02

ArXiv (preprint)

doi.org

arxiv.org

MOSEAC: Streamlined Variable Time Step Reinforcement Learning

Yong Wang

Giovanni Beltrame

2024-06-02

ArXiv (preprint)

doi.org

arxiv.org

Political Dynasties in Canada

Alex B. Rivard

Jean-François Godbout

Marc André Bodet

Using a unique dataset of legislators' electoral and biographical data in the Canadian provinces of Ontario, Quebec, New Brunswick, Nova Scotia and the federal parliament, this article analyses the extent to which family dynasties affected the career development of legislators since the mid-18th century. We find that the prevalence of dynasties was higher in provincial legislatures than it was in the federal parliament, that the number of dynasties in the Senate increased until the mid-20th century, and that the proportion of dynastic legislators at the subnational level was similar to the numbers seen in the United Kingdom during the early 19th century. Our results confirm the existence of a clear career benefit in terms of cabinet and senate appointments. In contrast to the American case and in line with the United Kingdom experience, we find no causal relationship between a legislator's tenure length and the presence of a dynasty.

2024-06-02

Government and Opposition (published)

doi.org

AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages

Jiayi Wang

David Ifeoluwa Adelani

Sweta Agrawal

Marek Masiak

Ricardo Rei

Eleftheria Briakou

Marine Carpuat

Xuanli He

Sofia Bourhim

Andiswa Bukula

Muhidin A. Mohamed

Temitayo Olatoye

Tosin Adewumi

Hamam Mokayed

Christine Mwase

Wangui Kimotho

Foutse Yuehgoh

Aremu Anuoluwapo

Jessica Ojo

Shamsuddeen Hassan Muhammad

Salomey Osei

Abdul-Hakeem Omotayo

Chiamaka Ijeoma Chukwuneke

Perez Ogayo

Oumaima Hourrane

Salma El Anigri

Lolwethu Ndolela

Thabiso Mangwana

Shafie Abdi Mohamed

Hassan Ayinde

Ayinde Hassan

Oluwabusayo Olufunke Awoyomi

Lama Alkhaled

Sana Sabah Al-Azzawi

Naome Etori

Millicent Ochieng

Clemencia Siro

Samuel Njoroge

Njoroge Kiragu

Eric Muchiri

Wangari Kimotho

Lyse Naomi Wamba

Daud Abolade

Simbiat Ajao

Iyanuoluwa Shode

Ricky Macharm

Ruqayya Nasir Iro

Saheed Salahudeen Abdullahi

Stephen Moore

Bernard Opoku

Zainab Akinjobi

Abeeb Afolabi

Nnaemeka Casmir Obiefuna

Onyekachi Ogbu

Sam Brian

Sam Ochieng’

Verrah Akinyi Otiende

Chinedu Emmanuel Mbonu

Toadoum Sari Sakayo

Yao Lu

Pontus Stenetorp

Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).

2024-05-31

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) (published)

doi.org

arxiv.org

Better entity matching with transformers through ensembles

Jwen Fai Low

Benjamin C. M. Fung

Pulei Xiong

2024-05-31

Knowledge-Based Systems (published)

doi.org

Efficient Evolutionary Search Over Chemical Space with Large Language Models

Haorui Wang

Marta Skreta

Cher Tian Ser

Wenhao Gao

Lingkai Kong

Felix Strieth-Kalthoff

Chenru Duan

Yuchen Zhuang

Yue Yu

Yanqiao Zhu

Yuanqi Du

Alán Aspuru-Guzik

Kirill Neklyudov

Chao Zhang

Molecular discovery, when formulated as an optimization problem, presents significant computational challenges because optimization objectives can be non-differentiable. Evolutionary Algorithms (EAs), often used to optimize black-box objectives in molecular discovery, traverse chemical space by performing random mutations and crossovers, leading to a large number of expensive objective evaluations. In this work, we ameliorate this shortcoming by incorporating chemistry-aware Large Language Models (LLMs) into EAs. Namely, we redesign crossover and mutation operations in EAs using LLMs trained on large corpora of chemical information. We perform extensive empirical studies on both commercial and open-source models on multiple tasks involving property optimization, molecular rediscovery, and structure-based drug design, demonstrating that the joint usage of LLMs with EAs yields superior performance over all baseline models across single- and multi-objective settings. We demonstrate that our algorithm improves both the quality of the final solution and convergence speed, thereby reducing the number of required objective evaluations. Our code is available at http://github.com/zoom-wang112358/MOLLEO

2024-05-31

arXiv (published)

doi.org

arxiv.org
