Publications

Adjusting Machine Learning Decisions for Equal Opportunity and Counterfactual Fairness

Yixin Wang

David Blei

Machine learning (ML) methods have the potential to automate high-stakes decisions, such as bail admissions or credit lending, by analyzing and learning from historical data. But these algorithmic decisions may be unfair: in learning from historical data, they may replicate discriminatory practices from the past. In this paper, we propose two algorithms that adjust fitted ML predictors to produce decisions that are fair. Our methods provide post-hoc adjustments to the predictors, without requiring that they be retrained. We consider a causal model of the ML decisions, define fairness through counterfactual decisions within the model, and then form algorithmic decisions that capture the historical data as well as possible, but are provably fair. In particular, we consider two definitions of fairness. The first is “equal counterfactual opportunity,” where the counterfactual distribution of the decision is the same regardless of the protected attribute; the second is counterfactual fairness. We evaluate the algorithms, and the trade-off between accuracy and fairness, on datasets about admissions, income, credit, and recidivism.

2023-01-01

Trans. Mach. Learn. Res. (published)

openreview.net

AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages

Jiayi Wang

David Ifeoluwa Adelani

Sweta Agrawal

Ricardo Rei

Eleftheria Briakou

Marine Carpuat

Marek Masiak

Xuanli He

Sofia Bourhim

Andiswa Bukula

Muhidin A. Mohamed

Temitayo Olatoye

Hamam Mokayed

Christine Mwase

Wangui Kimotho

Foutse Yuehgoh

Anuoluwapo Aremu

Jessica Ojo

Shamsuddeen Hassan Muhammad

Salomey Osei

Abdul-Hakeem Omotayo

Chiamaka Chukwuneke

Perez Ogayo

Oumaima Hourrane

Salma El Anigri

Lolwethu Ndolela

Thabiso Mangwana

Shafie Abdi Mohamed

Ayinde Hassan

Oluwabusayo Olufunke Awoyomi

Lama Alkhaled

sana Sabah al-azzawi

Naome A. Etori

Millicent A. Ochieng

Clemencia Siro

Samuel Njoroge

Eric Muchiri

Wangari Kimotho

Lyse Naomi Wamba Momo

Daud Abolade

Simbiat Ajao

Tosin P. Adewumi

Iyanuoluwa Shode

Ricky Macharm

Ruqayya Nasir Iro

Saheed Salahudeen Abdullahi

Stephen E. Moore

Bernard Opoku

Zainab Akinjobi

Abeeb Afolabi

Nnaemeka Casmir Obiefuna

Onyekachi Ogbu

Sam Brian

V. Otiende

CHINEDU EMMANUEL MBONU

Toadoum Sari Sakayo

Pontus Stenetorp

Despite the progress we have recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to measure accurately the progress we have made on these languages because evaluation is often performed on n-gram matching metrics like BLEU that often have worse correlation with human judgments. Embedding-based metrics such as COMET correlate better; however, lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta) to create the state-of-the-art evaluation metric for African languages MT with respect to Spearman-rank correlation with human judgments (+0.406).

2023-01-01

arXiv.org (preprint)

doi.org

AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages

Jiayi Wang

David Ifeoluwa Adelani

Sweta Agrawal

Ricardo Rei

Eleftheria Briakou

Marine Carpuat

Marek Masiak

Xuanli He

Sofia Bourhim

Andiswa Bukula

Muhidin A. Mohamed

Temitayo Olatoye

Hamam Mokayed

Christine Mwase

Wangui Kimotho

Foutse Yuehgoh

Aremu Anuoluwapo

Jessica Ojo

Shamsuddeen Hassan Muhammad

Salomey Osei

Abdul-Hakeem Omotayo

Chiamaka Ijeoma Chukwuneke

Perez Ogayo

Oumaima Hourrane

Salma El Anigri

Lolwethu Ndolela

Thabiso Mangwana

Shafie Abdi Mohamed

Ayinde Hassan

Oluwabusayo Olufunke Awoyomi

Lama Alkhaled

sana Sabah al-azzawi

Naome Etori

Millicent Ochieng

Clemencia Siro

Samuel Njoroge

Eric Muchiri

Wangari Kimotho

Lyse Naomi Wamba

Daud Abolade

Simbiat Ajao

Tosin Adewumi

Iyanuoluwa Shode

Ricky Macharm

Ruqayya Nasir Iro

Saheed Salahudeen Abdullahi

Stephen Moore

Bernard Opoku

Zainab Akinjobi

Abeeb Afolabi

Nnaemeka Casmir Obiefuna

Onyekachi Ogbu

Sam Brian

Verrah Akinyi Otiende

CHINEDU EMMANUEL MBONU

Toadoum Sari Sakayo

Pontus Stenetorp

Despite the progress we have recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to measure accurately the progress we have made on these languages because evaluation is often performed on n-gram matching metrics like BLEU that often have worse correlation with human judgments. Embedding-based metrics such as COMET correlate better; however, lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta) to create the state-of-the-art evaluation metric for African languages MT with respect to Spearman-rank correlation with human judgments (+0.406).

2023-01-01

arXiv.org (preprint)

doi.org

AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages

Shamsuddeen Hassan Muhammad

Idris Abdulmumin

Abinew Ayele

Nedjma OUSIDHOUM

David Ifeoluwa Adelani

Seid Muhie Yimam

Ibrahim Ahmad

Meriem Beloucif

Saif Mohammad

Sebastian Ruder

Oumaima Hourrane

Alipio Jorge

Pavel Brazdil

Felermino Ali

Davis David

Salomey Osei

Bello Shehu-Bello

Falalu Lawan

Tajuddeen Gwadabe

Samuel Rutunda

Tadesse Belay

Wendimu Baye Messelle

Hailu Balcha

Sisay Adugna Chala

Hagos Gebremichael

Bernard Opoku

Stephen Arthur

2023-01-01

EMNLP (published)

doi.org

openreview.net

AI Agents Learn to Trust

Ardavan S. Nobandegani

Irina Rish

T. Shultz

2023-01-01

Annual Meeting of the Cognitive Science Society (published)

dblp.uni-trier.de

AmbieGen: A Search-based Framework for Autonomous Systems Testing

Dmytro Humeniuk

Foutse Khomh

Giuliano Antoniol

2023-01-01

Sci. Comput. Program. (published)

doi.org

arxiv.org

ArK: Augmented Reality with Knowledge Emergent Infrastructure

Qiuyuan Huang

J. Park

Abhinav Gupta

Pan Lu

Paul N. Bennett

Ran Gong

Subhojit Som

Baolin Peng

Owais Khan Mohammed

Chris Pal

Yejin Choi

Jianfeng Gao

Despite the growing adoption of mixed reality and interactive AI, it remains challenging to generate high-quality 2D/3D scenes in unseen environments. Typically, an AI agent requires collecting extensive training data for every new task, which can be costly or impossible for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g., GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in physical or virtual worlds. Central to our approach is the interactive emerging mechanism, dubbed Augmented Reality with Knowledge Emergent Infrastructure (ArK), which leverages knowledge-memory to generate scenes in unseen physical worlds and virtual reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated through i) micro-action of cross-modality: in multi-modality models to collect a large amount of relevant knowledge-memory data for each interaction task (e.g., unseen scene understanding) from the physical reality; and ii) macro-behavior of reality-agnostic: in mix-reality environments to improve interactions that tailor to different characterized roles, target variables, collaborative information, and so on. We validate ArK’s effectiveness in scene generation and editing tasks and show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes, highlighting its potential in applications such as metaverse and gaming simulation.

Augmenting Transit Network Design Algorithms with Deep Learning

Andrew Holliday

Gregory Dudek

This paper considers the use of deep learning models to enhance optimization algorithms for transit network design. Transit network design is the problem of determining routes for transit vehicles that minimize travel time and operating costs, while achieving full service coverage. State-of-the-art meta-heuristic search algorithms give good results on this problem, but can be very time-consuming. In contrast, neural networks can learn sub-optimal but fast-to-compute heuristics based on large amounts of data. Combining these approaches, we develop a fast graph neural network model for transit planning, and use it to initialize state-of-the-art search algorithms. We show that this combination can improve the results of these algorithms on a variety of metrics by up to 17%, without increasing their run time; or they can match the quality of the original algorithms while reducing the computing time by up to a factor of 50.

2023-01-01

2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC) (published)

doi.org

Auxiliary Losses for Learning Generalizable Concept-based Models

Ivaxi Sheth

Samira Ebrahimi Kahou

openreview.net

Bayes-MIL: A New Probabilistic Perspective on Attention-based Multiple Instance Learning for Whole Slide Images

Yufei Cui

Ziquan Liu

Xiangyu Liu

Xue (Steve) Liu

Cong Wang

Tei-Wei Kuo

Chun Jason Xue

Antoni Bert Chan

Multiple instance learning (MIL) is a popular weakly-supervised learning model on the whole slide image (WSI) for AI-assisted pathology diagnosis. The recent advance in attention-based MIL allows the model to find its region-of-interest (ROI) for interpretation by learning the attention weights for image patches of WSI slides. However, we empirically find that the interpretability of some related methods is either untrustworthy as the principle of MIL is violated or unsatisfactory as the high-attention regions are not consistent with experts’ annotations. In this paper, we propose Bayes-MIL to address the problem from a probabilistic perspective. The induced patch-level uncertainty is proposed as a new measure of MIL interpretability, which outperforms previous methods in matching doctors’ annotations. We design a slide-dependent patch regularizer (SDPR) for the attention, imposing constraints derived from the MIL assumption on the attention distribution. SDPR explicitly constrains the model to generate correct attention values. The spatial information is further encoded by an approximate convolutional conditional random field (CRF), for better interpretability. Experimental results show Bayes-MIL outperforms the related methods in patch-level and slide-level metrics and provides much better interpretable ROI on several large-scale WSI datasets.

2023-01-01

International Conference on Learning Representations (published)

dblp.uni-trier.de

Bayes-MIL: A New Probabilistic Perspective on Attention-based Multiple Instance Learning for Whole Slide Images

Yufei Cui

Ziquan Liu

Xiangyu Liu

Xue (Steve) Liu

Cong Wang

Tei-Wei Kuo

Chun Jason Xue

Antoni B. Chan

Multiple instance learning (MIL) is a popular weakly-supervised learning model on the whole slide image (WSI) for AI-assisted pathology diagnosis. The recent advance in attention-based MIL allows the model to find its region-of-interest (ROI) for interpretation by learning the attention weights for image patches of WSI slides. However, we empirically find that the interpretability of some related methods is either untrustworthy as the principle of MIL is violated or unsatisfactory as the high-attention regions are not consistent with experts’ annotations. In this paper, we propose Bayes-MIL to address the problem from a probabilistic perspective. The induced patch-level uncertainty is proposed as a new measure of MIL interpretability, which outperforms previous methods in matching doctors’ annotations. We design a slide-dependent patch regularizer (SDPR) for the attention, imposing constraints derived from the MIL assumption on the attention distribution. SDPR explicitly constrains the model to generate correct attention values. The spatial information is further encoded by an approximate convolutional conditional random field (CRF), for better interpretability. Experimental results show Bayes-MIL outperforms the related methods in patch-level and slide-level metrics and provides much better interpretable ROI on several large-scale WSI datasets.

2023-01-01

ICLR (published)

dblp.uni-trier.de

Benchmarking Graph Neural Networks

Vijay Prakash Dwivedi

Chaitanya K. Joshi

Thomas Laurent

Yoshua Bengio

Xavier Bresson

Graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. As the field grows, it becomes critical to identify key architectures and validate new ideas that generalize to larger, more complex datasets. Unfortunately, it has been increasingly difficult to gauge the effectiveness of new models in the absence of a standardized benchmark with consistent experimental settings. In this paper, we introduce a reproducible GNN benchmarking framework, with the facility for researchers to add new models conveniently for arbitrary datasets. We demonstrate the usefulness of our framework by presenting a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph convolutional networks (GCNs) for a variety of graph tasks, i.e. graph regression/classification and node/link prediction, with medium-scale datasets.

2023-01-01

ArXiv (preprint)

arxiv.org
