Publications

An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software

Aaditya Bhatia

Bram Adams

Ahmed E. Hassan

The emergence of open-source ML libraries such as TensorFlow and Google Auto ML has enabled developers to harness state-of-the-art ML algorithms with minimal overhead. However, during this accelerated ML development process, said developers may often make sub-optimal design and implementation decisions, leading to the introduction of technical debt that, if not addressed promptly, can have a significant impact on the quality of the ML-based software. Developers frequently acknowledge these sub-optimal design and development choices through code comments during software development. These comments, which often highlight areas requiring additional work or refinement in the future, are known as self-admitted technical debt (SATD). This paper aims to investigate SATD in ML code by analyzing 318 open-source ML projects across five domains, along with 318 non-ML projects. We detected SATD in source code comments throughout the different project snapshots, conducted a manual analysis of the identified SATD sample to comprehend the nature of technical debt in the ML code, and performed a survival analysis of the SATD to understand the evolution of such debts. We observed: i) Machine learning projects have a median percentage of SATD that is twice the median percentage of SATD in non-machine learning projects. ii) ML pipeline components for data preprocessing and model generation logic are more susceptible to debt than model validation and deployment components. iii) SATDs appear in ML projects earlier in the development process compared to non-ML projects. iv) Long-lasting SATDs are typically introduced during extensive code changes that span multiple files exhibiting low complexity.

2023-11-20

ArXiv (preprint)

doi.org

arxiv.org

Responsible AI Research Needs Impact Statements Too

Alexandra Olteanu

Michael Ekstrand

Carlos Castillo

Jina Suh

All types of research, development, and policy work can have unintended, adverse consequences - work in responsible artificial intelligence (RAI), ethical AI, or ethics in AI is no exception.

2023-11-20

ArXiv (preprint)

doi.org

arxiv.org

Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges

Massimo Caccia

Jonas Mueller

Taesup Kim

Laurent Charlin

Rasool Fakoor

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

proceedings.mlr.press

openreview.net

Tensor-based Space Debris Detection for Satellite Mega-constellations

Olivier Daoust

Hasan Nayir

Irfan Azam

Antoine Lesage-Landry

G. Kurt

2023-11-20

ArXiv (preprint)

arxiv.org

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

Hadi Nekoei

Xutong Zhao

Janarthanan Rajendran

Miao Liu

Sarath Chandar Anbil Parthipan

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

doi.org

arxiv.org

Inferring dynamic regulatory interaction graphs from time series data with perturbations

Dhananjay Bhaskar

Daniel Sumner Magruder

Edward De Brouwer

Matheo Morales

Aarthi Venkat

Frederik Wenkel

Guy Wolf

Smita Krishnaswamy

2023-11-18

logconference.io/LOG/2023/Conference (poster)

doi.org

openreview.net

MUDiff: Unified Diffusion for Complete Molecule Generation

Chenqing Hua

Sitao Luan

Minkai Xu

Zhitao Ying

Rex Ying

Jie Fu

Stefano Ermon

Doina Precup

2023-11-18

logconference.io/LOG/2023/Conference (poster)

doi.org

openreview.net

The evidence mismatch in pediatric surgical practice

Marina Broomfield

Zena Agabani

Elena Guadagno

Dan Poenaru

Robert Baird

2023-11-18

Pediatric surgery international (Print) (published)

doi.org

Differentiable visual computing for inverse problems and machine learning

Andrew Spielberg

Fangcheng Zhong

Konstantinos Rematas

Krishna Murthy

Cengiz Oztireli

Tzu-Mao Li

Derek Nowrouzezahrai

2023-11-17

Nature Machine Intelligence (published)

doi.org

arxiv.org

AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages

Jiayi Wang

David Ifeoluwa Adelani

Sweta Agrawal

Marek Masiak

Ricardo Rei

Eleftheria Briakou

Marine Carpuat

Xuanli He

Sofia Bourhim

Andiswa Bukula

Muhidin A. Mohamed

Temitayo Olatoye

Tosin Adewumi

Hamam Mokayede

Christine Mwase

Wangui Kimotho

Foutse Yuehgoh

Aremu Anuoluwapo

Jessica Ojo

Shamsuddeen Hassan Muhammad

Salomey Osei

Abdul-Hakeem Omotayo

Chiamaka Ijeoma Chukwuneke

Perez Ogayo

Oumaima Hourrane

Salma El Anigri

Lolwethu Ndolela

Thabiso Mangwana

Shafie Abdi Mohamed

Ayinde Hassan

Oluwabusayo Olufunke Awoyomi

Lama Alkhaled

Sana Sabah al-azzawi

Naome A. Etori

Millicent A. Ochieng

Clemencia Siro

Samuel Njoroge

Eric Muchiri

Wangari Kimotho

Lyse Naomi Wamba Momo

Daud Abolade

Simbiat Ajao

Iyanuoluwa Shode

Ricky Macharm

Ruqayya Nasir Iro

Saheed Salahudeen Abdullahi

Stephen E. Moore

Bernard Opoku

Zainab Akinjobi

Abeeb Afolabi

Nnaemeka Casmir Obiefuna

Onyekachi Ogbu

Sam Brian

Verrah Akinyi Otiende

Chinedu Emmanuel Mbonu

Toadoum Sari Sakayo

Yao Lu

Pontus Stenetorp

Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).

2023-11-16

ArXiv (preprint)

arxiv.org

Evaluating In-Context Learning of Libraries for Code Generation

Arkil Patel

Siva Reddy

Dzmitry Bahdanau

Pradeep Dasigi

2023-11-16

ArXiv (preprint)

doi.org

arxiv.org

Generalizable Imitation Learning Through Pre-Trained Representations

Wei-Di Chang

Francois R. Hogan

David Meger

Gregory Dudek

In this paper we leverage self-supervised vision transformer models and their emergent semantic abilities to improve the generalization abilities of imitation learning policies. We introduce BC-ViT, an imitation learning algorithm that leverages rich DINO pre-trained Visual Transformer (ViT) patch-level embeddings to obtain better generalization when learning through demonstrations. Our learner sees the world by clustering appearance features into semantic concepts, forming stable keypoints that generalize across a wide range of appearance variations and object types. We show that this representation enables generalized behaviour by evaluating imitation learning across a diverse dataset of object manipulation tasks. Our method, data and evaluation approach are made available to facilitate further study of generalization in Imitation Learners.

2023-11-15

ArXiv (preprint)

doi.org

arxiv.org
