Negar Rostamzadeh

Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms

Renee Shelby

Over the past decade, an ecosystem of measures has emerged to evaluate the social and ethical implications of AI systems, largely shaped by high-level ethics principles. These measures are developed and used in fragmented ways, without adequate attention to how they are situated in AI systems. In this paper, we examine how existing measures used in the computing literature map to AI system components, attributes, hazards, and harms. Our analysis draws on a scoping review resulting in nearly 800 measures corresponding to 11 AI ethics principles. We find that most measures focus on four principles – fairness, transparency, privacy, and trust – and primarily assess model or output system components. Few measures account for interactions across system elements, and only a narrow set of hazards is typically considered for each harm type. Many measures are disconnected from where harm is experienced and lack guidance for setting meaningful thresholds. These patterns reveal how current evaluation practices remain fragmented, measuring in pieces rather than capturing how harms emerge across systems. Framing measures with respect to system attributes, hazards, and harms can strengthen regulatory oversight, support actionable practices in industry, and ground future research in systems-level understanding.

2025-10-15

Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (published)

doi.org

arxiv.org

Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms

Renee Shelby

2025-10-11

ArXiv (preprint)

doi.org

arxiv.org

Bias-inducing geometries: an exactly solvable data model with fairness implications

Stefano Sarao Mannelli

Federica Gerace

Negar Rostamzadeh

Luca Saglietti

2025-08-07

Physical Review E (published)

doi.org

arxiv.org

Unlearning Geo-Cultural Stereotypes in Multilingual LLMs

Alireza Dehghanpour Farashah

Aditi Khandelwal

Negar Rostamzadeh

Golnoosh Farnadi

As multilingual generative models become more widely used, most safety and fairness evaluation techniques still focus on English-language resources, while overlooking important cross-cultural factors. This limitation raises concerns about fairness and safety, particularly regarding geoculturally situated stereotypes that hinder the models’ global inclusivity. In this work, we present preliminary findings on the impact of stereotype unlearning across languages, specifically in English, French, and Hindi. Using an adapted version of the SeeGULL dataset, we analyze how unlearning stereotypes in one language influences other languages within multilingual large language models. Our study evaluates two model families, Llama-3.1-8B and Aya-Expanse-8B, to assess whether unlearning in one linguistic context transfers across languages, potentially mitigating or exacerbating biases in multilingual settings.

2025-03-05

ICLR.cc/2025/Workshop/BuildingTrust (accepted)

openreview.net

What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models

Ahmed Imtiaz Humayun

Ibtihel Amara

Cristina Nader Vasconcelos

Deepak Ramachandran

Candice Schumann

Junfeng He

Katherine A Heller

Golnoosh Farnadi

Negar Rostamzadeh

Mohammad Havaei

Deep Generative Models are frequently used to learn continuous representations of complex data distributions using a finite number of samples. For any generative model, including pre-trained foundation models with GAN, Transformer or Diffusion architectures, generation performance can vary significantly based on which part of the learned data manifold is sampled. In this paper we study the post-training local geometry of the learned manifold and its relationship to generation outcomes for models ranging from toy settings to the latent decoder of the near state-of-the-art Stable Diffusion 1.4 Text-to-Image model. Building on the theory of continuous piecewise-linear (CPWL) generators, we characterize the local geometry in terms of three geometric descriptors - scaling (

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

Nteasee: Understanding Needs in AI for Health in Africa -- A Mixed-Methods Study of Expert and General Population Perspectives

Mercy Nyamewaa Asiedu

Iskandar Haykel

Awa Dieng

K. Kauer

Tousif Ahmed

Florence Ofori

Charisma Chan

Stephen R. Pfohl

Negar Rostamzadeh

Katherine Heller

Artificial Intelligence (AI) for health has the potential to significantly change and improve healthcare. However, in most African countries, identifying culturally and contextually attuned approaches for deploying these solutions is not well understood. To bridge this gap, we conduct a qualitative study to investigate the best practices, fairness indicators, and potential biases to mitigate when deploying AI for health in African countries, as well as explore opportunities where artificial intelligence could make a positive impact in health. We used a mixed methods approach combining in-depth interviews (IDIs) and surveys. We conduct 1.5-2 hour long IDIs with 50 experts in health, policy, and AI across 17 countries, and through an inductive approach we conduct a qualitative thematic analysis on expert IDI responses. We administer a blinded 30-minute survey with case studies to 672 general population participants across 5 countries in Africa and analyze responses on quantitative scales, statistically comparing responses by country, age, gender, and level of familiarity with AI. We thematically summarize open-ended responses from surveys. Our results find generally positive attitudes, high levels of trust, accompanied by moderate levels of concern among general population participants for AI usage for health in Africa. This contrasts with expert responses, where major themes revolved around trust/mistrust, ethical concerns, and systemic barriers to integration, among others. This work presents the first-of-its-kind qualitative research study of the potential of AI for health in Africa from an algorithmic fairness angle, with perspectives from both experts and the general population. We hope that this work guides policymakers and drives home the need for further research and the inclusion of general population perspectives in decision-making around AI usage.

2025-01-01

FAccT (published)

doi.org

arxiv.org

The Case for Globalizing Fairness: A Mixed Methods Study on Colonialism, AI, and Health in Africa

Mercy Nyamewaa Asiedu

Awa Dieng

Iskandar Haykel

Negar Rostamzadeh

Stephen R. Pfohl

Chirag Nagpal

Maria Nagawa

Abigail Oppong

Sanmi Koyejo

Katherine Heller

With growing application of machine learning (ML) technologies in healthcare, there have been calls for developing techniques to understand and mitigate biases these systems may exhibit. Fairness considerations in the development of ML-based solutions for health have particular implications for Africa, which already faces inequitable power imbalances between the Global North and South. This paper seeks to explore fairness for global health, with Africa as a case study. We conduct a scoping review to propose axes of disparities for fairness consideration in the African context and delineate where they may come into play in different ML-enabled medical modalities. We then conduct qualitative research studies with 672 general population study participants and 28 experts in ML, health, and policy focused on Africa to obtain corroborative evidence on the proposed axes of disparities. Our analysis focuses on colonialism as the attribute of interest and examines the interplay between artificial intelligence (AI), health, and colonialism. Among the pre-identified attributes, we found that colonial history, country of origin, and national income level were specific axes of disparities that participants believed would cause an AI system to be biased. However, there was also divergence of opinion between experts and general population participants. Whereas experts generally expressed a shared view about the relevance of colonial history for the development and implementation of AI technologies in Africa, the majority of the general population participants surveyed did not think there was a direct link between AI and colonialism. Based on these findings, we provide practical recommendations for developing fairness-aware ML solutions for health in Africa.

2024-10-29

Proceedings of the 4th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (published)

doi.org

arxiv.org

A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models

Stephen R. Pfohl

Heather Cole-Lewis

Rory Sayres

Darlene Neal

Mercy Nyamewaa Asiedu

Awa Dieng

Nenad Tomasev

Qazi Mamunur Rashid

Shekoofeh Azizi

Negar Rostamzadeh

Liam G. McCoy

Leo Anthony Celi

Yun Liu

Mike Schaekermann

Alanna Walton

Alicia Parrish

Chirag Nagpal

Preeti Singh

Akeiylah Dewitt

Philip Mansfield

Sushant Prakash

Katherine Heller

Alan Karthikesalingam

Christopher Semturs

Joelle Barral

Greg Corrado

Yossi Matias

Jamila Smith-Loud

Ivor Horn

Karan Singhal

2024-09-23

Nature Medicine (published)

doi.org

arxiv.org

Nteasee: A mixed methods study of expert and general population perspectives on deploying AI for health in African countries

Mercy Nyamewaa Asiedu

Iskandar Haykel

Awa Dieng

K. Kauer

Tousif Ahmed

Florence Ofori

Charisma Chan

Stephen R. Pfohl

Negar Rostamzadeh

Katherine Heller

2024-09-04

ArXiv (preprint)

doi.org

arxiv.org

Nteasee: Understanding Needs in AI for Health in Africa -- A Mixed-Methods Study of Expert and General Population Perspectives

Mercy Nyamewaa Asiedu

Iskandar Haykel

Awa Dieng

K. Kauer

Tousif Ahmed

Florence Ofori

Charisma Chan

Stephen R. Pfohl

Negar Rostamzadeh

Katherine Heller

2024-09-04

ArXiv (preprint)

doi.org

arxiv.org
