Prakhar Ganesh

Awa Dieng

Miriam Rateike

Jamelle Watson-Daniels

Jessica Schrouff

Sanmi Koyejo

AI has transitioned from predictive models to interactive, autonomous agents capable of reasoning, planning, and executing complex goals. As the systems increasingly influence social, economic, and scientific decisions, they determine whose interests are represented and whose opportunities are constrained. Ensuring fairness, therefore, is no longer an ethical preference but a practical imperative. As the fairness challenges are fundamentally transformed by advanced AI systems, traditional algorithmic fairness frameworks developed primarily for prediction and/or prediction-based decision-making no longer suffice. This workshop, _Algorithmic Fairness Across Alignment Procedures and Agentic Systems_ (AFAA), emerges at this pivotal moment as a timely forum for rethinking fairness in AI alignment processes and agentic system development. By examining fairness across alignment procedures and agentic systems, this workshop creates a crucial platform for bridging the gap between rapid technical advances in model capabilities and the equally important advances needed in frameworks of algorithmic fairness to govern these powerful systems.

2025-12-31

Workshop Proposals @ International Conference on Learning Representations (published)

openreview.net

Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework

Cléa Chataigner

Rebecca Ma

Afaf Taïk

Elliot Creager

2025-09-21

NeurIPS.cc/2025/Workshop/WiML (published)

openreview.net

Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML

Usman Gohar

Lu Cheng

With fairness concerns gaining significant attention in Machine Learning (ML), several bias mitigation techniques have been proposed, often compared against each other to find the best method. These benchmarking efforts tend to use a common setup for evaluation under the assumption that providing a uniform environment ensures a fair comparison. However, bias mitigation techniques are sensitive to hyperparameter choices, random seeds, feature selection, etc., meaning that comparison on just one setting can unfairly favour certain algorithms. In this work, we show significant variance in fairness achieved by several algorithms and the influence of the learning pipeline on fairness scores. We highlight that most bias mitigation techniques can achieve comparable performance, given the freedom to perform hyperparameter optimization, suggesting that the choice of the evaluation parameters, rather than the mitigation technique itself, can sometimes create the perceived superiority of one method over another. We hope our work encourages future research on how various choices in the lifecycle of developing an algorithm impact fairness, and trends that guide the selection of appropriate algorithms.

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)

proceedings.mlr.press

Rethinking Hallucinations: Correctness, Consistency, and Prompt Multiplicity

Reza Shokri

Large language models (LLMs) are known to "hallucinate" by generating false or misleading outputs. Hallucinations pose various harms, from erosion of trust to widespread misinformation. Existing hallucination evaluation, however, focuses only on "correctness" and often overlooks "consistency", necessary to distinguish and address these harms. To bridge this gap, we introduce _prompt multiplicity_, a framework for quantifying consistency through prompt sensitivity. Our analysis reveals significant multiplicity (over 50% inconsistency in benchmarks like Med-HALT), suggesting that hallucination-related harms have been severely underestimated. Furthermore, we study the role of consistency in hallucination detection and mitigation. We find that: (a) detection techniques capture consistency, not correctness, and (b) mitigation techniques like RAG can introduce additional inconsistencies. By integrating prompt multiplicity into hallucination evaluation, we provide an improved framework of potential harms and uncover critical limitations in current detection and mitigation strategies.

2025-03-04

ICLR.cc/2025/Workshop/BuildingTrust (accepted)

openreview.net

Systemizing Multiplicity: The Curious Case of Arbitrariness in Machine Learning

Afaf Taïk

Algorithmic modeling relies on limited information in data to extrapolate outcomes for unseen scenarios, often embedding an element of arbitrariness in its decisions. A perspective on this arbitrariness that has recently gained interest is multiplicity: the study of arbitrariness across a set of "good models", i.e., those likely to be deployed in practice. In this work, we systemize the literature on multiplicity by: (a) formalizing the terminology around model design choices and their contribution to arbitrariness, (b) expanding the definition of multiplicity to incorporate underrepresented forms beyond just predictions and explanations, (c) clarifying the distinction between multiplicity and other lenses of arbitrariness, i.e., uncertainty and variance, and (d) distilling the benefits and potential risks of multiplicity into overarching trends, situating it within the broader landscape of responsible AI. We conclude by identifying open research questions and highlighting emerging trends in this young but rapidly growing area of research.

2025-01-23

ArXiv (preprint)

arxiv.org

Towards More Realistic Extraction Attacks: An Adversarial Perspective

Yash More

Language models are prone to memorizing their training data, making them vulnerable to extraction attacks. While existing research often examines isolated setups, such as a single model or a fixed prompt, real-world adversaries have a considerably larger attack surface due to access to models across various sizes and checkpoints, and repeated prompting. In this paper, we revisit extraction attacks from an adversarial perspective -- with multi-faceted access to the underlying data. We find significant churn in extraction trends, i.e., even unintuitive changes to the prompt, or targeting smaller models and earlier checkpoints, can extract distinct information. By combining multiple attacks, our adversary doubles (

2024-12-31

Trans. Assoc. Comput. Linguistics (published)

arxiv.org

The Cost of Arbitrariness for Individuals: Examining the Legal and Technical Challenges of Model Multiplicity

Ihsan Ibrahim Daldaban

Ignacio Cofone

Model multiplicity, the phenomenon where multiple models achieve similar performance despite different underlying learned functions, introduces arbitrariness in model selection. While this arbitrariness may seem inconsequential in expectation, its impact on individuals can be severe. This paper explores various individual concerns stemming from multiplicity, including the effects of arbitrariness beyond final predictions, disparate arbitrariness for individuals belonging to protected groups, and the challenges associated with the arbitrariness of a single algorithmic system creating a monopoly across various contexts. It provides both an empirical examination of these concerns and a comprehensive analysis from the legal standpoint, addressing how these issues are perceived in the anti-discrimination law in Canada. We conclude the discussion with technical challenges in the current landscape of model multiplicity to meet legal requirements and the legal gap between current law and the implications of arbitrariness in model selection, highlighting relevant future research directions for both disciplines.

2024-05-27

ArXiv (preprint)