Gaps Between Research and Practice When Measuring Representational Harms Caused by LLM-Based Systems
Emma Harvey
Emily Sheng
Su Lin Blodgett
Alexandra Chouldechova
Jean Garcia-Gathright
Hanna Wallach
To facilitate the measurement of representational harms caused by large language model (LLM)-based systems, the NLP research community has produced and made publicly available numerous measurement instruments, including tools, datasets, metrics, benchmarks, annotation instructions, and other techniques. However, the research community lacks clarity about whether and to what extent these instruments meet the needs of practitioners tasked with developing and deploying LLM-based systems in the real world, and how these instruments could be improved. Via a series of semi-structured interviews with practitioners in a variety of roles in different organizations, we identify four types of challenges that prevent practitioners from effectively using publicly available instruments for measuring representational harms caused by LLM-based systems: (1) challenges related to using publicly available measurement instruments; (2) challenges related to doing measurement in practice; (3) challenges arising from measurement tasks involving LLM-based systems; and (4) challenges specific to measuring representational harms. Our goal is to advance the development of instruments for measuring representational harms that are well-suited to practitioner needs, thus better facilitating the responsible development and deployment of LLM-based systems.
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat
Melissa Hall
Alicia Sun
Candace Ross
Michal Drozdzal
"It was 80% me, 20% AI": Seeking Authenticity in Co-Writing with Large Language Models
Angel Hsing-Chi Hwang
Q. V. Liao
Su Lin Blodgett
Adam Trischler
Given the rising proliferation and diversity of AI writing assistance tools, especially those powered by large language models (LLMs), both writers and readers may have concerns about the impact of these tools on the authenticity of writing work. We examine whether and how writers want to preserve their authentic voice when co-writing with AI tools and whether personalization of AI writing support could help achieve this goal. We conducted semi-structured interviews with 19 professional writers, during which they co-wrote with both personalized and non-personalized AI writing-support tools. We supplemented writers' perspectives with opinions from 30 avid readers about the written work co-produced with AI collected through an online survey. Our findings illuminate conceptions of authenticity in human-AI co-creation, which focus more on the process and experience of constructing creators' authentic selves. While writers reacted positively to personalized AI writing tools, they believed the form of personalization needs to target writers' growth and go beyond the phase of text production. Overall, readers' responses showed less concern about human-AI co-writing. Readers could not distinguish AI-assisted work, personalized or not, from writers' solo-written work and showed positive attitudes toward writers experimenting with new technology for creative writing.
Effectiveness of primary repair for low anorectal malformations in Uganda.
Felix Oyania
Sarah Ullrich
Zane Hellmann
Caroline Q. Stephens
Meera Kotagal
Sarah Jane Commander
Amy M. Shui
Martin Situma
Charles Newton Odongo
Olivia Kituuka
Francis Bajunirwe
Doruk Ozgediz
Exploring the Manifold of Neural Networks Using Diffusion Geometry
Elliott Abel
Peyton Crevasse
Yvan Grinspan
Selma Mazioud
Folu Ogundipe
Kristof Reimann
Ellie Schueler
Andrew J. Steindl
Ellen Zhang
Dhananjay Bhaskar
Siddharth Viswanath
Yanlei Zhang
Tim G. J. Rudner
Ian Adelstein
Drawing motivation from the manifold hypothesis, which posits that most high-dimensional data lies on or near low-dimensional manifolds, we apply manifold learning to the space of neural networks. We learn manifolds where datapoints are neural networks by introducing a distance between the hidden layer representations of the neural networks. These distances are then fed to the non-linear dimensionality reduction algorithm PHATE to create a manifold of neural networks. We characterize this manifold using features of the representation, including class separation, hierarchical cluster structure, spectral entropy, and topological structure. Our analysis reveals that high-performing networks cluster together in the manifold, displaying consistent embedding patterns across all these features. Finally, we demonstrate the utility of this approach for guiding hyperparameter optimization and neural architecture search by sampling from the manifold.
Sketch-guided Cage-based 3D Gaussian Splatting Deformation
Tianhao Xie
Tiberiu Popa
3D Gaussian Splatting (GS) is one of the most promising novel 3D representations that has received great interest in computer graphics and computer vision. While various systems have introduced editing capabilities for 3D GS, such as those guided by text prompts, fine-grained control over deformation remains an open challenge. In this work, we present a novel sketch-guided 3D GS deformation system that allows users to intuitively modify the geometry of a 3D GS model by drawing a silhouette sketch from a single viewpoint. Our approach introduces a new deformation method that combines cage-based deformations with a variant of Neural Jacobian Fields, enabling precise, fine-grained control. Additionally, it leverages large-scale 2D diffusion priors and ControlNet to ensure the generated deformations are semantically plausible. Through a series of experiments, we demonstrate the effectiveness of our method and showcase its ability to animate static 3D GS models as one of its key applications.