Position: Evaluating Generative AI Systems is a Social Science Measurement Challenge
Hanna Wallach
Meera Desai
A. Feder Cooper
Angelina Wang
Chad Atalla
Solon Barocas
Su Lin Blodgett
Alexandra Chouldechova
Emily Corvi
P. A. Dow
Jean Garcia-Gathright
Nicholas Pangakis
Stefanie Reed
Emily Sheng
Dan Vann
Jennifer Wortman Vaughan
Matthew Vogel
Hannah Washington
Abigail Z. Jacobs
The measurement tasks involved in evaluating generative AI (GenAI) systems are especially difficult, leading to what has been described as "a tangle of sloppy tests [and] apples-to-oranges comparisons" (Roose, 2024). In this position paper, we argue that the ML community would benefit from learning from and drawing on the social sciences when developing and using measurement instruments for evaluating GenAI systems. Specifically, our position is that evaluating GenAI systems is a social science measurement challenge. We present a four-level framework, grounded in measurement theory from the social sciences, for measuring concepts related to the capabilities, behaviors, and impacts of GenAI. This framework has two important implications for designing and evaluating evaluations: First, it can broaden the expertise involved in evaluating GenAI systems by enabling stakeholders with different perspectives to participate in conceptual debates. Second, it brings rigor to both conceptual and operational debates by offering a set of lenses for interrogating the validity of measurement instruments and their resulting measurements.
A Scalable Architecture for Future Regenerative Satellite Payloads
Olfa Ben Yahia
Zineb Garroussi
Brunilde Sansò
Jean-François Frigon
Stéphane Martel
Gunes Karabulut Kurt
This paper addresses the limitations of current satellite payload architectures, which are predominantly hardware-driven and lack the flexibility to adapt to increasing data demands and uneven traffic. To overcome these challenges, we present a novel architecture for future regenerative and programmable satellite payloads and utilize interconnected modem banks to promote higher scalability and flexibility. We formulate an optimization problem to efficiently manage traffic among these modem banks and balance the load. Additionally, we provide comparative numerical simulation results, considering end-to-end delay and packet loss analysis. The results illustrate that our proposed architecture maintains lower delays and packet loss even with higher traffic demands and smaller buffer sizes.
The Harmonic Exponential Filter for Nonparametric Estimation on Motion Groups
Miguel Saavedra-Ruiz
Steven A. Parkison
Ria Arora
James Richard Forbes
Bayesian estimation is a vital tool in robotics as it allows systems to update the robot state belief using incomplete information from noisy sensors. To render the state estimation problem tractable, many systems assume that the motion and measurement noise, as well as the state distribution, are unimodal and Gaussian. However, there are numerous scenarios and systems that do not comply with these assumptions. Existing nonparametric filters that are used to model multimodal distributions have drawbacks that limit their ability to represent a diverse set of distributions. This paper introduces a novel approach to nonparametric Bayesian filtering on motion groups, designed to handle multimodal distributions using harmonic exponential distributions. This approach leverages two key insights of harmonic exponential distributions: a) the product of two distributions can be expressed as the element-wise addition of their log-likelihood Fourier coefficients, and b) the convolution of two distributions can be efficiently computed as the tensor product of their Fourier coefficients. These observations enable the development of an efficient and asymptotically exact solution to the Bayes filter up to the band limit of a Fourier transform. We demonstrate our filter's performance compared with established nonparametric filtering methods across simulated and real-world localization tasks.
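The two insights named in this abstract can be illustrated on the simplest possible case, a heading angle on the circle, where the Fourier coefficients are scalars and the convolution theorem reduces to an element-wise product. The sketch below is only an illustration under that simplifying assumption and is not the authors' implementation for motion groups; the grid size, the helper names (`normalize`, `bayes_update`, `predict`, `bump`) and the example densities are all hypothetical.

```python
import numpy as np

def normalize(p, dtheta):
    """Scale samples so that the Riemann sum over the circle equals 1."""
    return p / (p.sum() * dtheta)

def bayes_update(prior, likelihood, dtheta):
    """Measurement update: add the Fourier coefficients of the log-densities,
    transform back, and exponentiate (equivalent to a pointwise product)."""
    eta = np.fft.fft(np.log(prior)) + np.fft.fft(np.log(likelihood))
    return normalize(np.exp(np.fft.ifft(eta).real), dtheta)

def predict(belief, motion_noise, dtheta):
    """Motion step: circular convolution computed as an element-wise
    product of the Fourier coefficients of the two densities."""
    conv = np.fft.ifft(np.fft.fft(belief) * np.fft.fft(motion_noise)).real * dtheta
    return normalize(conv, dtheta)

# Assumed discretization of the circle (hypothetical band limit).
n = 256
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dtheta = 2.0 * np.pi / n

# Unnormalized von Mises-style bump centered at mu with concentration kappa.
bump = lambda mu, kappa: np.exp(kappa * np.cos(theta - mu))

prior = normalize(bump(1.0, 4.0) + bump(4.0, 4.0), dtheta)  # bimodal belief
likelihood = normalize(bump(1.2, 8.0), dtheta)              # measurement model
motion_noise = normalize(bump(0.5, 20.0), dtheta)           # rotate by about 0.5 rad

posterior = bayes_update(prior, likelihood, dtheta)
predicted = predict(posterior, motion_noise, dtheta)
```

Working with log-density coefficients turns the measurement update into an addition, while the motion step stays a product in the Fourier domain; both cost O(n log n) on this grid.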
Visual-Tactile Inference of 2.5D Object Shape From Marker Texture
Affan Jilani
Francois Hogan
Charlotte Morissette
M. Jenkin
Visual-tactile sensing affords abundant capabilities for contact-rich object manipulation tasks including grasping and placing. Here we introduce a shape-from-texture inspired contact shape estimation approach for visual-tactile sensors equipped with visually distinct membrane markers. Under a perspective projection camera model, measurements related to the change in marker separation upon contact are used to recover surface shape. Our approach allows for shape sensing in real time, without requiring network training or complex assumptions related to lighting, sensor geometry or marker placement. Experiments show that the surface contact shape recovered is qualitatively and quantitatively consistent with those obtained through the use of photometric stereo, the current state of the art for shape recovery in visual-tactile sensors. Importantly, our approach is applicable to a large family of sensors not equipped with photometric stereo hardware, and also to those with semi-transparent membranes. The recovery of surface shape affords new capabilities to these sensors for robotic applications, such as the estimation of contact and slippage in object manipulation tasks (Hogan et al., 2022) and the use of force matching for kinesthetic teaching using multimodal visual-tactile sensing (Ablett et al., 2024).
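To give a feel for why marker separation carries depth information under a perspective camera, here is a minimal sketch of the textbook pinhole relation. It is not the authors' estimator. It assumes two markers a fixed physical distance apart on a locally fronto-parallel patch of membrane, so that their image separation is inversely proportional to depth; the function name and the numbers are hypothetical.

```python
def depth_from_separation(rest_depth, rest_sep_px, contact_sep_px):
    """Depth of a marker pair after contact, from the pinhole relation
    separation_px = focal_length * physical_distance / depth.

    Assumes the physical distance between the two markers is unchanged
    (no membrane stretch) and the patch stays parallel to the image plane.
    """
    if rest_depth <= 0 or rest_sep_px <= 0 or contact_sep_px <= 0:
        raise ValueError("depth and separations must be positive")
    return rest_depth * rest_sep_px / contact_sep_px

# Hypothetical numbers: membrane 20 mm from the camera at rest, markers
# 40 px apart at rest and 50 px apart once an object presses them closer.
rest_depth_mm = 20.0
contact_depth_mm = depth_from_separation(rest_depth_mm, 40.0, 50.0)
indentation_mm = rest_depth_mm - contact_depth_mm
```

Repeating this over many neighboring marker pairs would give a coarse per-patch depth map; a real sensor would also have to account for membrane stretch and surface tilt, which this sketch ignores.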
Diminished social memory and hippocampal correlates of social interactions in chronic social defeat stress susceptibility
Amanda Larosa
Tian Rui Zhang
Alice S. Wong
Cyrus Y.H. Fung
Xiong Ling Yun (Jenny) Long
Prabhjeet Singh
Tak Pan Wong
Towards Multi-Brain Decoding in Autism: A Self-Supervised Learning Approach
Ghazaleh Ranjabaran
Quentin Moreau
Adrien Dubois
This study introduces a self-supervised learning (SSL) approach to hyperscanning electroencephalography (EEG) data, targeting the identification of autism spectrum condition (ASC) during social interactions. Hyperscanning enables simultaneous recording of neural activity across interacting individuals, offering a novel path for studying brain-to-brain synchrony in ASC. Leveraging a large-scale, single-brain EEG dataset for SSL pretraining, we developed a multi-brain classification model fine-tuned with hyperscanning data from dyadic interactions involving ASC and neurotypical participants. The SSL model demonstrated superior performance (78.13% accuracy) compared to supervised baselines and logistic regression using spectral EEG biomarkers. These results underscore the efficacy of SSL in addressing the challenges of limited labeled data, enhancing EEG-based diagnostic tools for ASC, and advancing research in social neuroscience.
La communication financière à l’épreuve de la crise COVID : une gestion des impressions ? [Financial communication put to the test by the COVID crisis: impression management?]
Corinne Bessieux-Ollier
Grégoire Davrinche
We study the impact of the COVID-19 crisis on the impression management practiced by listed French companies. Because this crisis strongly affected business activity, we examine whether managers change the way they present information related to non-GAAP earnings through the use of obfuscation strategies. Impression management data were collected by hand from the annual earnings press releases of SBF 120 companies over the 2018-2020 period. We find a general decrease in the level of impression management during the crisis, particularly for companies in the sectors most affected by the COVID crisis. This decrease is, however, less pronounced for companies that underperformed relative to their sector and for companies whose performance declined the most (regardless of the sector to which they belong). Our results suggest that companies whose decline in performance could be attributed to internal causes (very unfavorable results, results below those of their sector) remain concerned about the image they project and maintain their level of impression management despite the crisis.
TEARS: Text Representations for Scrutable Recommendations
Emiliano Penaloza
Olivier Gouvert
Haolun Wu
Traditional recommender systems rely on high-dimensional (latent) embeddings for modeling user-item interactions, often resulting in opaque representations that lack interpretability. Moreover, these systems offer limited control to users over their recommendations. Inspired by recent work, we introduce TExtuAl Representations for Scrutable recommendations (TEARS) to address these challenges. Instead of representing a user’s interests through latent embeddings, TEARS encodes them in natural text, providing transparency and allowing users to edit them. To encode such preferences, we use modern LLMs to generate high-quality user summaries which we find uniquely capture user preferences. Using these summaries we take a hybrid approach where we use an optimal transport procedure to align the summaries’ representations with the representation of a standard VAE for collaborative filtering. We find this approach can surpass the performance of the three popular VAE models while providing user-controllable recommendations. We further analyze the controllability of TEARS through three simulated user tasks to evaluate the effectiveness of user edits on their summaries. Our code and all user-summaries can be seen in an anonymized repository.
A Data-driven Discovery of the Causal Connection between Galaxy and Black Hole Evolution
Zehao Jin
Mario Pasquato
Benjamin L. Davis
Tristan Deleu
Yu Luo
Changhyun Cho
Pablo Lemos
Xi 熙 Kang 康
Andrea Maccio
Associations between circulating amino acids and metabolic dysfunction‐associated steatotic liver disease in individuals living with severe obesity
Ina Maltais‐Payette
Jérôme Bourgault
Marie‐Frédérique Gauthier
Laurent Biertho
Simon Marceau
François Julien
Patricia L. Mitchell
Christian Couture
Francis Brière
Benoit J. Arsenault
André Tchernof
ECLARE: multi-teacher contrastive learning via ensemble distillation for diagonal integration of single-cell multi-omic data
Dylan Mann-Krzisnik
Integrating multimodal single-cell data, such as scRNA-seq and scATAC-seq, is key for decoding gene regulatory networks but remains challenging due to issues like feature harmonization and limited quantity of paired data. To address these challenges, we introduce ECLARE, a novel framework combining multi-teacher ensemble knowledge distillation with contrastive learning for diagonal integration of single-cell multi-omic data. ECLARE trains teacher models on paired datasets to guide a student model for unpaired data, leveraging a refined contrastive objective and transport-based loss for precise cross-modality alignment. Experiments demonstrate ECLARE’s competitive performance in cell pairing accuracy, multimodal integration and biological structure preservation, indicating that multi-teacher knowledge distillation provides an effective means to improve a diagonal integration model beyond its zero-shot capabilities. Additionally, we validate ECLARE’s applicability through a case study on major depressive disorder (MDD) data, illustrating its capability to reveal gene regulatory insights from unpaired nuclei. While current results highlight the potential of ensemble distillation in multi-omic analyses, future work will focus on optimizing model complexity, dataset scalability, and exploring applications in diverse multi-omic contexts. ECLARE establishes a robust foundation for biologically informed single-cell data integration, facilitating advanced downstream analyses and scaling multi-omic data for training advanced machine learning models.