Publications

Understanding Intrinsic Socioeconomic Biases in Large Language Models

Large Language Models (LLMs) are increasingly integrated into critical decision-making processes, such as loan approvals and visa applications, where inherent biases can lead to discriminatory outcomes. In this paper, we examine the nuanced relationship between demographic attributes and socioeconomic biases in LLMs, a crucial yet understudied area of fairness in LLMs. We introduce a novel dataset of one million English sentences to systematically quantify socioeconomic biases across various demographic groups. Our findings reveal pervasive socioeconomic biases in both established models such as GPT-2 and state-of-the-art models like Llama 2 and Falcon. We demonstrate that these biases are significantly amplified when considering intersectionality, with LLMs exhibiting a remarkable capacity to extract multiple demographic attributes from names and then correlate them with specific socioeconomic biases. This research highlights the urgent necessity for proactive and robust bias mitigation techniques to safeguard against discriminatory outcomes when deploying these powerful models in critical real-world applications.

2024-01-01

AIES (1) (published)

doi.org

arxiv.org

UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs

Emanuele Rossi

2024-01-01

LoG (published)

doi.org

openreview.net

Validation of Vigilance Decline Capability in A Simulated Test Environment: A Preliminary Step Towards Neuroadaptive Control

Andra Mahu

Amandeep Singh

Florian Tambon

Benoit Ouellette

Jean-François Delisle

Tanya Paul

Foutse Khomh

Alexandre Marois

Philippe Doyon-Poulin

Vigilance is the ability to sustain attention. It is crucial in tasks like piloting and driving that involve the ability to sustain attention. However, cognitive performance often falters with prolonged tasks, leading to reduced efficiency, slower reactions, and increased error likelihood. Identifying and addressing diminished vigilance is essential for enhancing driving safety. Neuro-physiological indicators have shown promising results to monitor vigilance, paving the way for neuroadaptive control of vigilance. In fact, the collection of vigilance-related physiological markers could allow, using neuroadaptive intelligent systems, a real-time adaptation of tasks or the presentation of countermeasures to prevent errors that would ensue from such hypovigilant situations. Before reaching this goal, one must however collect valid data truly representative of hypovigilance which, in turn, can be used to develop prediction models of the vigilant state. This study serves as a proof of concept to assess the validity of a testbed to induce and measure vigilance decline through a simulated test environment, validating controlled induction, and evaluating its impact on participants’ performance and subjective experiences. In total, 28 participants (10 females, 18 males) aged 18 to 35 (M = 23.75 years) were recruited. All participants held valid driving licenses and had corrected-to-normal vision. Data collection involved the Psychomotor Vigilance Task (PVT), the Karolinska Sleepiness Scale (KSS) and the Stanford Sleepiness Scale (SSS), along with specialized neuro-physiological equipment: Enobio 8 EEG, Empatica E4, Polar H10 and Tobii Nano Pro eye tracker. Notably, this study is limited to demonstrating the results of PVT, KSS, and SSS, with the aim of assessing the effectiveness of the test setup. Participants self-reported their loss of vigilance by pressing a marker on the steering wheel.
To induce hypovigilance, participants drove an automatic car in a low-traffic, monotonous environment for 60 minutes, featuring empty fields of grass and desert, employing specific in-game procedures. The driving task included instructions for lane-keeping, indicator usage, and maintaining speeds of up to 80 km/h, with no traffic lights or stop signs present. Experiments were conducted before lunch, between 9 am and 12 pm, ensuring maximum participant alertness, with instructions to abstain from caffeine, alcohol, nicotine, and cannabis on the experiment day. Results showed that the mean reaction time (RT) increased from 257.7 ms before driving to 276.8 ms after driving, t = 4.82, p < .0001, d = -0.61, whereas the median RT changed from 246.07 ms to 260.89 ms, t = 3.58, p = 0.0013, d = -0.53, indicating a statistically significant alteration in participants' psychomotor performance. The mean number of minor lapses in attention (RT > 500 ms) on the PVT increased from 1.11 before driving to 1.67 after driving, but the change was not statistically significant, t = 1.66, p = 0.11, d = -0.28. KSS showed a considerable rise in sleepiness, with a mean of 4.11 (rather alert) before driving increasing to 5.96 (some signs of sleepiness) after driving, t = 5.65, p < .0001, d = -1.04. Similarly, the SSS demonstrated an increase in mean values from 2.57 (able to concentrate) before driving to 3.96 (somewhat foggy) after driving, t = 8.42, p < .0001, d = -1.20, signifying an increased perception of sleepiness following the driving activity. Lastly, the mean time of the first marker press was 17:38 minutes (SD = 9:47 minutes), indicating that the self-reported loss of vigilance occurred during the first 30 minutes of the driving task. The observed increase in PVT reaction time aligns with the declined alertness reported on both the KSS and SSS responses, suggesting a consistent decline in vigilance and alertness post-driving.
In conclusion, the study underscores the effectiveness and validity of the simulated test environment in inducing vigilance decline, providing valuable insights into the impact on both objective and subjective measures. At the same time, the research sets the stage for exploring neuroadaptive control strategies, aiming to enhance task performance and safety. Ultimately, this will contribute to the development of a non-invasive artificial intelligence system capable of detecting vigilance states in extreme/challenging environments, e.g. for pilots and drivers.

2024-01-01

Neuroergonomics and Cognitive Engineering (published)

doi.org

Visual theatrical improvisation alongside Artificial Intelligence image generators.

Piotr Mirowski

Boyd Branch

Kory Wallace Mathewson

2024-01-01

ICCC (published)

dblp.uni-trier.de

Visual-Tactile Inference of 2.5D Object Shape From Marker Texture

Affan Jilani

Francois Hogan

Charlotte Morissette

Gregory Dudek

M. Jenkin

Kaleem Siddiqi

2024-01-01

IEEE Robotics and Automation Letters (published)

doi.org

Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects

Orevaoghene Ahia

Aremu Anuoluwapo

Diana Abagyan

Hila Gonen

David Ifeoluwa Adelani

Daud Abolade

Noah A. Smith

Yulia Tsvetkov

2024-01-01

EMNLP (published)

doi.org

arxiv.org

VulEXplaineR: XAI for Vulnerability Detection on Assembly Code

Samaneh Mahdavifar

Mohd Saqib

Benjamin Fung

Philippe Charland

Andrew Walenstein

2024-01-01

ECML/PKDD (published)

doi.org

What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models

Jeongrok Yu

Seong Ug Kim

Jacob Choi

Jinho D. Choi

Bias is a disproportionate prejudice in favor of one side against another. Due to the success of transformer-based Masked Language Models (MLMs) and their impact on many NLP tasks, a systematic evaluation of bias in these models is needed more than ever. While many studies have evaluated gender bias in English MLMs, only a few works have been conducted for the task in other languages. This paper proposes a multilingual approach to estimate gender bias in MLMs from 5 languages: Chinese, English, German, Portuguese, and Spanish. Unlike previous work, our approach does not depend on parallel corpora coupled with English to detect gender bias in other languages using multilingual lexicons. Moreover, a novel model-based method is presented to generate sentence pairs for a more robust analysis of gender bias, compared to the traditional lexicon-based method. For each language, both the lexicon-based and model-based methods are applied to create two datasets respectively, which are used to evaluate gender bias in an MLM specifically trained for that language using one existing and 3 new scoring metrics. Our results show that the previous approach is data-sensitive and not stable as it does not remove contextual dependencies irrelevant to gender. In fact, the results often flip when different scoring metrics are used on the same dataset, suggesting that gender bias should be studied on a large dataset using multiple evaluation metrics for best practice.

2024-01-01

Inf. (published)

doi.org

arxiv.org

Winning the 2023 CityLearn Challenge: A Community-Based Hierarchical Energy Systems Coordination Algorithm

Andoni I. Garmendia

Francesco Morri

Quentin Cappart

Hélène Le Cadre

The effective management and control of building energy systems are crucial for reducing energy consumption peak loads and CO2 emissions and for ensuring the stability of the power grid, while maintaining optimal comfort levels within buildings. The difficulty of accommodating this trade-off is amplified by dynamic environmental conditions and the need for scalable solutions that can adapt across various building types and geographic locations. Acknowledging the importance of this problem, the NeurIPS conference has hosted the CityLearn control challenge since 2020 to foster the design of innovative solutions in building energy management. Participants were tasked with developing strategies that not only enhance energy efficiency but also prioritize sustainability and occupant comfort. This paper introduces the Community-based Hierarchical Energy Systems Coordination Algorithm (CHESCA), the winning approach of the 2023 edition. We rely on a hierarchical approach adaptable to an arbitrary number of buildings, first optimizing building-level metrics individually, and later refining these through a central community-level controller to improve grid-related metrics. Compared to the other high-ranked competitors, our approach demonstrated fast inference capabilities like learning-based methods, while offering better interpretability and superior generalization capabilities with minimal data requirements. This paper details our approach, supported by comprehensive experimental results and ablation studies.

2024-01-01

ECAI (published)

doi.org

Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models

Pablo Pernias

Dominic Rampas

Mats Leon Richter

Chris Pal

Marc Aubreville

2024-01-01

International Conference on Learning Representations (published)

dblp.uni-trier.de