Publications

VulEXplaineR: XAI for Vulnerability Detection on Assembly Code
Samaneh Mahdavifar
Mohd Saqib
Benjamin C. M. Fung
Philippe Charland
Andrew Walenstein
What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models
Jeongrok Yu
Seong Ug Kim
Jacob Choi
Jinho D. Choi
Bias is a disproportionate prejudice in favor of one side against another. Due to the success of transformer-based Masked Language Models (MLMs) and their impact on many NLP tasks, a systematic evaluation of bias in these models is needed more than ever. While many studies have evaluated gender bias in English MLMs, only a few works have been conducted for the task in other languages. This paper proposes a multilingual approach to estimate gender bias in MLMs from 5 languages: Chinese, English, German, Portuguese, and Spanish. Unlike previous work, our approach does not depend on parallel corpora coupled with English to detect gender bias in other languages using multilingual lexicons. Moreover, a novel model-based method is presented to generate sentence pairs for a more robust analysis of gender bias, compared to the traditional lexicon-based method. For each language, both the lexicon-based and model-based methods are applied to create two datasets respectively, which are used to evaluate gender bias in an MLM specifically trained for that language using one existing and 3 new scoring metrics. Our results show that the previous approach is data-sensitive and not stable as it does not remove contextual dependencies irrelevant to gender. In fact, the results often flip when different scoring metrics are used on the same dataset, suggesting that gender bias should be studied on a large dataset using multiple evaluation metrics for best practice.
When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning
Claas Voelcker
Igor Gilitschenski
We investigate the impact of auxiliary learning tasks such as observation reconstruction and latent self-prediction on the representation learning problem in reinforcement learning. We also study how they interact with distractions and observation functions in the MDP. We provide a theoretical analysis of the learning dynamics of observation reconstruction, latent self-prediction, and TD learning in the presence of distractions and observation functions under linear model assumptions. With this formalization, we are able to explain why latent self-prediction is a helpful auxiliary task, while observation reconstruction can provide more useful features when used in isolation. Our empirical analysis shows that the insights obtained from our learning dynamics framework predict the behavior of these loss functions beyond the linear model assumption in non-linear neural networks. This reinforces the usefulness of the linear model framework not only for theoretical analysis, but also for practical benefit in applied problems.
Winning the 2023 CityLearn Challenge: A Community-Based Hierarchical Energy Systems Coordination Algorithm
Andoni I. Garmendia
Francesco Morri
Hélène Le Cadre
The effective management and control of building energy systems are crucial for reducing energy consumption peak loads and CO2 emissions and for ensuring the stability of the power grid, while maintaining optimal comfort levels within buildings. The difficulty of accommodating this trade-off is amplified by dynamic environmental conditions and the need for scalable solutions that can adapt across various building types and geographic locations. Acknowledging the importance of this problem, the NeurIPS conference has hosted the CityLearn control challenge since 2020 to foster the design of innovative solutions in building energy management. Participants were tasked with developing strategies that not only enhance energy efficiency but also prioritize sustainability and occupant comfort. This paper introduces the Community-based Hierarchical Energy Systems Coordination Algorithm (CHESCA), the winning approach of the 2023 edition. We rely on a hierarchical approach adaptable to an arbitrary number of buildings, first optimizing building-level metrics individually, and later refining these through a central community-level controller to improve grid-related metrics. Compared to the other high-ranked competitors, our approach demonstrated fast inference comparable to learning-based methods, while offering better interpretability and superior generalization capabilities with minimal data requirements. This paper details our approach, supported by comprehensive experimental results and ablation studies.
Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Christopher Pal
Marc Aubreville
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Joao Monteiro
Étienne Marcotte
Pierre-Andre Noel
Valentina Zantedeschi
Christopher Pal
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right context isn't known in advance, caching ICL can be challenging. This work addresses these limitations by introducing models that, inspired by the encoder-decoder architecture, use cross-attention to condition generation on reference text without the prompt. More precisely, we leverage pre-trained decoder-only models and only train a small number of added layers. We use Question-Answering (QA) as a testbed to evaluate the ability of our models to perform conditional generation and observe that they outperform ICL, are comparable to fine-tuned prompted LLMs, and drastically reduce the space footprint relative to standard KV caching by two orders of magnitude.
Penalties and Rewards for Fair Learning in Paired Kidney Exchange Programs
Alison Caulfield
Yi Lin
Adrian Vetta
A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechanism. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full-scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms that attempt to learn optimal patient-donor weights in advance via dynamic simulations do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
Thomas Krendl Gilbert
Jérémy Scheurer
Javier Rando
Rachel Freedman
Tomasz Korbak
David Lindner
Pedro Freire
Tony Tong Wang
Samuel Marks
Charbel-Raphael Segerie
Micah Carroll
Phillip Christoffersen
Mehul Damani
Stewart Slocum
Usman Anwar
Anand Siththaranjan
Max Nadeau
Eric J Michaud
Jacob Pfau
Dmitrii Krasheninnikov
Xin Chen
Lauro Langosco
Peter Hase
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
Use of Artificial Intelligence in the Identification and Management of Frailty: A Scoping Review Protocol
Sathya Karunananthan
Arya Rahgozar
Ramtin Hakimjavadi
Hui Yan
Kunal A Dalsania
Howard Bergman
Bishwajit Ghose
Jim LaPlante
Tess McCutcheon
Daniel I McIsaac
S. A. Rahimi
Nadia Sourial
Manpreet Thandi
Sabrina T Wong
Clare Liddy
Behavioural pseudometrics for continuous-time diffusions
Cortical neuroprosthesis-mediated functional ipsilateral control of locomotion in rats with spinal cord hemisection
Elena Massai
Isley De Jesus
Roxanne Drainville
Marina Martinez
Control of voluntary limb movement is predominantly attributed to the contralateral motor cortex. However, increasing evidence suggests the involvement of ipsilateral cortical networks in this process, especially in motor tasks requiring bilateral coordination, such as locomotion. In this study, we combined a unilateral thoracic spinal cord injury (SCI) with a cortical neuroprosthetic approach to investigate the functional role of the ipsilateral motor cortex in rat movement through spared contralesional pathways. Our findings reveal that in all SCI rats, stimulation of the ipsilesional motor cortex promoted a bilateral synergy. This synergy involved the elevation of the contralateral foot along with ipsilateral hindlimb extension. Additionally, in two out of seven animals, stimulation of a sub-region of the hindlimb motor cortex modulated ipsilateral hindlimb flexion. Importantly, ipsilateral cortical stimulation delivered after SCI immediately alleviated multiple locomotor and postural deficits, and this effect persisted after ablation of the homologous motor cortex. These results provide strong evidence of a causal link between cortical activation and precise ipsilateral control of hindlimb movement. This study has significant implications for the development of future neuroprosthetic technology and our understanding of motor control in the context of spinal cord injury.
Device-Free Human State Estimation using UWB Multi-Static Radios
Saria Al Lahham
Bobak H. Baghi
Pierre-Yves Lajoie
Amal Feriani
Sachini Herath
Steve Liu
We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor environment without the requirement that they carry a specific device with them. To achieve this "device-free" localization we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high quality estimation from the UWB signals merely reflected off people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single antenna UWB modules for sensing, paired with Raspberry Pi units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state, comprising location and activity, in a given area. Additionally, we can also estimate the number of humans that occupy this region of interest. In our approach, we first pre-process the CIR data, which involves meticulous aggregation of measurements and extraction of key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.