Publications

Two Families of Indexable Partially Observable Restless Bandits and Whittle Index Computation
Nima Akbarzadeh
Uncertainty-aware hybrid paradigm of nonlinear MPC and model-based RL for offroad navigation: Exploration of transformers in the predictive model
Faraz Lotfi
Khalil Virji
Farnoosh Faraji
Lucas Berry
Andrew Holliday
In this paper, we investigate a hybrid scheme that combines nonlinear model predictive control (MPC) and model-based reinforcement learning … (see more)(RL) for navigation planning of an autonomous model car across offroad, unstructured terrains without relying on predefined maps. Our innovative approach takes inspiration from BADGR, an LSTM-based network that primarily concentrates on environment modeling, but distinguishes itself by substituting LSTM modules with transformers to greatly elevate the performance our model. Addressing uncertainty within the system, we train an ensemble of predictive models and estimate the mutual information between model weights and outputs, facilitating dynamic horizon planning through the introduction of variable speeds. Further enhancing our methodology, we incorporate a nonlinear MPC controller that accounts for the intricacies of the vehicle's model and states. The model-based RL facet produces steering angles and quantifies inherent uncertainty. At the same time, the nonlinear MPC suggests optimal throttle settings, striking a balance between goal attainment speed and managing model uncertainty influenced by velocity. In the conducted studies, our approach excels over the existing baseline by consistently achieving higher metric values in predicting future events and seamlessly integrating the vehicle's kinematic model for enhanced decision-making. The code and the evaluation data are available at https://github.com/FARAZLOTFI/offroad_autonomous_navigation/).
Validation of Vigilance Decline Capability in A Simulated Test Environment: A Preliminary Step Towards Neuroadaptive Control
Andra Mahu
Amandeep Singh
Florian Tambon
Benoit Ouellette
Jean-françois Delisle
Tanya Paul
Alexandre Marois
Philippe Doyon-poulin
Vigilance is the ability to sustain attention. It is crucial in tasks like piloting and driving that involve the ability to sustain attentio… (see more)n. However, cognitive performance often falters with prolonged tasks, leading to reduced efficiency, slower reactions, and increased error likelihood. Identifying and addressing diminished vigilance is essential for enhancing driving safety. Neuro-physiological indicators have shown promising results to monitor vigilance, paving the way for neuroadaptive control of vigilance. In fact, the collection of vigilance-related physiological markers could allow, using neuroadaptive intelligent systems, a real-time adaption of tasks or the presentation of countermeasures to prevent errors that would ensue from such hypovigilant situations. Before reaching this goal, one must however collect valid data truly representative of hypovigilance which, in turn, can be used to develop prediction models of the vigilant state. This study serves as a proof of concept to assess validity of a testbed to induce and measure vigilance decline through a simulated test environment, validating controlled induction, and evaluating its impact on participants’ performance and subjective experiences. In total, 28 participants (10 females, 18 males) aged 18 to 35 (M = 23.75 years), were recruited. All participants held valid driving licenses and had corrected-to-normal vision. Data collection involved Psychomotor Vigilance Task (PVT), Karolinska Sleepiness Scale (KSS) and the Stanford Sleepiness Scale (SSS) along with neuro-physiological specialized equipment: Enobio 8 EEG, Empatica E4, Polar H10 and Tobii Nano Pro eye tracker. Notably, this study is limited to demonstrating the results of PVT, KSS, and SSS, with the aim of assessing the effectiveness of the test setup. Participants self-reported their loss of vigilance by pressing a marker on the steering wheel. To induce hypovigilance, participants drove an automatic car in a low-traffic, monotonous environment for 60 minutes, featuring empty fields of grass and desert, employing specific in-game procedures. The driving task included instructions for lane-keeping, indicator usage, and maintaining speeds of up to 80 km/h, with no traffic lights or stop signs present. Experiments were conducted before lunch, between 9 am and 12 pm, ensuring maximum participant alertness, with instructions to abstain from caffeine, alcohol, nicotine, and cannabis on the experiment day. Results showed that the mean reaction time (RT) increased from 257.7 ms before driving to 276.8 ms after driving, t = 4.82, p .0001, d = -0.61 whereas the median RT changed from 246.07 ms to 260.89 ms, t = 3.58, p = 0.0013, d= -0.53 indicating a statistically significant alteration in participant's psychomotor performance. The mean number of minor lapses in attention (RT >500ms) to the PVT increased from 1.11 before driving to 1.67 after driving, but was not statistically significant t = 1.66, p = 0.11, d = -0.28. KSS showed a considerable rise of sleepiness, with a mean of 4.11 (rather alert) before driving increasing to 5.96 (some signs of sleepiness) after driving, t = 5.65, p .0001, d = -1.04. Similarly, the SSS demonstrated an increase in mean values from 2.57 (able to concentrate) before driving to 3.96 (somewhat foggy) after driving, t = 8.42, p .0001, d = -1.20, signifying an increased perception of sleepiness following the driving activity. Lastly, the mean time of the first marker press was 17:38 minutes (SD = 9:47 minutes) indicating that the self-reported loss of vigilance occurred during the first 30 minutes of the driving task. The observed increase in PVT reaction time aligns with the declined alertness reported on both the KSS and SSS responses, suggesting a consistent decline in vigilance and alertness post-driving. In conclusion, the study underscores the effectiveness and validity of the simulated test environment in inducing vigilance decline, providing valuable insights into the impact on both objective and subjective measures. At the same time, the research sets the stage for exploring neuroadaptive control strategies, aiming to enhance task performance and safety. Ultimately, this will contribute to the development of a non-invasive artificial intelligence system capable of detecting vigilance states in extreme/challenging environments, e.g. for pilots and drivers.
VulEXplaineR: XAI for Vulnerability Detection on Assembly Code
Samaneh Mahdavifar
Mohd Saqib
Philippe Charland
Andrew Walenstein
What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models
Emily M. Bender
Jeongrok Yu
Timnit Gebru
Seong Ug Kim
Angelina McMillan-642
Jacob Choi
Jinho D. Choi
Su Lin Blodgett
Solon Barocas
Hal Daumé III
Gilsinia Lopez
Robert Sim
Hanna Wallach. 2021
Stereotyp-657
Bias is a disproportionate prejudice in favor of one side against another. Due to the success of transformer-based Masked Language Models (M… (see more)LMs) and their impact on many NLP tasks, a systematic evaluation of bias in these models is needed more than ever. While many studies have evaluated gender bias in English MLMs, only a few works have been conducted for the task in other languages. This paper proposes a multilingual approach to estimate gender bias in MLMs from 5 languages: Chinese, English, German, Portuguese, and Spanish. Unlike previous work, our approach does not depend on parallel corpora coupled with English to detect gender bias in other languages using multilingual lexicons. Moreover, a novel model-based method is presented to generate sentence pairs for a more robust analysis of gender bias, compared to the traditional lexicon-based method. For each language, both the lexicon-based and model-based methods are applied to create two datasets respectively, which are used to evaluate gender bias in an MLM specifically trained for that language using one existing and 3 new scoring metrics. Our results show that the previous approach is data-sensitive and not stable as it does not remove contextual dependencies irrelevant to gender. In fact, the results often flip when different scoring metrics are used on the same dataset, suggesting that gender bias should be studied on a large dataset using multiple evaluation metrics for best practice.
Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Mats Leon Richter
Marc Aubreville
Penalties and Rewards for Fair Learning in Paired Kidney Exchange Programs
Alison Caulfield
Yi Lin
Adrian Vetta
A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechani… (see more)sm. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms, that attempt to learn optimal patient-donor weights in advance via dynamic simulations, do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
Thomas Krendl Gilbert
Jérémy Scheurer
Javier Rando
Rachel Freedman
Tomasz Korbak
David Lindner
Pedro Freire
Tony Tong Wang
Samuel Marks
Charbel-Raphael Segerie
Micah Carroll
Andi Peng
Phillip Christoffersen
Mehul Damani
Stewart Slocum
Usman Anwar
Anand Siththaranjan … (see 12 more)
Max Nadeau
Eric J Michaud
Jacob Pfau
Dmitrii Krasheninnikov
Xin Chen
Lauro Langosco
Peter Hase
Erdem Biyik
Anca Dragan
Dorsa Sadigh
Dylan Hadfield-Menell
Use of Artificial Intelligence in the Identification and Management of Frailty: A Scoping Review Protocol
Sathya Karunananthan
Arya Rahgozar
Ramtin Hakimjavadi
Hui Yan
Kunal A Dalsania
Howard Bergman
Bishwajit Ghose
Jim LaPlante
Tess McCutcheon
Daniel I McIsaac
Nadia Sourial
Manpreet Thandi
Sabrina T Wong
Clare Liddy
Behavioural pseudometrics for continuous-time diffusions
Linan Chen
Florence Clerc
Device-Free Human State Estimation using UWB Multi-Static Radios
Saria Al Lahham
Bobak H. Baghi
Pierre-Yves Lajoie
Amal Feriani
Sachini Herath
Steve Liu
We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor envir… (see more)onment without the requirement that they carry a specific devices with them. To achieve this"device free"localization we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high quality estimation from the UWB signals merely reflected of people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single antenna UWB modules for sensing, paired with Raspberry PI units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state - comprised of location and activity - in a given area. Additionally, we can also estimate the number of humans that occupy this region of interest. In our approach, first, we pre-process the CIR data which involves meticulous aggregation of measurements and extraction of key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.
Fairness-Aware Structured Pruning in Transformers
Abdelrahman Zayed
Goncalo Mordido
Samira Shabanian
Ioana Baldini