Publications

Penalties and Rewards for Fair Learning in Paired Kidney Exchange Programs

Alison Caulfield

Yi Lin

Adrian Vetta

A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechanism. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full-scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms that attempt to learn optimal patient-donor weights in advance via dynamic simulations do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.

2023-12-31

Web and Internet Economics (published)

doi.org

arxiv.org

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper

Xander Davies

Claudia Shi

Thomas Krendl Gilbert

Jérémy Scheurer

Javier Rando

Rachel Freedman

Tomasz Korbak

David Lindner

Pedro Freire

Tony Tong Wang

Samuel Marks

Charbel-Raphael Segerie

Micah Carroll

Andi Peng

Phillip Christoffersen

Mehul Damani

Stewart Slocum

Usman Anwar

Anand Siththaranjan

Max Nadeau

Eric J Michaud

Jacob Pfau

Dmitrii Krasheninnikov

Xin Chen

Lauro Langosco

Peter Hase

Erdem Biyik

Anca Dragan

David Scott Krueger

Dorsa Sadigh

Dylan Hadfield-Menell

2023-12-30

TMLR (accepted)

doi.org

openreview.net

Use of Artificial Intelligence in the Identification and Management of Frailty: A Scoping Review Protocol

Sathya Karunananthan

Arya Rahgozar

Ramtin Hakimjavadi

Hui Yan

Kunal A Dalsania

Howard Bergman

Bishwajit Ghose

Jim LaPlante

Tess McCutcheon

Daniel I McIsaac

Samira Abbasgholizadeh-Rahimi

Nadia Sourial

Manpreet Thandi

Sabrina T Wong

Clare Liddy

2023-12-28

BMJ Open (published)

doi.org

Behavioural pseudometrics for continuous-time diffusions

Linan Chen

Florence Clerc

Prakash Panangaden

2023-12-27

ArXiv (preprint)

doi.org

arxiv.org

Device-Free Human State Estimation using UWB Multi-Static Radios

Saria Al Laham

Bobak H. Baghi

Pierre-Yves Lajoie

Amal Feriani

Sachini Herath

Steve Liu

Gregory Dudek

We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor environment without the requirement that they carry specific devices with them. To achieve this "device free" localization we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high quality estimation from the UWB signals merely reflected off people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single antenna UWB modules for sensing, paired with Raspberry Pi units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state - comprised of location and activity - in a given area. Additionally, we can also estimate the number of humans that occupy this region of interest. In our approach, we first pre-process the CIR data, which involves meticulous aggregation of measurements and extraction of key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.

2023-12-26

ArXiv (preprint)

doi.org

arxiv.org

Fairness-Aware Structured Pruning in Transformers

A. Zayed

Goncalo Mordido

Samira Shabanian

Ioana Baldini

Sarath Chandar Anbil Parthipan

2023-12-24

ArXiv (preprint)

doi.org

arxiv.org

Harnessing Pre-trained Generalist Agents for Software Engineering Tasks

Paulina Stevia Nouwou Mindom

Amin Nikanjam

Foutse Khomh

Nowadays, we are witnessing an increasing adoption of Artificial Intelligence (AI) to develop techniques aimed at improving the reliability, effectiveness, and overall quality of software systems. Deep reinforcement learning (DRL) has recently been successfully used for automation in complex tasks such as game testing and solving the job-shop scheduling problem. However, these specialized DRL agents, trained from scratch on specific tasks, suffer from a lack of generalizability to other tasks and they need substantial time to be developed and re-trained effectively. Recently, DRL researchers have begun to develop generalist agents, able to learn a policy from various environments and capable of achieving performances similar to or better than specialist agents in new tasks. In the Natural Language Processing or Computer Vision domain, these generalist agents are showing promising adaptation capabilities to never-before-seen tasks after a light fine-tuning phase and achieving high performance. This paper investigates the potential of generalist agents for solving software engineering (SE) tasks. Specifically, we conduct an empirical study aimed at assessing the performance of two generalist agents on two important SE tasks: the detection of bugs in games (for two games) and the minimization of makespan in a scheduling task, to solve the job-shop scheduling problem (for two instances). Our results show that the generalist agents outperform the specialist agents with very little effort for fine-tuning, achieving a 20% reduction of the makespan over specialized agent performance on task-based scheduling. In the context of game testing, some generalist agent configurations detect 85% more bugs than the specialist agents. Building on our analysis, we provide recommendations for researchers and practitioners looking to select generalist agents for SE tasks, to ensure that they perform effectively.

2023-12-24

ArXiv (preprint)

doi.org

arxiv.org

Neural manifolds and learning regimes in neural-interface tasks

Alexandre Payeur

Amy L. Orsborn

Guillaume Lajoie

2023-12-23

bioRxiv (preprint)

doi.org

GROOD: GRadient-aware Out-Of-Distribution detection in interpolated manifolds