Publications

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

Joao Monteiro

Étienne Marcotte

Pierre-Andre Noel

Valentina Zantedeschi

David Vázquez

Nicolas Chapados

Christopher Pal

Perouz Taslakian

In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right context isn't known in advance, caching ICL can be challenging. This work addresses these limitations by introducing models that, inspired by the encoder-decoder architecture, use cross-attention to condition generation on reference text without the prompt. More precisely, we leverage pre-trained decoder-only models and only train a small number of added layers. We use Question-Answering (QA) as a testbed to evaluate the ability of our models to perform conditional generation and observe that they outperform ICL, are comparable to fine-tuned prompted LLMs, and drastically reduce the space footprint relative to standard KV caching by two orders of magnitude.

2023-12-31

EMNLP (Findings) (published)

doi.org

arxiv.org

Penalties and Rewards for Fair Learning in Paired Kidney Exchange Programs

Margarida Carvalho

Alison Caulfield

Yi Lin

Adrian Vetta

A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechanism. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms, that attempt to learn optimal patient-donor weights in advance via dynamic simulations, do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.

2023-12-30

Web and Internet Economics (published)

doi.org

arxiv.org

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper

Xander Davies

Claudia Shi

Thomas Krendl Gilbert

Jérémy Scheurer

Javier Rando

Rachel Freedman

Tomasz Korbak

David Lindner

Pedro Freire

Tony Tong Wang

Samuel Marks

Charbel-Raphael Segerie

Micah Carroll

Andi Peng

Phillip Christoffersen

Mehul Damani

Stewart Slocum

Usman Anwar

Anand Siththaranjan

Max Nadeau

Eric J Michaud

Jacob Pfau

Dmitrii Krasheninnikov

Xin Chen

Lauro Langosco

Peter Hase

Erdem Biyik

Anca Dragan

David M. Krueger

Dorsa Sadigh

Dylan Hadfield-Menell

2023-12-29

TMLR (accepted)

doi.org

openreview.net

Use of Artificial Intelligence in the Identification and Management of Frailty: A Scoping Review Protocol

Sathya Karunananthan

Arya Rahgozar

Ramtin Hakimjavadi

Hui Yan

Kunal A Dalsania

Howard Bergman

Bishwajit Ghose

Jim LaPlante

Tess McCutcheon

Daniel I McIsaac

S. A. Rahimi

Nadia Sourial

Manpreet Thandi

Sabrina T Wong

Clare Liddy

2023-12-27

BMJ Open (published)

doi.org

Behavioural pseudometrics for continuous-time diffusions

Linan Chen

Florence Clerc

Prakash Panangaden

2023-12-26

ArXiv (preprint)

doi.org

arxiv.org

Cortical neuroprosthesis-mediated functional ipsilateral control of locomotion in rats with spinal cord hemisection

Elena Massai

Marco Bonizzato

Isley De Jesus

Roxanne Drainville

Marina Martinez

Control of voluntary limb movement is predominantly attributed to the contralateral motor cortex. However, increasing evidence suggests the involvement of ipsilateral cortical networks in this process, especially in motor tasks requiring bilateral coordination, such as locomotion. In this study, we combined a unilateral thoracic spinal cord injury (SCI) with a cortical neuroprosthetic approach to investigate the functional role of the ipsilateral motor cortex in rat movement through spared contralesional pathways. Our findings reveal that in all SCI rats, stimulation of the ipsilesional motor cortex promoted a bilateral synergy. This synergy involved the elevation of the contralateral foot along with ipsilateral hindlimb extension. Additionally, in two out of seven animals, stimulation of a sub-region of the hindlimb motor cortex modulated ipsilateral hindlimb flexion. Importantly, ipsilateral cortical stimulation delivered after SCI immediately alleviated multiple locomotor and postural deficits, and this effect persisted after ablation of the homologous motor cortex. These results provide strong evidence of a causal link between cortical activation and precise ipsilateral control of hindlimb movement. This study has significant implications for the development of future neuroprosthetic technology and our understanding of motor control in the context of spinal cord injury.

2023-12-26

eLife (published)

doi.org

Device-Free Human State Estimation using UWB Multi-Static Radios

Saria Al Lahham

Bobak H. Baghi

Pierre-Yves Lajoie

Amal Feriani

Sachini Herath

Steve Liu

Gregory Dudek

We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor environment without the requirement that they carry a specific device with them. To achieve this "device-free" localization we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high quality estimation from the UWB signals merely reflected off people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single antenna UWB modules for sensing, paired with Raspberry Pi units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state, comprising location and activity, in a given area. Additionally, we can also estimate the number of humans that occupy this region of interest. In our approach, first, we pre-process the CIR data, which involves meticulous aggregation of measurements and extraction of key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.

2023-12-25

ArXiv (preprint)

doi.org

arxiv.org

Fairness-Aware Structured Pruning in Transformers

Abdelrahman Zayed

Goncalo Mordido

Samira Shabanian

Ioana Baldini

A. Chandar

2023-12-23

ArXiv (preprint)

doi.org

arxiv.org

When Nash Meets Stackelberg

Margarida Carvalho

Gabriele Dragotto

Felipe Feijoo

Andrea Lodi

Sriram Sankaranarayanan

2023-12-21

Management Science (published)

doi.org

arxiv.org

CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data

Louis Mullie

Jonathan Afilalo

Patrick Archambault

Rima Bouchakri

Kip Brown

David L. Buckeridge

Yiorgos Alexandros Cavayas

Alexis F. Turgeon

Denis Martineau

François Lamontagne

Martine Lebrasseur

Renald Lemieux

Jeffrey Li

Michaël Sauthier

Pascal St-Onge

An Tang

William Witteman

Michael Chassé

Distributed computations facilitate multi-institutional data analysis while avoiding the costs and complexity of data pooling. Existing approaches lack crucial features, such as built-in medical standards and terminologies, no-code data visualizations, explicit disclosure control mechanisms, and support for basic statistical computations, in addition to gradient-based optimization capabilities. We describe the development of the Collaborative Data Analysis (CODA) platform, and the design choices undertaken to address the key needs identified during our survey of stakeholders. We use a public dataset (MIMIC-IV) to demonstrate end-to-end multi-modal FL using CODA. We assessed the technical feasibility of deploying the CODA platform at 9 hospitals in Canada, describe implementation challenges, and evaluate its scalability on large patient populations. The CODA platform was designed, developed, and deployed between January 2020 and January 2023. Software code, documentation, and technical documents were released under an open-source license. Multi-modal federated averaging is illustrated using the MIMIC-IV and MIMIC-CXR datasets. To date, 8 out of the 9 participating sites have successfully deployed the platform, with a total enrolment of >1M patients. Mapping data from legacy systems to FHIR was the biggest barrier to implementation. The CODA platform was developed and successfully deployed in a public healthcare setting in Canada, with heterogeneous information technology systems and capabilities. Ongoing efforts will use the platform to develop and prospectively validate models for risk assessment, proactive monitoring, and resource usage. Further work will also make tools available to facilitate migration from legacy formats to FHIR and DICOM.

2023-12-20

J. Am. Medical Informatics Assoc. (published)

doi.org

A landmark environmental law looks ahead

Robert L. Fischman

J. B. Ruhl

Brenna R. Forester

Tanya M. Lama

Marty Kardos

Grethel Aguilar Rojas

Nicholas A. Robinson

Patrick D. Shirey

Gary A. Lamberti

Amy W. Ando

Stephen Palumbi

Michael Wara

Mark W. Schwartz

Matthew A. Williamson

Tanya Berger-Wolf

Sara Beery

David Rolnick

Justin Kitzes

David Thau

Devis Tuia

Daniel Rubenstein

Caleb R. Hickman

Julie Thorstenson

Gregory E. Kaebnick

James P. Collins

Athmeya Jayaram

Thomas Deleuil

Ying Zhao

In late December 1973, the United States enacted what some would come to call “the pitbull of environmental laws.” In the 50 years since, the formidable regulatory teeth of the Endangered Species Act (ESA) have been credited with considerable successes, obliging agencies to draw upon the best available science to protect species and habitats. Yet human pressures continue to push the planet toward extinctions on a massive scale. With that prospect looming, and with scientific understanding ever changing, Science invited experts to discuss how the ESA has evolved and what its future might hold. —Brad Wible

2023-12-20

Science (published)

doi.org

Extended Lyman-alpha emission towards the SPT2349-56 protocluster at $z=4.3$

Yordanka Apostolovski

Manuel Aravena

Timo Anguita

Matthieu Béthermin

James R. Burgoyne

Scott Chapman

C. Breuck

Anthony R Gonzalez

Max Gronke

Lucia Guaita

Yashar Hezaveh

Ryley Hill

Sreevani Jarugula

E. Johnston

M. Malkan

Desika Narayanan

Cassie Reuter

Manuel Solimano

Justin Spilker

Nikolaus Sulzenauer

Joaquin Vieira

David Vizgan

Axel Weiß

Deep spectroscopic surveys with the Atacama Large Millimeter/submillimeter Array (ALMA) have revealed that some of the brightest infrared sources in the sky correspond to concentrations of submillimeter galaxies (SMGs) at high redshift. Among these, the SPT2349-56 protocluster system is amongst the most extreme examples given its high source density and integrated star formation rate. We conducted a deep Lyman-alpha line emission survey around SPT2349-56 using the Multi-Unit Spectroscopic Explorer (MUSE) at the Very Large Telescope (VLT) in order to characterize this uniquely dense environment. Taking advantage of the deep three-dimensional nature of this survey, we performed a sensitive search for Lyman-alpha emitters (LAEs) toward the core and northern extension of the protocluster, which correspond to the brightest infrared regions in this field. Using a smoothed narrowband image extracted from the MUSE datacube around the protocluster redshift, we searched for possible extended structures. We identify only three LAEs at

2023-12-19

Astronomy & Astrophysics (published)

doi.org

arxiv.org
