Publications

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Joao Monteiro
Étienne Marcotte
Pierre-Andre Noel
Valentina Zantedeschi
Christopher Pal
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right context isn't known in advance, caching ICL can be challenging. This work addresses these limitations by introducing models that, inspired by the encoder-decoder architecture, use cross-attention to condition generation on reference text without the prompt. More precisely, we leverage pre-trained decoder-only models and only train a small number of added layers. We use Question-Answering (QA) as a testbed to evaluate the ability of our models to perform conditional generation and observe that they outperform ICL, are comparable to fine-tuned prompted LLMs, and drastically reduce the space footprint relative to standard KV caching by two orders of magnitude.
Penalties and Rewards for Fair Learning in Paired Kidney Exchange Programs
Alison Caulfield
Yi Lin
Adrian Vetta
A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechanism. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms that attempt to learn optimal patient-donor weights in advance via dynamic simulations do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
Thomas Krendl Gilbert
Jérémy Scheurer
Javier Rando
Rachel Freedman
Tomasz Korbak
David Lindner
Pedro Freire
Tony Tong Wang
Samuel Marks
Charbel-Raphael Segerie
Micah Carroll
Phillip Christoffersen
Mehul Damani
Stewart Slocum
Usman Anwar
Anand Siththaranjan
Max Nadeau
Eric J Michaud
Jacob Pfau
Dmitrii Krasheninnikov
Xin Chen
Lauro Langosco
Peter Hase
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
Use of Artificial Intelligence in the Identification and Management of Frailty: A Scoping Review Protocol
Sathya Karunananthan
Arya Rahgozar
Ramtin Hakimjavadi
Hui Yan
Kunal A Dalsania
Howard Bergman
Bishwajit Ghose
Jim LaPlante
Tess McCutcheon
Daniel I McIsaac
S. A. Rahimi
Nadia Sourial
Manpreet Thandi
Sabrina T Wong
Clare Liddy
Behavioural pseudometrics for continuous-time diffusions
Cortical neuroprosthesis-mediated functional ipsilateral control of locomotion in rats with spinal cord hemisection
Elena Massai
Isley De Jesus
Roxanne Drainville
Marina Martinez
Control of voluntary limb movement is predominantly attributed to the contralateral motor cortex. However, increasing evidence suggests the involvement of ipsilateral cortical networks in this process, especially in motor tasks requiring bilateral coordination, such as locomotion. In this study, we combined a unilateral thoracic spinal cord injury (SCI) with a cortical neuroprosthetic approach to investigate the functional role of the ipsilateral motor cortex in rat movement through spared contralesional pathways. Our findings reveal that in all SCI rats, stimulation of the ipsilesional motor cortex promoted a bilateral synergy. This synergy involved the elevation of the contralateral foot along with ipsilateral hindlimb extension. Additionally, in two out of seven animals, stimulation of a sub-region of the hindlimb motor cortex modulated ipsilateral hindlimb flexion. Importantly, ipsilateral cortical stimulation delivered after SCI immediately alleviated multiple locomotor and postural deficits, and this effect persisted after ablation of the homologous motor cortex. These results provide strong evidence of a causal link between cortical activation and precise ipsilateral control of hindlimb movement. This study has significant implications for the development of future neuroprosthetic technology and our understanding of motor control in the context of spinal cord injury.
Device-Free Human State Estimation using UWB Multi-Static Radios
Saria Al Lahham
Bobak H. Baghi
Pierre-Yves Lajoie
Amal Feriani
Sachini Herath
Steve Liu
We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor environment without the requirement that they carry specific devices with them. To achieve this "device-free" localization we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high quality estimation from the UWB signals merely reflected off people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single antenna UWB modules for sensing, paired with Raspberry Pi units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state, comprising location and activity, in a given area. Additionally, we can estimate the number of humans that occupy this region of interest. In our approach, we first pre-process the CIR data, which involves meticulous aggregation of measurements and extraction of key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.
Fairness-Aware Structured Pruning in Transformers
Samira Shabanian
Ioana Baldini
A. Chandar
When Nash Meets Stackelberg
Gabriele Dragotto
Felipe Feijoo
Andrea Lodi
Sriram Sankaranarayanan
CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data
Louis Mullie
Jonathan Afilalo
Patrick Archambault
Rima Bouchakri
Kip Brown
David L. Buckeridge
Yiorgos Alexandros Cavayas
Alexis F. Turgeon
Denis Martineau
François Lamontagne
Martine Lebrasseur
Renald Lemieux
Jeffrey Li
Michaël Sauthier
Pascal St-Onge
An Tang
William Witteman
Michael Chassé
Distributed computations facilitate multi-institutional data analysis while avoiding the costs and complexity of data pooling. Existing approaches lack crucial features, such as built-in medical standards and terminologies, no-code data visualizations, explicit disclosure control mechanisms, and support for basic statistical computations, in addition to gradient-based optimization capabilities. We describe the development of the Collaborative Data Analysis (CODA) platform, and the design choices undertaken to address the key needs identified during our survey of stakeholders. We use a public dataset (MIMIC-IV) to demonstrate end-to-end multi-modal federated learning (FL) using CODA. We assessed the technical feasibility of deploying the CODA platform at 9 hospitals in Canada, describe implementation challenges, and evaluate its scalability on large patient populations. The CODA platform was designed, developed, and deployed between January 2020 and January 2023. Software code, documentation, and technical documents were released under an open-source license. Multi-modal federated averaging is illustrated using the MIMIC-IV and MIMIC-CXR datasets. To date, 8 out of the 9 participating sites have successfully deployed the platform, with a total enrolment of >1M patients. Mapping data from legacy systems to FHIR was the biggest barrier to implementation. The CODA platform was developed and successfully deployed in a public healthcare setting in Canada, with heterogeneous information technology systems and capabilities. Ongoing efforts will use the platform to develop and prospectively validate models for risk assessment, proactive monitoring, and resource usage. Further work will also make tools available to facilitate migration from legacy formats to FHIR and DICOM.
A landmark environmental law looks ahead
Robert L. Fischman
J. B. Ruhl
Brenna R. Forester
Tanya M. Lama
Marty Kardos
Grethel Aguilar Rojas
Nicholas A. Robinson
Patrick D. Shirey
Gary A. Lamberti
Amy W. Ando
Stephen Palumbi
Michael Wara
Mark W. Schwartz
Matthew A. Williamson
Tanya Berger-Wolf
Sara Beery
Justin Kitzes
David Thau
Devis Tuia
Daniel Rubenstein
Caleb R. Hickman
Julie Thorstenson
Gregory E. Kaebnick
James P. Collins
Athmeya Jayaram
Thomas Deleuil
Ying Zhao
In late December 1973, the United States enacted what some would come to call “the pitbull of environmental laws.” In the 50 years since, the formidable regulatory teeth of the Endangered Species Act (ESA) have been credited with considerable successes, obliging agencies to draw upon the best available science to protect species and habitats. Yet human pressures continue to push the planet toward extinctions on a massive scale. With that prospect looming, and with scientific understanding ever changing, Science invited experts to discuss how the ESA has evolved and what its future might hold. —Brad Wible
Extended Lyman-alpha emission towards the SPT2349-56 protocluster at $z=4.3$
Yordanka Apostolovski
Manuel Aravena
Timo Anguita
Matthieu Béthermin
James R. Burgoyne
Scott Chapman
C. Breuck
Anthony R Gonzalez
Max Gronke
Lucia Guaita
Ryley Hill
Sreevani Jarugula
E. Johnston
M. Malkan
Desika Narayanan
Cassie Reuter
Manuel Solimano
Justin Spilker
Nikolaus Sulzenauer
Joaquin Vieira
David Vizgan
Axel Weiß
Deep spectroscopic surveys with the Atacama Large Millimeter/submillimeter Array (ALMA) have revealed that some of the brightest infrared sources in the sky correspond to concentrations of submillimeter galaxies (SMGs) at high redshift. Among these, the SPT2349-56 protocluster system is amongst the most extreme examples given its high source density and integrated star formation rate. We conducted a deep Lyman-alpha line emission survey around SPT2349-56 using the Multi-Unit Spectroscopic Explorer (MUSE) at the Very Large Telescope (VLT) in order to characterize this uniquely dense environment. Taking advantage of the deep three-dimensional nature of this survey, we performed a sensitive search for Lyman-alpha emitters (LAEs) toward the core and northern extension of the protocluster, which correspond to the brightest infrared regions in this field. Using a smoothed narrowband image extracted from the MUSE datacube around the protocluster redshift, we searched for possible extended structures. We identify only three LAEs at