Publications

A toolbox for surfacing health equity harms and biases in large language models

Stephen R. Pfohl

Heather Cole-Lewis

Rory A Sayres

Darlene Neal

Mercy Nyamewaa Asiedu

Awa Dieng

Nenad Tomasev

Qazi Mamunur Rashid

Shekoofeh Azizi

Negar Rostamzadeh

Liam G. McCoy

L. A. Celi

Yun Liu

Mike Schaekermann

Alanna Walton

Alicia Parrish

Chirag Nagpal

Preeti Singh

Akeiylah Dewitt

P. A. Mansfield

Sushant Prakash

Katherine Heller

Alan Karthikesalingam

Christopher Semturs

Joelle Barral

Greg C. Corrado

Yossi Matias

Jamila Smith-Loud

Ivor Horn

Karan Singhal

2024-03-18

ArXiv (preprint)

doi.org

arxiv.org

Reinforcement learning for freight booking control problems

Justin Dumouchelle

Emma Frejinger

Andrea Lodi

2024-03-16

Journal of Revenue and Pricing Management (published)

doi.org

arxiv.org

Normalizing Spinal Cord Compression Morphometric Measures: Application in Degenerative Cervical Myelopathy

Sandrine Bédard

Jan Valosek

Maryam Seif PhD

Armin Curt PhD

Simon Schading MD, MSc

Nikolai Pfender

Patrick Freund MD

Markus Hupp MD PhD

Julien Cohen-Adad MD

Objective: Automatic and robust characterization of spinal cord shape from MRI images is relevant to assess the severity of spinal cord compression in degenerative cervical myelopathy (DCM) and to guide therapeutic strategy. Despite its popularity, the maximum spinal cord compression (MSCC) index has practical limitations to objectively assess the severity of cord compression. Firstly, it is computed by normalizing the anteroposterior cord diameter by that above and below the level of compression, but it does not account for the fact that the spinal cord itself varies in size along the superior-inferior axis, making this MSCC sensitive to the level of compression. Secondly, spinal cord shape varies across individuals, making MSCC also sensitive to the size and shape of every individual. Thirdly, MSCC is typically computed by the expert-rater on a single sagittal slice, which is time-consuming and prone to inter-rater variability. In this study, we propose a fully automatic pipeline to compute MSCC. Methods: We extended the traditional MSCC (based on the anteroposterior diameter) to other shape metrics (transverse diameter, area, eccentricity, and solidity), and proposed a normalization strategy using a database of healthy adults (n=203) to address the variability of the spinal cord anatomy between individuals. We validated the proposed method in a cohort of DCM patients (n=120) with manually derived morphometric measures and predicted the therapeutic decision (operative/conservative) using a stepwise binary logistic regression including demographics, clinical scores, and electrophysiological assessment. Results: The automatic and normalized MSCC measures significantly correlated with clinical scores and predicted the therapeutic decision with higher accuracy than the manual MSCC. Results show that the sensory dysfunction of the upper extremities (mJOA subscore), the presence of myelopathy and the proposed MRI-based normalized morphometric measures were significant predictors of the therapeutic decision. The model yielded an area under the curve of the receiver operating characteristic of 80%. Conclusion: The study introduced an automatic method for computation of normalized MSCC measures of cord compression from MRI scans, which is an important step towards better informed therapeutic decisions in DCM patients. The method is open-source and available in the Spinal Cord Toolbox v6.0.

2024-03-15

medRxiv (preprint)

doi.org

Safety Cases: How to Justify the Safety of Advanced AI Systems

Joshua Clymer

Nick Gabrieli

David Scott Krueger

Thomas Larsen

As AI systems become more advanced, companies and regulators will make difficult decisions about whether it is safe to train and deploy them. To prepare for these decisions, we investigate how developers could make a 'safety case,' which is a structured rationale that AI systems are unlikely to cause a catastrophe. We propose a framework for organizing a safety case and discuss four categories of arguments to justify safety: total inability to cause a catastrophe, sufficiently strong control measures, trustworthiness despite capability to cause harm, and -- if AI systems become much more powerful -- deference to credible AI advisors. We evaluate concrete examples of arguments in each category and outline how arguments could be combined to justify that AI systems are safe to deploy.

2024-03-15

ArXiv (preprint)

doi.org

arxiv.org

On the Identifiability of Quantized Factors

Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

2024-03-15

Proceedings of the Third Conference on Causal Learning and Reasoning (published)

proceedings.mlr.press

arxiv.org

Aleatoric and epistemic uncertainty extraction of patient-specific deep learning-based dose predictions in LDR prostate brachytherapy

Francisco Berumen

Samuel Ouellet

Shirin A. Enger

Luc Beaulieu

2024-03-14

Physics in Medicine and Biology (published)

doi.org

Analyzing Data Augmentation for Medical Images: A Case Study in Ultrasound Images

Adam Tupper

Christian Gagné

Data augmentation is one of the most effective techniques to improve the generalization performance of deep neural networks. Yet, despite often facing limited data availability in medical image analysis, it is frequently underutilized. This appears to be due to a gap in our collective understanding of the efficacy of different augmentation techniques across medical imaging tasks and modalities. One domain where this is especially true is breast ultrasound images. This work addresses this issue by analyzing the effectiveness of different augmentation techniques for the classification of breast lesions in ultrasound images. We assess the generalizability of our findings across several datasets, demonstrate that certain augmentations are far more effective than others, and show that their usage leads to significant performance gains.

2024-03-14

ArXiv (preprint)

doi.org

arxiv.org

One-Shot Learning for MIPs with SOS1 Constraints

Charly Robinson La Rocca

Jean-François Cordeau

Emma Frejinger

2024-03-14

ArXiv (preprint)

doi.org

arxiv.org

Bayesian Spectral Graph Denoising with Smoothness Prior

Samuel Leone

Xingzhi Sun

Michael Perlmutter

Smita Krishnaswamy

Here we consider the problem of denoising features associated to complex data, modeled as signals on a graph, via a smoothness prior. This is motivated in part by settings such as single-cell RNA where the data is very high-dimensional, but its structure can be captured via an affinity graph. This allows us to utilize ideas from graph signal processing. In particular, we present algorithms for the cases where the signal is perturbed by Gaussian noise, dropout, and uniformly distributed noise. The signals are assumed to follow a prior distribution defined in the frequency domain which favors signals which are smooth across the edges of the graph. By pairing this prior distribution with our three models of noise generation, we propose Maximum A Posteriori (M.A.P.) estimates of the true signal in the presence of noisy data and provide algorithms for computing the M.A.P. Finally, we demonstrate the algorithms’ ability to effectively restore signals from white noise on image data and from severe dropout in single-cell RNA sequence data.

2024-03-13

Annual Conference on Information Sciences and Systems (published)

doi.org

arxiv.org

Bugs in Large Language Models Generated Code: An Empirical Study

Florian Tambon

Arghavan Moradi Dakhel

Amin Nikanjam

Foutse Khomh

Michel C. Desmarais

Giuliano Antoniol

Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-lasting dream in Software Engineering (SE), i.e., automatic code generation. Similar to human-written code, LLM-generated code is prone to bugs, and these bugs have not yet been thoroughly examined by the community. Given the increasing adoption of LLM-based code generation tools (e.g., GitHub Copilot) in SE activities, it is critical to understand the characteristics of bugs contained in code generated by LLMs. This paper examines a sample of 333 bugs collected from code generated using three leading LLMs (i.e., CodeGen, PanGu-Coder, and Codex) and identifies the following 10 distinctive bug patterns: Misinterpretations, Syntax Error, Silly Mistake, Prompt-biased code, Missing Corner Case, Wrong Input Type, Hallucinated Object, Wrong Attribute, Incomplete Generation, and Non-Prompted Consideration. The bug patterns are presented in the form of a taxonomy. The identified bug patterns are validated using an online survey with 34 LLM practitioners and researchers. The surveyed participants generally asserted the significance and prevalence of the bug patterns. Researchers and practitioners can leverage these findings to develop effective quality assurance techniques for LLM-generated code. This study sheds light on the distinctive characteristics of LLM-generated code.

2024-03-13

ArXiv (preprint)

doi.org

arxiv.org

Online Bayesian optimization of vagus nerve stimulation.

Lorenz Wernisch

Tristan Edwards

Antonin Berthon

Olivier Tessier-Larivière

Elvijs Sarkans

Myrta Stoukidi

Pascal Fortier-Poisson

Max Pinkney

Michael Thornton

Catherine Hanley

Susannah Lee

Joel Jennings

Ben Appleton

Philip Garsed

Bret Patterson

Will Buttinger

Samuel Gonshaw

Matjaž Jakopec

Sudhakaran Shunmugam

Jorin Mamen

Aleksi Tukiainen

Guillaume Lajoie

Oliver Armitage

Emil Hewage

Objective. In bioelectronic medicine, neuromodulation therapies induce neural signals to the brain or organs, modifying their function. Stimulation devices capable of triggering exogenous neural signals using electrical waveforms require a complex and multi-dimensional parameter space to control such waveforms. Determining the best combination of parameters (waveform optimization or dosing) for treating a particular patient's illness is therefore challenging. Comprehensive parameter searching for an optimal stimulation effect is often infeasible in a clinical setting due to the size of the parameter space. Restricting this space, however, may lead to suboptimal therapeutic results, reduced responder rates, and adverse effects. Approach. As an alternative to a full parameter search, we present a flexible machine learning, data acquisition, and processing framework for optimizing neural stimulation parameters, requiring as few steps as possible using Bayesian optimization. This optimization builds a model of the neural and physiological responses to stimulations, enabling it to optimize stimulation parameters and provide estimates of the accuracy of the response model. The vagus nerve innervates, among other thoracic and visceral organs, the heart, thus controlling heart rate, making it an ideal candidate for demonstrating the effectiveness of our approach. Main results. The efficacy of our optimization approach was first evaluated on simulated neural responses, then applied to vagus nerve stimulation intraoperatively in porcine subjects. Optimization converged quickly on parameters achieving target heart rates and optimizing neural B-fiber activations despite high intersubject variability. Significance. An optimized stimulation waveform was achieved in real time with far fewer stimulations than required by alternative optimization strategies, thus minimizing exposure to side effects. Uncertainty estimates helped avoid stimulations outside a safe range. Our approach shows that a complex set of neural stimulation parameters can be optimized in real time for a patient to achieve personalized precision dosing.

2024-03-13

Journal of Neural Engineering (published)

doi.org
