A toolbox for surfacing health equity harms and biases in large language models
Stephen R. Pfohl
Heather Cole-Lewis
Rory A. Sayres
Darlene Neal
Mercy Nyamewaa Asiedu
Awa Dieng
Nenad Tomasev
Qazi Mamunur Rashid
Shekoofeh Azizi
Liam G. McCoy
L. A. Celi
Yun Liu
Mike Schaekermann
Alanna Walton
Alicia Parrish
Chirag Nagpal
Preeti Singh
Akeiylah Dewitt
P. A. Mansfield
Sushant Prakash
Katherine Heller
Alan Karthikesalingam
Christopher Semturs
Joelle Barral
Greg C. Corrado
Yossi Matias
Jamila Smith-Loud
Ivor Horn
Karan Singhal
Reinforcement learning for freight booking control problems
Justin Dumouchelle
Andrea Lodi
Normalizing Spinal Cord Compression Morphometric Measures: Application in Degenerative Cervical Myelopathy
Sandrine Bédard
Jan Valošek
Maryam Seif, PhD
Armin Curt, PhD
Simon Schading, MD, MSc
Nikolai Pfender
Patrick Freund, MD
Markus Hupp, MD, PhD
Julien Cohen-Adad, MD
Objective: Automatic and robust characterization of spinal cord shape from MRI scans is relevant to assessing the severity of spinal cord compression in degenerative cervical myelopathy (DCM) and to guiding therapeutic strategy. Despite its popularity, the maximum spinal cord compression (MSCC) index has practical limitations as an objective measure of compression severity. First, it is computed by normalizing the anteroposterior cord diameter at the compressed level by the diameters above and below it, without accounting for the fact that the cord itself varies in size along the superior-inferior axis, making the MSCC sensitive to the level of compression. Second, spinal cord shape varies across individuals, making the MSCC also sensitive to each individual's cord size and shape. Third, the MSCC is typically computed by an expert rater on a single sagittal slice, which is time-consuming and prone to inter-rater variability. In this study, we propose a fully automatic pipeline to compute the MSCC.
Methods: We extended the traditional MSCC (based on the anteroposterior diameter) to other shape metrics (transverse diameter, area, eccentricity, and solidity) and proposed a normalization strategy using a database of healthy adults (n=203) to address inter-individual variability in spinal cord anatomy. We validated the proposed method against manually derived morphometric measures in a cohort of DCM patients (n=120) and predicted the therapeutic decision (operative/conservative) using stepwise binary logistic regression including demographics, clinical scores, and electrophysiological assessment.
Results: The automatic, normalized MSCC measures correlated significantly with clinical scores and predicted the therapeutic decision with higher accuracy than the manual MSCC. Sensory dysfunction of the upper extremities (mJOA subscore), the presence of myelopathy, and the proposed MRI-based normalized morphometric measures were significant predictors of the therapeutic decision, and the model yielded an area under the receiver operating characteristic curve of 80%.
Conclusion: This study introduced an automatic method for computing normalized MSCC measures of cord compression from MRI scans, an important step towards better-informed therapeutic decisions in DCM patients. The method is open source and available in the Spinal Cord Toolbox v6.0.
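To make the index concrete, here is a minimal sketch of the traditional MSCC formula and of the normalization idea the abstract describes; the function names and the healthy-database normalization step are illustrative assumptions, not the Spinal Cord Toolbox implementation.

```python
# Sketch of the traditional MSCC index and the proposed normalization idea.
# Names and the normalization step are assumptions for illustration only.

def mscc(d_compressed: float, d_above: float, d_below: float) -> float:
    """Traditional MSCC (in percent): the anteroposterior (AP) cord diameter
    at the compressed level, normalized by the mean AP diameter above and
    below the compression."""
    return (1.0 - d_compressed / ((d_above + d_below) / 2.0)) * 100.0

def normalized_metric(patient_value: float, healthy_mean: float) -> float:
    """Normalize any shape metric (AP or transverse diameter, area,
    eccentricity, solidity) by its level-matched mean in a healthy-adult
    database, reducing sensitivity to compression level and to individual
    anatomy."""
    return patient_value / healthy_mean

# Example: a 6 mm AP diameter at the compression with 8 mm above and below
# gives MSCC = (1 - 6/8) * 100 = 25%.
print(mscc(6.0, 8.0, 8.0))
```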
Safety Cases: How to Justify the Safety of Advanced AI Systems
Joshua Clymer
Nick Gabrieli
Thomas Larsen
As AI systems become more advanced, companies and regulators will make difficult decisions about whether it is safe to train and deploy them. To prepare for these decisions, we investigate how developers could make a 'safety case': a structured rationale that AI systems are unlikely to cause a catastrophe. We propose a framework for organizing a safety case and discuss four categories of arguments to justify safety: total inability to cause a catastrophe, sufficiently strong control measures, trustworthiness despite capability to cause harm, and, should AI systems become much more powerful, deference to credible AI advisors. We evaluate concrete examples of arguments in each category and outline how arguments could be combined to justify that AI systems are safe to deploy.
On the Identifiability of Quantized Factors
Vitória Barin Pacela
Kartik Ahuja
Disentanglement aims to recover meaningful latent ground-truth factors solely from the observed distribution, and is formalized through the theory of identifiability. Identifying independent latent factors is provably impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.
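A schematic formalization may help situate the claim; the notation below is ours and only sketches the stated assumptions (a diffeomorphic mixing map and independent density discontinuities), not the paper's exact definitions.

```latex
% Setup (notation is ours, for illustration): a diffeomorphism maps latent
% factors to observations,
\[
  x = f(z), \qquad f \colon \mathcal{Z} \to \mathcal{X}
  \ \text{a diffeomorphism}, \qquad z = (z_1, \dots, z_d) \sim p(z).
\]
% Assumption: each factor's density is discontinuous at a finite set of
% thresholds $T_i \subset \mathbb{R}$ (independent discontinuities), with no
% requirement that the factors be statistically independent.
% Informal claim: the quantization induced by each factor's thresholds,
\[
  q_i(z_i) = \#\{\, t \in T_i : t \le z_i \,\},
\]
% is recoverable from the observed distribution alone, up to a permutation
% of the factor indices; this is the sense in which the quantized factors
% are identifiable.
```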
Aleatoric and epistemic uncertainty extraction of patient-specific deep learning-based dose predictions in LDR prostate brachytherapy
Francisco Berumen
Samuel Ouellet
Luc Beaulieu
Analyzing Data Augmentation for Medical Images: A Case Study in Ultrasound Images
Adam Tupper
Data augmentation is one of the most effective techniques for improving the generalization performance of deep neural networks. Yet, despite the limited data availability often faced in medical image analysis, it is frequently underutilized. This appears to be due to a gap in our collective understanding of the efficacy of different augmentation techniques across medical imaging tasks and modalities. One domain where this is especially true is breast ultrasound imaging. This work addresses the issue by analyzing the effectiveness of different augmentation techniques for the classification of breast lesions in ultrasound images. We assess the generalizability of our findings across several datasets, demonstrate that certain augmentations are far more effective than others, and show that their usage leads to significant performance gains.
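As a concrete illustration of the kind of pipelines such a comparison covers, a minimal augmentation configuration for ultrasound image classification is sketched below; the choice of transforms and all parameter values are assumptions for illustration, not the settings evaluated in the study.

```python
# Illustrative augmentation pipeline for breast-ultrasound classification.
# Transform choices and parameters are assumptions, not the study's settings.
import torchvision.transforms as T

train_transforms = T.Compose([
    T.RandomHorizontalFlip(p=0.5),                 # probe orientation varies
    T.RandomRotation(degrees=15),                  # small probe-angle changes
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),    # field-of-view variation
    T.ColorJitter(brightness=0.2, contrast=0.2),   # gain/contrast variation
    T.ToTensor(),
])
```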
One-Shot Learning for MIPs with SOS1 Constraints
Charly Robinson La Rocca
Jean-François Cordeau
Bayesian Spectral Graph Denoising with Smoothness Prior
Samuel Leone
Xingzhi Sun
Michael Perlmutter
Here we consider the problem of denoising features associated with complex data, modeled as signals on a graph, via a smoothness prior. This is motivated in part by settings such as single-cell RNA sequencing, where the data is very high-dimensional but its structure can be captured via an affinity graph, allowing us to use ideas from graph signal processing. In particular, we present algorithms for the cases where the signal is perturbed by Gaussian noise, dropout, and uniformly distributed noise. The signals are assumed to follow a prior distribution defined in the frequency domain that favors signals which are smooth across the edges of the graph. By pairing this prior distribution with our three models of noise generation, we propose Maximum A Posteriori (M.A.P.) estimates of the true signal in the presence of noisy data and provide algorithms for computing them. Finally, we demonstrate the algorithms' ability to effectively restore signals from white noise on image data and from severe dropout in single-cell RNA sequencing data.
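For the Gaussian-noise case, the M.A.P. estimate under a Laplacian smoothness prior has a well-known closed form that may clarify the approach. The sketch below uses our own notation (lam is an assumed smoothness weight) and is not the authors' implementation.

```python
# Minimal sketch of the Gaussian-noise M.A.P. estimate under a graph
# smoothness prior: x_hat = (I + lam * L)^{-1} y (graph Tikhonov filtering).
# Notation and the value of lam are assumptions for illustration.
import numpy as np

def map_denoise_gaussian(y: np.ndarray, L: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """M.A.P. denoising of graph signal y under i.i.d. Gaussian noise and a
    prior favoring signals that are smooth across the edges of the graph
    (L is the graph Laplacian)."""
    n = L.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, y)

# Example on a 3-node path graph: the filter pulls neighboring values together.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
print(map_denoise_gaussian(np.array([1.0, 5.0, 1.0]), L, lam=0.5))
```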
Bugs in Large Language Models Generated Code: An Empirical Study
Florian Tambon
Arghavan Moradi Dakhel
Amin Nikanjam
Michel C. Desmarais
Giuliano Antoniol
Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-standing dream in Software Engineering (SE): automatic code generation. Like human-written code, LLM-generated code is prone to bugs, and these bugs have not yet been thoroughly examined by the community. Given the increasing adoption of LLM-based code generation tools (e.g., GitHub Copilot) in SE activities, it is critical to understand the characteristics of bugs contained in code generated by LLMs. This paper examines a sample of 333 bugs collected from code generated by three leading LLMs (CodeGen, PanGu-Coder, and Codex) and identifies the following 10 distinctive bug patterns: Misinterpretations, Syntax Error, Silly Mistake, Prompt-biased Code, Missing Corner Case, Wrong Input Type, Hallucinated Object, Wrong Attribute, Incomplete Generation, and Non-Prompted Consideration. The bug patterns are presented in the form of a taxonomy and validated through an online survey with 34 LLM practitioners and researchers, who generally asserted the significance and prevalence of the patterns. Researchers and practitioners can leverage these findings to develop effective quality assurance techniques for LLM-generated code. This study sheds light on the distinctive characteristics of LLM-generated code.
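To give a flavor of the taxonomy, here is a hypothetical instance of the "Missing Corner Case" pattern; it is our own illustration, not one of the 333 sampled bugs.

```python
# Hypothetical "Missing Corner Case" bug: correct on the happy path, but the
# edge case implied by the prompt ("average of a list of numbers") is missed.
def average(values):
    return sum(values) / len(values)   # bug: ZeroDivisionError on empty input

def average_fixed(values):
    return sum(values) / len(values) if values else 0.0
```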
Online Bayesian optimization of vagus nerve stimulation.
Lorenz Wernisch
Tristan Edwards
Antonin Berthon
Olivier Tessier-Lariviere
Elvijs Sarkans
Myrta Stoukidi
Pascal Fortier-Poisson
Max Pinkney
Michael Thornton
Catherine Hanley
Susannah Lee
Joel Jennings
Ben Appleton
Philip Garsed
Bret Patterson
Will Buttinger
Samuel Gonshaw
Matjaž Jakopec
Sudhakaran Shunmugam
Jorin Mamen
Aleksi Tukiainen
Oliver Armitage
Emil Hewage
Objective: In bioelectronic medicine, neuromodulation therapies induce neural signals to the brain or organs, modifying their function. Stimulation devices capable of triggering exogenous neural signals using electrical waveforms require a complex and multi-dimensional parameter space to control such waveforms. Determining the best combination of parameters (waveform optimization, or dosing) for treating a particular patient's illness is therefore challenging. A comprehensive search of the parameter space for an optimal stimulation effect is often infeasible in a clinical setting due to the size of that space; restricting the space, however, may lead to suboptimal therapeutic results, reduced responder rates, and adverse effects.
Approach: As an alternative to a full parameter search, we present a flexible machine learning, data acquisition, and processing framework that uses Bayesian optimization to optimize neural stimulation parameters in as few steps as possible. The optimization builds a model of the neural and physiological responses to stimulations, enabling it to optimize stimulation parameters while providing estimates of the response model's accuracy. The vagus nerve innervates, among other thoracic and visceral organs, the heart, thus controlling heart rate, making it an ideal candidate for demonstrating the effectiveness of our approach.
Main results: The efficacy of our optimization approach was first evaluated on simulated neural responses, then applied to vagus nerve stimulation intraoperatively in porcine subjects. Optimization converged quickly on parameters achieving target heart rates and optimizing neural B-fiber activations despite high inter-subject variability.
Significance: An optimized stimulation waveform was achieved in real time with far fewer stimulations than required by alternative optimization strategies, thus minimizing exposure to side effects. Uncertainty estimates helped avoid stimulations outside a safe range. Our approach shows that a complex set of neural stimulation parameters can be optimized in real time for an individual patient to achieve personalized precision dosing.
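The core loop such a framework builds on can be sketched with an off-the-shelf Bayesian optimizer; everything below (the toy response model, the target, the parameter ranges) is an assumption for illustration, not the authors' system or clinical values.

```python
# Sketch of a Bayesian-optimization dosing loop: a Gaussian-process surrogate
# over stimulation parameters, queried with an acquisition function so that
# few stimulations are needed. All values below are illustrative assumptions.
import numpy as np
from skopt import gp_minimize

TARGET_HR_CHANGE = -10.0  # assumed target heart-rate change (beats/min)

def simulated_response(amplitude: float, pulse_width: float) -> float:
    """Toy stand-in for the evoked heart-rate change; the real framework
    measures neural and physiological responses to each stimulation."""
    return -12.0 * np.tanh(amplitude * pulse_width / 500.0)

def objective(params):
    amplitude, pulse_width = params
    return abs(simulated_response(amplitude, pulse_width) - TARGET_HR_CHANGE)

result = gp_minimize(
    objective,
    dimensions=[(0.1, 3.0),        # assumed amplitude range (mA)
                (100.0, 1000.0)],  # assumed pulse-width range (us)
    n_calls=25,                    # keep the number of stimulations small
    acq_func="EI",                 # expected improvement
    random_state=0,
)
print("best parameters:", result.x, "| residual error:", result.fun)
```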