Portrait of Joseph Paul Cohen is unavailable

Joseph Paul Cohen

Alumni

Publications

Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models
Enoch Amoatey Tetteh
Joseph D Viviano
Learning models that generalize under different distribution shifts in medical imaging has been a long-standing research challenge. There ha… (see more)ve been several proposals for efficient and robust visual representation learning among vision research practitioners, especially in the sensitive and critical biomedical domain. In this paper, we propose an idea for out-of-distribution generalization of chest X-ray pathologies that uses a simple balanced batch sampling technique. We observed that balanced sampling between the multiple training datasets improves the performance over baseline models trained without balancing.
Problèmes associés au déploiement des modèles fondés sur l’apprentissage machine en santé
Tianshi Cao
Joseph D Viviano
Michael Fralick
Marzyeh Ghassemi
Muhammad Mamdani
Russell Greiner
Problems in the deployment of machine-learned models in health care
Tianshi Cao
Joseph D Viviano
Chinwei Huang
Michael Fralick
Marzyeh Ghassemi
M. Mamdani
R. Greiner
Exploring the Wasserstein metric for survival analysis
Survival analysis is a type of semi-supervised task where the target output (the survival time) is often right-censored. Utilizing this info… (see more)rmation is a challenge because it is not obvious how to correctly incorporate these censored examples into a model. We study how three categories of loss functions can take advantage of this information: partial likelihood methods, rank methods, and our own classification method based on a Wasserstein metric (WM) and the non-parametric Kaplan Meier (KM) estimate of the probability density to impute the labels of censored examples. The proposed method predicts the probability distribution of an event, letting us compute survival curves and expected times of survival that are easier to interpret than the rank. We also demonstrate that this approach directly optimizes the expected C-index which is the most common evaluation metric for survival models.
Exploring the Wasserstein metric for time-to-event analysis.
Margaux Luck
Heloise Cardinal
Andrea Lodi
Multi-Domain Balanced Sampling Improves Out-of-Generalization of Chest X-ray Pathology Prediction Models
Enoch Amoatey Tetteh
Joseph D Viviano
Learning models that generalize under different distribution shifts in medical imaging has been a long-standing research challenge. There ha… (see more)ve been several proposals for efficient and robust visual representation learning among vision research practitioners, especially in the sensitive and critical biomedical domain. In this paper, we propose an idea for out-of-distribution generalization of chest X-ray pathologies that uses a simple balanced batch sampling technique. We observed that balanced sampling between the multiple training datasets improves the performance over baseline models trained without balancing. Code for this work is available on Github. 1
Saliency is a Possible Red Herring When Diagnosing Poor Generalization
Joseph D Viviano
Becks Simpson
Poor generalization is one symptom of models that learn to predict target variables using spuriously-correlated image features present only … (see more)in the training distribution instead of the true image features that denote a class. It is often thought that this can be diagnosed visually using attribution (aka saliency) maps. We study if this assumption is correct. In some prediction tasks, such as for medical images, one may have some images with masks drawn by a human expert, indicating a region of the image containing relevant information to make the prediction. We study multiple methods that take advantage of such auxiliary labels, by training networks to ignore distracting features which may be found outside of the region of interest. This mask information is only used during training and has an impact on generalization accuracy depending on the severity of the shift between the training and test distributions. Surprisingly, while these methods improve generalization performance in the presence of a covariate shift, there is no strong correspondence between the correction of attribution towards the features a human expert have labelled as important and generalization performance. These results suggest that the root cause of poor generalization may not always be spatially defined, and raise questions about the utility of masks as 'attribution priors' as well as saliency maps for explainable predictions.
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Timo Milbich
Samarth Sinha
Björn Ommer
Predicting COVID-19 Pneumonia Severity on Chest X-ray With Deep Learning
Beiyi Shen
Almas F Abbasi
Hoshmand Kochi Mahsa
Marzyeh Ghassemi
Haifang Li
Tim Q Duong
Introduction The need to streamline patient management for coronavirus disease-19 (COVID-19) has become more pressing than ever. Chest X-ray… (see more)s (CXRs) provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images. Such a tool can gauge the severity of COVID-19 lung infections (and pneumonia in general) that can be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the ICU. Methods Images from a public COVID-19 database were scored retrospectively by three blinded experts in terms of the extent of lung involvement as well as the degree of opacity. A neural network model that was pre-trained on large (non-COVID-19) chest X-ray datasets is used to construct features for COVID-19 images which are predictive for our task. Results This study finds that training a regression model on a subset of the outputs from this pre-trained chest X-ray model predicts our geographic extent score (range 0-8) with 1.14 mean absolute error (MAE) and our lung opacity score (range 0-6) with 0.78 MAE. Conclusions These results indicate that our model’s ability to gauge the severity of COVID-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the ICU. To enable follow up work, we make our code, labels, and data available online.
Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition
Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
We release the largest public ECG dataset of continuous raw signals for representation learning containing 11 thousand patients and 2 billio… (see more)n labelled beats. Our goal is to enable semi-supervised ECG models to be made as well as to discover unknown subtypes of arrhythmia and anomalous ECG signal events. To this end, we propose an unsupervised representation learning task, evaluated in a semi-supervised fashion. We provide a set of baselines for different feature extractors that can be built upon. Additionally, we perform qualitative evaluations on results from PCA embeddings, where we identify some clustering of known subtypes indicating the potential for representation learning in arrhythmia sub-type discovery.
Torchmeta: A Meta-Learning library for PyTorch
The constant introduction of standardized benchmarks in the literature has helped accelerating the recent advances in meta-learning research… (see more). They offer a way to get a fair comparison between different algorithms, and the wide range of datasets available allows full control over the complexity of this evaluation. However, for a large majority of code available online, the data pipeline is often specific to one dataset, and testing on another dataset requires significant rework. We introduce Torchmeta, a library built on top of PyTorch that enables seamless and consistent evaluation of meta-learning algorithms on multiple datasets, by providing data-loaders for most of the standard benchmarks in few-shot classification and regression, with a new meta-dataset abstraction. It also features some extensions for PyTorch to simplify the development of models compatible with meta-learning algorithms. The code is available here: this https URL