A collaboration with Stony Brook Medicine to Build a COVID-19 Severity Prediction Tool

by Joseph Paul Cohen, Lan Dao, Paul Morrison

Figure 1: An example portable X-ray unit. Image credit: auntminne

Editorial Note: The work presented in this post is undergoing peer review and has been made available for consultation and research purposes. It should not be used to guide clinical practice until results from a medical trial verify its efficacy.

In recent months, the need to streamline patient management for COVID-19 has become more pressing than ever. The increased strain that the pandemic has placed on healthcare systems worldwide has prompted many physicians to turn to new strategies and technologies. A team at Mila led by Joseph Paul Cohen, including Lan Dao, Paul Morrison, and Yoshua Bengio, together with a team at the Vector Institute led by Marzyeh Ghassemi, including Karsten Roth, Laleh Seyyed-Kalantari, and Parsa Torabian, has been working to develop these technologies.

Why Chest X-Rays?

Chest X-rays (CXRs) offer a quick, non-invasive, and potentially bedside tool for monitoring the progression of the disease. They also expose patients to lower doses of radiation than computed tomography (CT), another popular imaging tool.

As early as March 2020, hospitals in China used artificial intelligence (AI)-assisted CT imaging analysis to streamline COVID-19 patient care. Many teams have since launched AI initiatives to improve triaging of COVID-19 patients (i.e., discharge, general admission, or transfer to the intensive care unit) and allocation of hospital resources (e.g., deciding between non-invasive and invasive ventilation). Only recently have practically deployable CXR-based models appeared alongside these tools based on clinical data.

Although AI-assisted tools might be effective, they do not supersede clinical judgment. Premature implementation in hospitals can be avoided by robust evaluation along several practical axes; in this regard, proper clinical trials allow researchers to accurately assess the diagnostic performance of their models. Even in the urgency of a pandemic, implementing such tools before clinical validation makes it impossible to evaluate whether they are saving lives and improving care or, on the contrary, are detrimental to patients and confer no net gain.

An Intuitive Collaboration

Partnering with Drs. Tim Duong and Haifang Li at Stony Brook Medicine allows us to have a strong clinical impact. As a hospital at the forefront of the COVID-19 battle, Stony Brook provides a test platform to trial algorithms and a large team to label data and identify how our algorithms can fit into clinical practice. This not only allows us to build and test models, but also supports reproducibility and comparability between teams. In a medical landscape where diagnostic algorithms are progressively giving way to predictive ones, Stony Brook and Mila Medical are working toward implementing our predictive model in a clinical setting. Evaluating the model there would reduce the risk of overfitting and bias and permit accurate performance estimation.

In a recent study, we present a severity score prediction model for COVID-19 pneumonia from frontal chest X-ray images (not yet intended for medical use), based on our open COVID-19 CXR dataset; example predictions are shown in Figure 2. The study finds that a regression model trained on a subset of the outputs from a pre-trained chest X-ray model predicts our geographic extent score (range 0-8, 8 being the most severe) with a mean absolute error (MAE) of 1.14 and our lung opacity score (range 0-6, 6 being the most severe) with an MAE of 0.78. Such a tool can gauge the severity of COVID-19 pneumonia, which can inform escalation or de-escalation of care as well as monitoring of treatment efficacy, especially in the intensive care unit (ICU).
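To make the evaluation concrete, here is a minimal sketch of fitting a linear regression on a single model output and scoring it with MAE. It is not the code from the study: the feature values and scores are invented for illustration, the use of one feature is an assumption, and the function names are hypothetical. Predictions are clipped to the 0-8 range of the geographic extent score.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x, slope, intercept, lo, hi):
    """Predict a severity score and clip it to the valid range [lo, hi]."""
    return min(hi, max(lo, slope * x + intercept))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: one output of a pre-trained CXR model per image,
# paired with a radiologist-assigned geographic extent score (0-8).
train_feature = [0.1, 0.3, 0.5, 0.7, 0.9]
train_extent = [1.0, 2.0, 4.0, 6.0, 7.0]
test_feature = [0.2, 0.6, 1.0]
test_extent = [2.0, 5.0, 8.0]

slope, intercept = fit_line(train_feature, train_extent)
preds = [predict(x, slope, intercept, 0.0, 8.0) for x in test_feature]
mae = mean_absolute_error(test_extent, preds)
print(f"Geographic extent MAE on held-out images: {mae:.2f}")
```

An MAE reported this way is in the same units as the score itself, so a value near 1 on a 0-8 scale means predictions are off by about one severity point on average.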

Figure 2: Example predictions of geographic extent and opacity targets, along with saliency maps showing which regions of the image could change the prediction. Note that saliency maps only approximate what the model uses to predict, but they often provide insight.

Assistive Patient Management

Patients’ CXRs could be scored regularly after diagnosis in order to monitor evolution of the disease and treatment response. Eventually, these CXRs used for tracking could be uploaded to our dataset, allowing researchers to design predictive tools and better understand recovery. Our severity score could also be used as an objective and quantitative tool to study response to different treatments and management algorithms, inspiring better management strategies.

Management of patients in the ICU can be assisted using a model which predicts the severity of COVID-19 pneumonia and pneumonia in general based on CXR. This has already been done by other teams using non-ML methods to create a score-based predictive model for transfer to the ICU that combines information from CT scans and CXR with non-imaging data.

Given a sufficiently expressive representation, patients can be plotted as shown in Figure 3, which presents a conceptual figure (Figure 3a) as well as our current realization (Figure 3b). These illustrations were created using a pre-trained CXR model and show examples of available trajectories and patient outcomes. Such an approach could allow us to explore the learned representation by iterating quickly with a medical team and ultimately make sense of the complexities of the model and patients.
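As a rough illustration of how such trajectories could be assembled, the sketch below groups embedded CXRs by patient and links consecutive timepoints into arrows. It assumes each image has already been projected to 2D coordinates (for example by a method such as UMAP); the record layout, identifiers, and function name are hypothetical and not taken from our code.

```python
from collections import defaultdict


def build_trajectories(records):
    """Connect each patient's embedded CXRs in chronological order.

    records: iterable of (patient_id, day, (x, y)) tuples, where (x, y)
    is the 2D position of one CXR in the embedding.
    Returns a dict mapping patient_id to a list of arrows, each arrow
    being a ((x0, y0), (x1, y1)) pair between consecutive timepoints.
    """
    by_patient = defaultdict(list)
    for patient_id, day, point in records:
        by_patient[patient_id].append((day, point))

    arrows = {}
    for patient_id, visits in by_patient.items():
        visits.sort(key=lambda visit: visit[0])  # order by day
        segments = [(a[1], b[1]) for a, b in zip(visits, visits[1:])]
        if segments:  # patients with a single CXR have no trajectory
            arrows[patient_id] = segments
    return arrows


# Hypothetical embedded images: two patients, unordered timepoints.
records = [
    ("p1", 3, (0.5, 1.0)),
    ("p1", 0, (0.0, 0.0)),
    ("p2", 0, (2.0, 2.0)),
    ("p1", 7, (1.5, 1.2)),
]
arrows = build_trajectories(records)
```

Each arrow can then be drawn on top of the scatter plot and colored by outcome when one is known.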

Our upcoming research efforts will be dedicated to the improvement of the predictive model to include trajectory.

Figure 3: A Uniform Manifold Approximation and Projection (UMAP) visualization of each CXR from our open COVID-19 image data collection together with the Kaggle RSNA Pneumonia Challenge images. CXRs with a trajectory are shown with an arrow between timepoints. If survival outcome is known, the arrows and/or points are colored. The background is colored based on the density of patients in the ICU or who are intubated.

Steps to Deployment

With initial work in peer review, we have made our prototype available for trial (not for medical use) in the torchxrayvision library. We hope many groups will evaluate this model to determine whether it is effective. To enable follow-up work, we make our code, labels, and data available online.

This blog post is based on our recent paper:

Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning
Joseph Paul Cohen, Lan Dao, Paul Morrison, Karsten Roth, Yoshua Bengio, Beiyi Shen, Almas Abbasi, Mahsa Hoshmand-Kochi, Marzyeh Ghassemi, Haifang Li, Tim Q Duong
Code: https://github.com/mlmed/torchxrayvision/tree/master/scripts/covid-severity


This research is based on work partially supported by the CIFAR AI and COVID-19 Catalyst Grants.


Gozes, O., Frid-Adar, M., Greenspan, H., Browning, P. D., Zhang, H., Ji, W., Bernheim, A., & Siegel, E. (2020). Rapid AI Development Cycle for the Coronavirus (COVID-19) Pandemic: Initial Results for Automated Detection & Patient Monitoring using Deep Learning CT Image Analysis. arXiv:2003.05037. http://arxiv.org/abs/2003.05037

Strickland, E. (2020). AI Can Help Hospitals Triage COVID-19 Patients. IEEE Spectrum. https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/ai-can-help-hospitals-triage-covid19-patients

Wittbold, K. A., Carroll, C., Iansiti, M., Zhang, H. M., & Landman, A. B. (2020). How Hospitals Are Using AI to Battle Covid-19. Harvard Business Review. https://hbr.org/2020/04/how-hospitals-are-using-ai-to-battle-covid-19

Toussie, D., Voutsinas, N., Finkelstein, M., Cedillo, M. A., Manna, S., Maron, S. Z., Jacobi, A., Chung, M., Bernheim, A., Eber, C., Concepcion, J., Fayad, Z., & Gupta, Y. S. (2020). Clinical and Chest Radiography Features Determine Patient Outcomes In Young and Middle Age Adults with COVID-19. Radiology, 201754. https://doi.org/10.1148/radiol.2020201754

Borghesi, A., Zigliani, A., Golemi, S., Carapella, N., Maculotti, P., Farina, D., & Maroldi, R. (2020). Chest X-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: A study of 302 patients from Italy. International Journal of Infectious Diseases, 96, 291–293. https://doi.org/10.1016/j.ijid.2020.05.021

Wynants, L., Van Calster, B., Collins, G. S., Riley, R. D., Heinze, G., Schuit, E., Bonten, M. M. J., Damen, J. A. A., Debray, T. P. A., De Vos, M., Dhiman, P., Haller, M. C., Harhay, M. O., Henckaerts, L., Kreuzberger, N., Lohmann, A., Luijken, K., Ma, J., Andaur Navarro, C. L., … van Smeden, M. (2020). Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ, m1328. https://doi.org/10.1136/bmj.m1328

Cohen, J. P., Dao, L., Morrison, P., Roth, K., Bengio, Y., Shen, B., Abbasi, A., Hoshmand-Kochi, M., Ghassemi, M., Li, H., & Duong, T. Q. (2020). Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning. https://arxiv.org/pdf/2005.11856.pdf

Cohen, J. P., Morrison, P., & Dao, L. (2020). COVID-19 Image Data Collection. https://arxiv.org/abs/2003.11597

Cohen, J. P., Morrison, P., Dao, L., Roth, K., Duong, T. Q., & Ghassemi, M. (2020). COVID-19 Image Data Collection: Prospective Predictions Are the Future. https://arxiv.org/abs/2006.11988

Cohen, J. P., Viviano, J., Hashir, M., & Bertrand, H. (2020). TorchXRayVision: A library of chest X-ray datasets and models. https://github.com/mlmed/torchxrayvision
