Publications

Pandemic policy assessment by artificial intelligence
Sirui Song
Yong Li
Yang Yu
High Fidelity Visualization of What Your Self-Supervised Representation Knows About
Florian Bordes
Randall Balestriero
Discovering what is learned by neural networks remains a challenge. In self-supervised learning, classification is the most common task used to evaluate how good a representation is. However, relying only on such a downstream task can limit our understanding of what information is retained in the representation of a given input. In this work, we showcase the use of a Representation Conditional Diffusion Model (RCDM) to visualize in data space the representations learned by self-supervised models. The use of RCDM is motivated by its ability to generate high-quality samples -- on par with state-of-the-art generative models -- while ensuring that the representations of those samples are faithful, i.e., close to the one used for conditioning. By using RCDM to analyze self-supervised models, we are able to clearly show visually that i) SSL (backbone) representations are not invariant to the data augmentations they were trained with -- thus debunking an often restated but mistaken belief; ii) SSL post-projector embeddings appear indeed invariant to these data augmentations, along with many other data symmetries; iii) SSL representations appear more robust to small adversarial perturbations of their inputs than representations trained in a supervised manner; and iv) SSL-trained representations exhibit an inherent structure that can be explored thanks to RCDM visualization and enables image manipulation.
Automatic Phenotyping by a Seed-guided Topic Model
Ziyang Song
Yuanyi Hu
Aman Verma
Electronic health records (EHRs) provide rich clinical information and the opportunities to extract epidemiological patterns to understand and predict patient disease risks with suitable machine learning methods such as topic models. However, existing topic models do not generate identifiable topics each predicting a unique phenotype. One promising direction is to use known phenotype concepts to guide topic inference. We present a seed-guided Bayesian topic model called MixEHR-Seed with 3 contributions: (1) for each phenotype, we infer a dual-form of topic distribution: a seed-topic distribution over a small set of key EHR codes and a regular topic distribution over the entire EHR vocabulary; (2) we model age-dependent disease progression as Markovian dynamic topic priors; (3) we infer seed-guided multi-modal topics over distinct EHR data types. For inference, we developed a variational inference algorithm. Using MixEHR-Seed, we inferred 1569 PheCode-guided phenotype topics from an EHR database in Quebec, Canada covering 1.3 million patients for up to 20-year follow-up with 122 million records for 8539 and 1126 unique diagnostic and drug codes, respectively. We observed (1) accurate phenotype prediction by the guided topics, (2) clinically relevant PheCode-guided disease topics, (3) meaningful age-dependent disease prevalence. Source code is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-Seed.
TITRATED: Learned Human Driving Behavior without Infractions via Amortized Inference
Vasileios Lioutas
Adam Ścibior
Heatmap Regression for Lesion Detection using Pointwise Annotations
Chelsea Myers-Colet
Julien Schroeter
Douglas Arnold
In many clinical contexts, detecting all lesions is imperative for evaluating disease activity. Standard approaches pose lesion detection as a segmentation problem despite the time-consuming nature of acquiring segmentation labels. In this paper, we present a lesion detection method which relies only on point labels. Our model, which is trained via heatmap regression, can detect a variable number of lesions in a probabilistic manner. In fact, our proposed post-processing method offers a reliable way of directly estimating the lesion existence uncertainty. Experimental results on Gad lesion detection show our point-based method performs competitively compared to training on expensive segmentation labels. Finally, our detection model provides a suitable pre-training for segmentation. When fine-tuning on only 17 segmentation samples, we achieve comparable performance to training with the full dataset.
RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data
Thibaud Godon
Pier-Luc Plante
Baptiste Bauvin
Élina Francovic-Fontaine
Jacques Corbeil
Background: Understanding the relationship between omics and the phenotype is a central problem in precision medicine. The high dimensionality of metabolomics data challenges learning algorithms in terms of scalability and generalization. Most learning algorithms do not produce interpretable models. -- Method: We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules. -- Results: Applications to metabolomics data show that it produces models that achieve high predictive performance. The interpretability of the models makes them useful for biomarker discovery and pattern discovery in high-dimensional data.
Learning to Improve Code Efficiency
Binghong Chen
Danny Tarlow
Kevin Swersky
Martin Maas
Pablo Heiber
Ashish V Naik
Milad Hashemi
Parthasarathy Ranganathan
Estimating the lagged effect of price discounting: a time-series study on sugar sweetened beverage purchasing in a supermarket
Hiroshi Mamiya
Alexandra M. Schmidt
Erica E. M. Moodie
Counterfactual Image Synthesis for Discovery of Personalized Predictive Image Markers
Amar Kumar
Anjun Hu
Brennan Nichyporuk
Jean-Pierre R. Falet
Douglas Arnold
Sotirios A. Tsaftaris
Application of Artificial Intelligence in Shared Decision Making: Scoping Review
Michelle Cwintal
Yuhui Huang
Pooria Ghadiri
Roland Grad
Genevieve Gore
Hervé Tchala Vignon Zomahoun
France Légaré
Pierre Pluye
Background Artificial intelligence (AI) has shown promising results in various fields of medicine. It has the potential to facilitate shared decision making (SDM). However, there is no comprehensive mapping of how AI may be used for SDM. Objective We aimed to identify and evaluate published studies that have tested or implemented AI to facilitate SDM. Methods We performed a scoping review informed by the methodological framework proposed by Levac et al, modifications to the original Arksey and O'Malley framework of a scoping review, and the Joanna Briggs Institute scoping review framework. We reported our results based on the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) reporting guideline. At the identification stage, an information specialist performed a comprehensive search of 6 electronic databases from their inception to May 2021. The inclusion criteria were: all populations; all AI interventions that were used to facilitate SDM, and if the AI intervention was not used for the decision-making point in SDM, it was excluded; any outcome related to patients, health care providers, or health care systems; studies in any health care setting, only studies published in the English language, and all study types. Overall, 2 reviewers independently performed the study selection process and extracted data. Any disagreements were resolved by a third reviewer. A descriptive analysis was performed. Results The search process yielded 1445 records. After removing duplicates, 894 documents were screened, and 6 peer-reviewed publications met our inclusion criteria. Overall, 2 of them were conducted in North America, 2 in Europe, 1 in Australia, and 1 in Asia. Most articles were published after 2017. Overall, 3 articles focused on primary care, and 3 articles focused on secondary care. All studies used machine learning methods. Moreover, 3 articles included health care providers in the validation stage of the AI intervention, and 1 article included both health care providers and patients in clinical validation, but none of the articles included health care providers or patients in the design and development of the AI intervention. All used AI to support SDM by providing clinical recommendations or predictions. Conclusions Evidence of the use of AI in SDM is in its infancy. We found AI supporting SDM in similar ways across the included articles. We observed a lack of emphasis on patients’ values and preferences, as well as poor reporting of AI interventions, resulting in a lack of clarity about different aspects. Little effort was made to address the topics of explainability of AI interventions and to include end-users in the design and development of the interventions. Further efforts are required to strengthen and standardize the use of AI in different steps of SDM and to evaluate its impact on various decisions, populations, and settings.
Endorsing Complexity Through Diversity: Computational Psychiatry Meets Big Data Analytics
Jakub Kopal