Yue Li

Associate Academic Member
Assistant Professor, McGill University, School of Computer Science
Research Topics
Multimodal Learning
Deep Learning
Computational Biology
Genetics
Single-Cell Genomics
Large Language Models (LLM)
AI in Health
Bayesian Models

Biography

I completed my PhD in computer science and computational biology at the University of Toronto in 2014. Before joining McGill University, I was a postdoctoral associate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) (2015-2018).

My research focuses on developing interpretable probabilistic learning models and deep learning models for genetic and epigenetic data, electronic health records, and single-cell genomic data.

By systematically integrating multimodal and longitudinal data, I aim to develop applications with tangible impact in computational medicine, including building intelligent clinical recommender systems, forecasting patient health trajectories, predicting personalized polygenic risk, characterizing multi-trait functional genetic mutations, and dissecting the cell-type-specific regulatory elements that underpin complex traits and diseases in humans. My research program covers three main areas involving machine learning applied to computational genomics and health.

Current Students

Postdoctorate - McGill
Research Master's - McGill
Research Master's - McGill
Research Master's - McGill
PhD - McGill
Principal supervisor:
PhD - McGill
Research Master's - McGill
Principal supervisor:
PhD - McGill
Research Master's - McGill
Co-supervisor:
Research Master's - McGill
PhD - McGill
Postdoctorate - McGill
Co-supervisor:
PhD - McGill

Publications

Modeling electronic health record data using a knowledge-graph-embedded topic model
The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic way. However, effective extraction of clinical knowledge from EHR data has been hindered by its sparsity and noisy information. We present KG-ETM, an end-to-end knowledge graph-based multimodal embedded topic model. KG-ETM distills latent disease topics from EHR data by learning the embedding from medical knowledge graphs. We applied KG-ETM to a large-scale EHR dataset consisting of over 1 million patients. We evaluated its performance based on EHR reconstruction and drug imputation. KG-ETM demonstrated superior performance over the alternative methods on both tasks. Moreover, our model learned clinically meaningful graph-informed embeddings of the EHR codes. In addition, our model is able to discover interpretable and accurate patient representations for patient stratification and drug recommendations.
Inferring global-scale temporal latent topics from news reports to predict public health interventions for COVID-19
Zhi Wen
Guido Powell
Imane Chafi
Y. K. Li
Supervised multi-specialist topic model with applications on large-scale electronic health record data
Xavier Sumba Toral
Yixin Xu
Aihua Liu
Liming Guo
Guido Powell
Aman Verma
Ariane Marelli
Motivation: Electronic health record (EHR) data provides a new venue to elucidate disease comorbidities and latent phenotypes for precision medicine. To fully exploit its potential, a realistic data generative process of the EHR data needs to be modelled. Materials and Methods: We present MixEHR-S to jointly infer specialist-disease topics from the EHR data. As the key contribution, we model the specialist assignments and ICD-coded diagnoses as the latent topics based on the patient's underlying disease topic mixture in a novel unified supervised hierarchical Bayesian topic model. For efficient inference, we developed a closed-form collapsed variational inference algorithm to learn the model distributions of MixEHR-S. Results: We applied MixEHR-S to two independent large-scale EHR databases in Quebec with three targeted applications: (1) Congenital Heart Disease (CHD) diagnostic prediction among 154,775 patients; (2) Chronic obstructive pulmonary disease (COPD) diagnostic prediction among 73,791 patients; (3) future insulin treatment prediction among 78,712 patients diagnosed with diabetes as a means to assess disease exacerbation. In all three applications, MixEHR-S conferred clinically meaningful latent topics among the most predictive latent topics and achieved superior target prediction accuracy compared to the existing methods, providing opportunities for prioritizing high-risk patients for healthcare services. Availability and implementation: MixEHR-S source code and scripts of the experiments are freely available at https://github.com/li-lab-mcgill/mixehrS
Publisher Correction: The default network of the human brain is associated with perceived social isolation
R. Nathan Spreng
Laetitia Mwilambwe-Tshilobo
Alain Dagher
Philipp Koellinger
Gideon Nave
Anthony Ong
Julius M Kernbach
Thomas V. Wiecki
Tian Ge
Avram J. Holmes
B.T. Thomas Yeo
Gary R. Turner
Robin I. M. Dunbar
The default network of the human brain is associated with perceived social isolation
R. Nathan Spreng
Laetitia Mwilambwe-Tshilobo
Alain Dagher
Philipp Koellinger
Gideon Nave
Anthony Ong
Julius M Kernbach
Thomas V. Wiecki
Tian Ge
Avram J. Holmes
B.T. Thomas Yeo
Gary R. Turner
Robin I. M. Dunbar
Global Surveillance of COVID-19 by mining news media using a multi-source dynamic embedded topic model
Zhi Wen
Imane Chafi
Anya Okhmatovskaia
Guido Powell