Yue Li

Associate Academic Member
Assistant Professor, McGill University, School of Computer Science


I completed my PhD degree in computer science and computational biology at the University of Toronto in 2014. Prior to joining McGill University, I was a postdoctoral associate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT (2015–2018).

In general, my research program covers three main research areas that involve applied machine learning in computational genomics and health. More specifically, it focuses on developing interpretable probabilistic and deep learning models for genetic, epigenetic, electronic health record and single-cell genomic data.

By systematically integrating multimodal and longitudinal data, I aim to develop impactful applications in computational medicine, including building intelligent clinical recommender systems, forecasting patient health trajectories, making personalized polygenic risk predictions, characterizing multi-trait functional genetic mutations, and dissecting cell-type-specific regulatory elements that underpin complex traits and diseases in humans.

Current Students

Undergraduate - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Co-supervisor:
PhD - McGill University
Principal supervisor:
PhD - McGill University


Multi-ancestry polygenic risk scores using phylogenetic regularization
Elliot Layne
Shadi Zabad
Bidirectional Generative Pre-training for Improving Time Series Representation Learning
Ziyang Song
Qincheng Lu
He Zhu
Machine Learning Informed Diagnosis for Congenital Heart Disease in Large Claims Data Source
Ariane Marelli
Chao Li
Aihua Liu
Hanh Nguyen
Harry Moroz
James M. Brophy
Liming Guo
MiRGraph: A transformer-based feature learning approach to identify microRNA-target interactions by integrating heterogeneous graph network and sequence information
Pei Liu
Ying Liu
Jiawei Luo
MicroRNAs (miRNAs) play a crucial role in the regulation of gene expression by targeting specific mRNAs. They can function as both tumor suppressors and oncogenes depending on the specific miRNA and its target genes. Detecting miRNA-target interactions (MTIs) is critical for unraveling the complex mechanisms of gene regulation and identifying therapeutic targets and diagnostic markers. There is currently a lack of MTI prediction methods that simultaneously perform feature learning on heterogeneous graph network and sequence information. To improve the prediction performance of MTIs, we present a novel transformer-based multi-view feature learning method, named MiRGraph. It consists of two main modules for learning the sequence and heterogeneous graph network, respectively. For learning the sequence-based feature embedding, we utilize the mature miRNA sequence and the complete 3’UTR sequence of the target mRNAs to encode sequence features. Specifically, a transformer-based CNN (TransCNN) module is designed for miRNAs and genes respectively to extract their personalized sequence features. For learning the network-based feature embedding, we utilize a heterogeneous graph transformer (HGT) module to extract the relational and structural information in a heterogeneous graph consisting of miRNA-miRNA, gene-gene and miRNA-target interactions. We learn the TransCNN and HGT modules end-to-end by utilizing a feedforward network, which takes the combined embedded features of the miRNA-gene pair to predict MTIs. Comparisons with other existing MTI prediction methods illustrate the superiority of MiRGraph under standard criteria. In a case study on breast cancer, we identified plausible target genes of an oncomir hsa-MiR-122-5p and plausible miRNAs that regulate the oncogene BRCA1.
GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling
Yimin Fan
Yu Li
Extrapolatable Transformer Pre-training for Ultra Long Time-Series Forecasting
Ziyang Song
Qincheng Lu
Hao Xu
Differential Chromatin Architecture and Risk Variants in Deep Layer Excitatory Neurons and Grey Matter Microglia Contribute to Major Depressive Disorder
Anjali Chawla
Doruk Cakmakci
Wenmin Zhang
Malosree Maitra
Reza Rahimian
Haruka Mitsuhashi
MA Davoli
Jenny Yang
Gary Gang Chen
Ryan Denniston
Deborah Mash
Naguib Mechawar
Matthew Suderman
Corina Nagy
Gustavo Turecki
GTM-decon: guided-topic modeling of single-cell transcriptomes enables sub-cell-type and disease-subtype deconvolution of bulk transcriptomes
Lakshmipuram Seshadri Swapna
Michael Huang
Guided-topic modelling of single-cell transcriptomes enables sub-cell-type and disease-subtype deconvolution of bulk transcriptomes
Lakshmipuram Seshadri Swapna
Michael Huang
Cell-type composition is an important indicator of health. We present Guided Topic Model for deconvolution (GTM-decon) to automatically infer cell-type-specific gene topic distributions from single-cell RNA-seq data for deconvolving bulk transcriptomes. GTM-decon performs competitively on deconvolving simulated and real bulk data compared with the state-of-the-art methods. Moreover, as demonstrated in deconvolving disease transcriptomes, GTM-decon can infer multiple cell-type-specific gene topic distributions per cell type, which captures sub-cell-type variations. GTM-decon can also use phenotype labels from single-cell or bulk data as a guide to infer phenotype-specific gene distributions. In a nested-guided design, GTM-decon identified cell-type-specific differentially expressed genes from bulk breast cancer transcriptomes.
Biomedical discovery through the integrative biomedical knowledge hub (iBKH).
Chang Su
Yu Hou
Manqi Zhou
Suraj Rajendran
Jacqueline R. M. A. Maasch
Zehra Abedi
Haotan Zhang
Zilong Bai
Anthony Cuturrufo
Winston Guo
Fayzan F. Chaudhry
Gregory Ghahramani
Feixiong Cheng
Rui Zhang
Steven T. DeKosky
Jiang Bian
Fei Wang
Single-cell multi-omic topic embedding reveals cell-type-specific and COVID-19 severity-related immune signatures
Manqi Zhou
Hao Zhang
Zilong Bai
Dylan Mann-Krzisnik
Fei Wang
The advent of single-cell multi-omics sequencing technology makes it possible for researchers to leverage multiple modalities for individual cells and explore cell heterogeneity. However, the high dimensional, discrete, and sparse nature of the data makes the downstream analysis particularly challenging. Most of the existing computational methods for single-cell data analysis are either limited to single modality or lack flexibility and interpretability. In this study, we propose an interpretable deep learning method called multi-omic embedded topic model (moETM) to effectively perform integrative analysis of high-dimensional single-cell multimodal data. moETM integrates multiple omics data via a product-of-experts in the encoder for efficient variational inference and then employs multiple linear decoders to learn the multi-omic signatures of the gene regulatory programs. Through comprehensive experiments on public single-cell transcriptome and chromatin accessibility data (i.e., scRNA+scATAC), as well as scRNA and proteomic data (i.e., CITE-seq), moETM demonstrates superior performance compared with six state-of-the-art single-cell data analysis methods on seven publicly available datasets. By applying moETM to the scRNA+scATAC data in human bone marrow mononuclear cells (BMMCs), we identified sequence motifs corresponding to the transcription factors that regulate immune gene signatures. Applying moETM analysis to CITE-seq data from the COVID-19 patients revealed not only known immune cell-type-specific signatures but also composite multi-omic biomarkers of critical conditions due to COVID-19, thus providing insights from both biological and clinical perspectives.
Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model
Yuesong Zou
Ahmad Pesaranghader
Ziyang Song
Aman Verma