Yue Li

Associate Academic Member
Assistant Professor, McGill University, School of Computer Science
Research Topics
AI in Health
Bayesian Models
Computational Biology
Deep Learning
Genetics
Large Language Models (LLM)
Multimodal Learning
Single-Cell Genomics

Biography

I completed my PhD in computer science and computational biology at the University of Toronto in 2014. Prior to joining McGill University, I was a postdoctoral associate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT (2015–2018).

My research program spans three main areas of applied machine learning in computational genomics and health. Specifically, it focuses on developing interpretable probabilistic and deep learning models for genetic, epigenetic, electronic health record, and single-cell genomic data.

By systematically integrating multimodal and longitudinal data, I aim to enable impactful applications in computational medicine, including building intelligent clinical recommender systems, forecasting patient health trajectories, making personalized polygenic risk predictions, characterizing multi-trait functional genetic mutations, and dissecting the cell-type-specific regulatory elements that underpin complex traits and diseases in humans.

Current Students

Postdoctorate - McGill University
PhD - McGill University
PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Master's Research - McGill University
PhD - McGill University
Principal supervisor :
PhD - McGill University
Master's Research - McGill University
Principal supervisor :
PhD - McGill University
PhD - McGill University
Co-supervisor :
Master's Research - McGill University
Co-supervisor :
Master's Research - McGill University
Postdoctorate - McGill University
Co-supervisor :

Publications

PheCode-guided multi-modal topic modeling of electronic health records improves disease incidence prediction and GWAS discovery from UK Biobank
Ziqi Yang
Ziyang Song
Phenome-wide association studies rely on disease definitions derived from diagnostic codes, often failing to leverage the full richness of electronic health records (EHR). We present MixEHR-SAGE, a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications to enhance phenotyping from large-scale EHRs. By combining expert-informed priors with probabilistic inference, MixEHR-SAGE identifies over 1,000 interpretable phenotype topics from UK Biobank data. Applied to 350,000 individuals with high-quality genetic data, MixEHR-SAGE-derived risk scores accurately predict incident type 2 diabetes (T2D) and leukemia diagnoses. Subsequent genome-wide association studies using these continuous risk scores uncovered novel disease-associated loci, including PPP1R15A for T2D and JMJD6/SRSF2 for leukemia, that were missed by traditional binary case definitions. These results highlight the potential of probabilistic phenotyping from multi-modal EHRs to improve genetic discovery. The MixEHR-SAGE software is publicly available at https://github.com/li-lab-mcgill/MixEHR-SAGE.
SpaTM: Topic Models for Inferring Spatially Informed Transcriptional Programs
Wenqi Dong
Qihuang Zhang
Robert Sladek
Spatial transcriptomics has revolutionized our ability to characterize tissues and diseases by contextualizing gene expression with spatial organization. Available methods require researchers to either train a model using histology-based annotations or use annotation-free clustering approaches to uncover spatial domains. However, few methods provide researchers with a way to jointly analyze spatial data from both annotation-free and annotation-guided perspectives using consistent inductive biases and levels of interpretability. A single framework with consistent inductive biases ensures coherence and transferability across tasks, reducing the risks of conflicting assumptions. To this end, we propose the Spatial Topic Model (SpaTM), a topic-modeling framework capable of annotation-guided and annotation-free analysis of spatial transcriptomics data. SpaTM can be used to learn gene programs that represent histology-based annotations while providing researchers with the ability to infer spatial domains with an annotation-free approach if manual annotations are limited or noisy. We demonstrate SpaTM's interpretability through its use of topic mixtures to represent cell states and transcriptional programs, and show how its intuitive framework facilitates the integration of annotation-guided and annotation-free analyses of spatial data with downstream analyses such as cell type deconvolution. Finally, we demonstrate how both approaches can be used to extend the analysis of large-scale snRNA-seq atlases with the inference of cell proximity and spatial annotations in human brains with Major Depressive Disorder.
MiRformer: a dual-transformer-encoder framework for predicting microRNA-mRNA interactions from paired sequences
MicroRNAs (miRNAs) are small non-coding RNAs that regulate genes by binding to target messenger RNAs (mRNAs), causing their degradation or suppressing their translation. Accurate prediction of miRNA–mRNA interactions is crucial for RNA therapeutics. Existing methods rely on handcrafted features, struggle to scale to kilobase-long mRNA sequences, or lack interpretability. We introduce MiRformer, a transformer framework designed to predict not only the binary miRNA–mRNA interaction but also the start and end locations of the miRNA binding site in the mRNA sequence. MiRformer employs a dual-transformer-encoder architecture to learn interaction patterns directly from raw miRNA–mRNA sequence pairs via cross-attention between the miRNA encoder and the mRNA encoder. To scale to long mRNA sequences, we leverage a sliding-window attention mechanism. MiRformer achieves state-of-the-art performance across diverse miRNA–mRNA tasks, including binding prediction, target-site localization, and cleavage-site identification from Degradome sequencing data. The learned transformer attention maps are highly interpretable and reveal strongly contrasting signals for the miRNA seed regions in 500-nt-long mRNA sequences. We used MiRformer to simultaneously predict novel binding sites and cleavage sites in 13k miRNA–mRNA pairs and observed that the two types of sites tend to be close to each other, supporting the miRNA-mediated degradation mechanism. Our code is available at https://github.com/li-lab-mcgill/miRformer.
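The dual-encoder cross-attention described in this abstract can be sketched in a few lines. The following is an illustrative assumption, not the released MiRformer code: the single-head formulation, the function names, and the dimensions (16-d states, a 22-nt miRNA, a 500-nt mRNA window) are all hypothetical.

```python
# Minimal sketch (not the released MiRformer code) of cross-attention from
# miRNA-encoder states (queries) to mRNA-encoder states (keys/values).
# Dimensions and names are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_states, kv_states, d_k):
    # q_states: (Lq, d) miRNA token states; kv_states: (Lkv, d) mRNA token states
    scores = q_states @ kv_states.T / np.sqrt(d_k)  # (Lq, Lkv) pairwise scores
    weights = softmax(scores, axis=-1)              # attend over mRNA positions
    return weights @ kv_states, weights

rng = np.random.default_rng(0)
mirna = rng.normal(size=(22, 16))   # ~22-nt miRNA
mrna = rng.normal(size=(500, 16))   # 500-nt mRNA window
out, w = cross_attention(mirna, mrna, d_k=16)
print(out.shape, w.shape)  # (22, 16) (22, 500)
```

Peaks in a row of `w` indicate which mRNA positions a given miRNA token attends to, which is the kind of signal the abstract reports as interpretable around seed regions.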
TimelyGPT: Extrapolatable Transformer Pre-training for Long-term Time-Series Forecasting in Healthcare
Ziyang Song
Qincheng Lu
Hao Xu
Ziqi Yang
Mike He Zhu
Motivation: Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success in Natural Language Processing and Computer Vision domains. However, the development of PTMs on healthcare time-series data is lagging behind. This underscores the limitations of the existing transformer-based architectures, particularly their scalability to handle large-scale time series and ability to capture long-term temporal dependencies. Methods: In this study, we present Timely Generative Pre-trained Transformer (TimelyGPT). TimelyGPT employs an extrapolatable position (xPos) embedding to encode trend and periodic patterns into time-series representations. It also integrates recurrent attention and temporal convolution modules to effectively capture global-local temporal dependencies. Materials: We evaluated TimelyGPT on two large-scale healthcare time series datasets corresponding to continuous biosignals and irregularly-sampled time series, respectively: (1) the Sleep EDF dataset consisting of over 1.2 billion timesteps; (2) the longitudinal healthcare administrative database PopHR, comprising 489,000 patients randomly sampled from the Montreal population. Results: In forecasting continuous biosignals, TimelyGPT achieves accurate extrapolation up to 6,000 timesteps of body temperature during the sleep stage transition, given a short look-up window (i.e., prompt) containing only 2,000 timesteps. For irregularly-sampled time series, TimelyGPT with a proposed time-specific inference demonstrates high top-recall scores in predicting future diagnoses using early diagnostic records, effectively handling irregular intervals between clinical records. Together, we envision TimelyGPT to be useful in various health domains, including long-term patient health state forecasting and patient risk trajectory prediction. Availability: The open-sourced code is available on GitHub.
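The extrapolatable position (xPos) embedding mentioned above combines a rotary-style rotation with a per-dimension exponential length scale, so attention scores depend only on the relative offset between positions and decay smoothly with distance. Below is a minimal single-vector sketch under that reading; it is not the TimelyGPT implementation, and `gamma`, the frequency base, and the dimension split are illustrative assumptions.

```python
# Minimal sketch (assumption, not TimelyGPT's code) of an xPos-style embedding:
# rotary position rotation plus a per-dimension scale zeta**pos, applied with
# opposite signs to queries and keys so their product depends only on offset.
import numpy as np

def xpos_embed(x, pos, sign=+1, base=10000.0, gamma=0.4):
    # x: (d,) vector with d even; sign=+1 for queries, -1 for keys
    half = x.shape[0] // 2
    inv_freq = base ** (-np.arange(half) / half)
    cos, sin = np.cos(pos * inv_freq), np.sin(pos * inv_freq)
    x1, x2 = x[:half], x[half:]
    rotated = np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])
    zeta = (np.arange(half) / half + gamma) / (1.0 + gamma)  # per-dim decay rates
    return rotated * np.tile(zeta, 2) ** (sign * pos)

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)
# the score depends only on the relative offset (5 - 3 == 15 - 13):
s1 = xpos_embed(q, 5, +1) @ xpos_embed(k, 3, -1)
s2 = xpos_embed(q, 15, +1) @ xpos_embed(k, 13, -1)
print(np.isclose(s1, s2))  # True
```

The shift invariance shown by the last check is what lets such an embedding extrapolate beyond the training context length.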
TrajGPT: Irregular Time-Series Representation Learning of Health Trajectory.
Ziyang Song
Qincheng Lu
Mike He Zhu
In the healthcare domain, time-series data are often irregularly sampled with varying intervals through outpatient visits, posing challenges for existing models designed for equally spaced sequential data. To address this, we propose Trajectory Generative Pre-trained Transformer (TrajGPT) for representation learning on irregularly-sampled healthcare time series. TrajGPT introduces a novel Selective Recurrent Attention (SRA) module that leverages a data-dependent decay to adaptively filter irrelevant past information. As a discretized ordinary differential equation (ODE) framework, TrajGPT captures underlying continuous dynamics and enables a time-specific inference for forecasting arbitrary target timesteps without auto-regressive prediction. Experimental results based on the longitudinal EHR data PopHR from the Montreal health system and eICU from PhysioNet showcase TrajGPT's superior zero-shot performance in disease forecasting, drug usage prediction, and sepsis detection. The inferred trajectories of diabetic and cardiac patients reveal meaningful comorbidity conditions, underscoring TrajGPT as a useful tool for forecasting patient health evolution.
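A data-dependent decay of the kind the SRA module uses can be sketched as a gated recurrence: a sigmoid gate computed from the current input decides how much of the past state to keep. This is a schematic assumption, not the released TrajGPT code; `W_v` and `w_gate` are hypothetical parameters standing in for the learned weights.

```python
# Schematic sketch (not the released TrajGPT code) of a recurrence with a
# data-dependent decay: a sigmoid gate on the current input controls how
# much past state is retained, letting irrelevant history fade.
import numpy as np

def sra_step(state, x, W_v, w_gate):
    decay = 1.0 / (1.0 + np.exp(-(x @ w_gate)))   # in (0, 1), input-dependent
    return decay * state + (1.0 - decay) * (W_v @ x)

# With a zero gate input the decay is sigmoid(0) = 0.5, an even blend:
state = sra_step(np.ones(2), np.zeros(2), np.eye(2), np.zeros(2))
print(state)  # [0.5 0.5]
```

Tying the decay to the inter-visit interval as well as the input would give the continuous-time, discretized-ODE reading the abstract describes.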
Single-nucleus chromatin accessibility profiling identifies cell types and functional variants contributing to major depression
Anjali Chawla
Laura M. Fiori
Wenmin Zang
Malosree Maitra
Jennie Yang
Dariusz Żurawek
Gabriella Frosi
Reza Rahimian
Haruka Mitsuhashi
Maria Antonietta Davoli
Ryan Denniston
Gary Gang Chen
Volodymyr Yerko
Deborah Mash
Kiran Girdhar
Schahram Akbarian
Naguib Mechawar
Matthew Suderman
Corina Nagy
Gustavo Turecki
Toward whole-genome inference of polygenic scores with fast and memory-efficient algorithms.
Chirayu Anant Haryan
Simon Gravel
Sanchit Misra
Harnessing agent-based frameworks in CellAgentChat to unravel cell-cell interactions from single-cell and spatial transcriptomics