Portrait of Jun Ding

Jun Ding

Affiliate Member
Assistant Professor, McGill University, Department of Medicine
Research Topics
Medical Machine Learning
Representation Learning
Computational Biology

Biography

Jun Ding is an assistant professor in the Department of Medicine at McGill University's Faculty of Medicine and Health Sciences. With his team, he applies machine learning techniques to elucidate the complex dynamics of cells in various diseases, such as developmental disorders, lung diseases, and cancers. The diverse and complex nature of these conditions calls for innovative approaches, motivating the use of cutting-edge single-cell technologies. These technologies offer unprecedented opportunities to advance understanding, particularly in fields such as developmental and cancer biology. However, they also pose challenges for developing computational models capable of linking these complex biomedical data to potential discoveries.

Jun Ding aims to develop and refine machine learning methodologies, in particular probabilistic graphical models, to effectively analyze, model, and visualize omics data from both single cells and bulk samples, often with longitudinal or spatial dimensions. The goal of his research is to use these advanced machine learning techniques to deepen the understanding of cellular dynamics, in order to develop novel diagnostic and therapeutic strategies that could substantially benefit public health.

Current Students

PhD - McGill
Principal supervisor:

Publications

scGALA advances graph link prediction-based cell alignment for comprehensive data integration and harmonization
Guo Jiang
Kailu Song
Gregory J. Fonseca
Darcy E. Wagner
Iain C. Clark
Hui Wang
Single-cell technologies have transformed our understanding of cellular heterogeneity through multimodal data acquisition. However, robust cell alignment remains a major challenge for data integration and harmonization, including batch correction, label transfer, and multi-omics integration. Many existing methods constrain alignment based on rigid feature-wise distance metrics, limiting their ability to capture accurate cell correspondence across diverse cell populations and conditions. We introduce scGALA, a graph-based learning framework that redefines cell alignment by combining graph attention networks with a score-driven, task-independent optimization strategy. scGALA constructs enriched graphs of cell-cell relationships by integrating gene expression profiles with auxiliary information, such as spatial coordinates, and iteratively refines alignment via self-supervised graph link prediction, where a deep neural network is trained to identify and reinforce high-confidence correspondences across datasets. In extensive benchmarks, scGALA identifies over 25 percent more high-confidence alignments without compromising accuracy. By improving the core step of cell alignment, scGALA serves as a versatile enhancer for a wide range of single-cell data integration tasks.
DENetwork unveils non-differentially expressed genes with functional relevance across conditions through information flow perturbation
Bowen Zhao
Ting-Yi Su
Jingtao Wang
Quazi S. Islam
Kailu Song
Steven K. Huang
Matthieu Allez
Gregory J. Fonseca
Carolyn J. Baglole
Differential gene expression (DE) analysis of RNA-sequencing (RNA-seq) data is a standard approach for identifying phenotypic differences between conditions. However, traditional DE methods such as DESeq2 focus on expression changes alone, often overlooking non-differentially expressed (non-DE) genes that may play key regulatory roles. This limits their ability to identify upstream drivers of transcriptomic variation. To address this gap, we introduce DENetwork, a network-based approach that prioritizes genes based on their influence on global information flow. Each gene is scored using an in silico knockout strategy that quantifies its impact across the inferred gene network, capturing both DE and non-DE genes with potential functional relevance. DENetwork deciphers intricate regulatory and signaling networks driving transcriptomic variations between conditions with distinct phenotypes. Across simulated and disease-relevant RNA-seq datasets, DENetwork identifies non-DE regulators enriched in known pathways and phenotypic associations, providing mechanistic insights missed by standard DE analysis, with implications for target discovery and intervention.
SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics
Qingtian Zhu
Yumin Zheng
Yuling Sang
Yifan Zhan
Ziyan Zhu
Yinqiang Zheng
Spatial Transcriptomics (ST) is a method that captures gene expression profiles aligned with spatial coordinates. The discrete spatial distribution and the super-high dimensional sequencing results make ST data challenging to model effectively. In this paper, we manage to model ST in a continuous and compact manner by the proposed tool, SUICA, empowered by the great approximation capability of Implicit Neural Representations (INRs) that can enhance both the spatial density and the gene expression. Concretely within the proposed SUICA, we incorporate a graph-augmented Autoencoder to effectively model the context information of the unstructured spots and provide informative embeddings that are structure-aware for spatial mapping. We also tackle the extremely skewed distribution in a regression-by-classification fashion and enforce classification-based loss functions for the optimization of SUICA. By extensive experiments of a wide range of common ST platforms under varying degradations, SUICA outperforms both conventional INR variants and SOTA methods regarding numerical fidelity, statistical correlation, and bio-conservation. The prediction by SUICA also showcases amplified gene signatures that enrich the bio-conservation of the raw data and benefit subsequent analysis. The code is available at https://github.com/Szym29/SUICA.
CellSexID: Sex-Based Computational Tracking of Cellular Origins in Chimeric Models
Huilin Tai
Qian Li
Jingtao Wang
Jiahui Tan
Bowen Zhao
Ryann Lang
Basil J. Petrof
Cell tracking in chimeric models is essential yet challenging, particularly in developmental biology, regenerative medicine, and transplantation studies. Existing methods, such as fluorescent labeling and genetic barcoding, are technically demanding, costly, and often impractical for dynamic, heterogeneous tissues. To address these limitations, we propose a computational framework that leverages sex as a surrogate marker for cell tracking. Our approach uses a machine learning model trained on single-cell transcriptomic data to predict cell sex with high accuracy, enabling clear distinction between donor (male) and recipient (female) cells in sex-mismatched chimeric models. The model identifies specific genes critical for sex prediction and has been validated using public datasets and experimental flow sorting, confirming the biological relevance of the identified cell populations. Applied to skeletal muscle macrophages, our method revealed distinct transcriptional profiles associated with cellular origins. This pipeline offers a robust, cost-effective solution for cell tracking in chimeric models, advancing research in regenerative medicine and immunology by providing precise insights into cellular origins and therapeutic outcomes.
Inhibition of epithelial cell YAP-TEAD/LOX signaling attenuates pulmonary fibrosis in preclinical models
Darcy Elizabeth Wagner
Hani N. Alsafadi
Nilay Mitash
Aurelien Justet
Qianjiang Hu
Ricardo Pineda
Claudia Staab-Weijnitz
Martina Korfei
Nika Gvazava
Kristin Wannemo
Ugochi Onwuka
Molly Mozurak
Adriana Estrada-Bernal
Juan Cala Garcia
Katrin Mutze
Rita Costa
Deniz Bölükbas
John Stegmayr
Wioletta Skronska-Wasek
Stephan Klee
Chiharu Ota
Hoeke A. Baarsma
Jingtao Wang
John Sembrat
Anne Hilgendorff
Andreas Günther
Rachel Chambers
Ivan O Rosas
Stijn de Langhe
Naftali Kaminski
Mareike Lehmann
Oliver Eickelberg
Melanie Königshoff
Idiopathic pulmonary fibrosis (IPF) is a progressive and lethal disease characterized by excessive extracellular matrix deposition. Current IPF therapies slow disease progression but do not stop or reverse it. The (myo)fibroblasts are thought to be the main cellular contributors to excessive extracellular matrix production in IPF. Here we show that fibrotic alveolar type II cells regulate production and crosslinking of extracellular matrix via the co-transcriptional activator YAP. YAP leads to increased expression of lysyl oxidase (LOX) and subsequent LOX-mediated crosslinking by fibrotic alveolar type II cells. Pharmacological YAP inhibition via verteporfin reverses fibrotic alveolar type II cell reprogramming and LOX expression in experimental lung fibrosis in vivo and in human fibrotic tissue ex vivo. We thus identify YAP-TEAD/LOX inhibition in alveolar type II cells as a promising potential therapy for IPF patients.
DOLPHIN advances single-cell transcriptomics beyond gene level by leveraging exon and junction reads
Kailu Song
Yumin Zheng
Bowen Zhao
David H. Eidelman
The advent of single-cell sequencing has revolutionized the study of cellular dynamics, providing unprecedented resolution into the molecular states and heterogeneity of individual cells. However, the rich potential of exon-level information and junction reads within single cells remains underutilized. Conventional gene-count methods overlook critical exon and junction data, limiting the quality of cell representation and downstream analyses such as subpopulation identification and alternative splicing detection. We introduce DOLPHIN, a deep learning method that integrates exon-level and junction read data, representing genes as graph structures. These graphs are processed by a variational graph autoencoder to improve cell embeddings. DOLPHIN not only demonstrates superior performance in cell clustering, biomarker discovery, and alternative splicing detection but also provides a distinct capability to detect subtle transcriptomic differences at the exon level that are often masked in gene-level analyses. By examining cellular dynamics with enhanced resolution, DOLPHIN provides new insights into disease mechanisms and potential therapeutic targets.
Harnessing agent-based frameworks in CellAgentChat to unravel cell–cell interactions from single-cell and spatial transcriptomics
Understanding cell–cell interactions (CCIs) is essential yet challenging owing to the inherent intricacy and diversity of cellular dynamics. Existing approaches often analyze global patterns of CCIs using statistical frameworks, missing the nuances of individual cell behavior owing to their focus on aggregate data. This makes them insensitive in complex environments where the detailed dynamics of cell interactions matter. We introduce CellAgentChat, an agent-based model (ABM) designed to decipher CCIs from single-cell RNA sequencing and spatial transcriptomics data. This approach models biological systems as collections of autonomous agents governed by biologically inspired principles and rules. Validated across eight diverse single-cell data sets, CellAgentChat demonstrates its effectiveness in detecting intricate signaling events across different cell populations. Moreover, CellAgentChat offers the ability to generate animated visualizations of single-cell interactions and provides flexibility in modifying agent behavior rules, facilitating thorough exploration of both close and distant cellular communications. Furthermore, CellAgentChat leverages ABM features to enable intuitive in silico perturbations via agent rule modifications, facilitating the development of novel intervention strategies. This ABM method unlocks an in-depth understanding of cellular signaling interactions across various biological contexts, thereby enhancing in silico studies for cellular communication–based therapies.
A deep generative model for deciphering cellular dynamics and in silico drug discovery in complex diseases
Yumin Zheng
Jonas C. Schupp
Taylor Adams
Geremy Clair
Aurelien Justet
Farida Ahangari
Xiting Yan
Paul Hansen
Marianne Carlon
Emanuela Cortesi
Marie Vermant
Robin Vos
Laurens J. De Sadeleer
Iván O. Rosas
Ricardo Pineda
John Sembrat
Melanie Königshoff
John E. McDonough
Bart M. Vanaudenaerde
Wim A. Wuyts
Naftali Kaminski
Human diseases are characterized by intricate cellular dynamics. Single-cell transcriptomics provides critical insights, yet a persistent gap remains in computational tools for detailed disease progression analysis and targeted in silico drug interventions. Here we introduce UNAGI, a deep generative neural network tailored to analyse time-series single-cell transcriptomic data. This tool captures the complex cellular dynamics underlying disease progression, enhancing drug perturbation modelling and screening. When applied to a dataset from patients with idiopathic pulmonary fibrosis, UNAGI learns disease-informed cell embeddings that sharpen our understanding of disease progression, leading to the identification of potential therapeutic drug candidates. Validation using proteomics reveals the accuracy of UNAGI’s cellular dynamics analysis, and the use of the fibrotic cocktail-treated human precision-cut lung slices confirms UNAGI’s predictions that nifedipine, an antihypertensive drug, may have anti-fibrotic effects on human tissues. UNAGI’s versatility extends to other diseases, including COVID, demonstrating adaptability and confirming its broader applicability in decoding complex cellular dynamics beyond idiopathic pulmonary fibrosis, amplifying its use in the quest for therapeutic solutions across diverse pathological landscapes.
Alveolar epithelial cell plasticity and injury memory in human pulmonary fibrosis
Taylor Adams
Jonas C. Schupp
Agshin Balayev
Johad Khoury
A. Justet
Fadi Nikola
Laurens De Sadeleer
Juan Cala-García
Marta Zapata‐Ortega
Panayiotis V. Benos
John E. McDonough
Farida Ahangari
Melanie Königshoff
Robert Homer
Iván O. Rosas
Xiting Yan
Bart Vanaudenaerde
Wim Wuyts
Naftali Kaminski
Acute and repetitive lung epithelial injury can lead to irreversible and even progressive pulmonary fibrosis; idiopathic pulmonary fibrosis (IPF) is a fatal disease and quintessential example of this phenomenon. The composition of epithelial cells in human pulmonary fibrosis – irrespective of disease etiology – is marked by the presence of Aberrant Basaloid cells: an abnormal cell phenotype with pro-fibrotic and senescent features, localized to the surface of fibrotic lesions. Despite their relevance to human pulmonary fibrosis, the exotic molecular profile of Aberrant Basaloid cells has obscured their etiology, preventing insights into how or why these cells emerge with fibrosis. Here we identify cellular intermediaries between Aberrant Basaloid and normal alveolar epithelial cells in human IPF tissue. We track the emergence of Aberrant Basaloid cells from alveolar epithelial cells ex vivo and uncover a role for similar cells in epithelial regeneration under normal conditions. Lastly, we characterize the epigenetic changes that distinguish Aberrant Basaloid cells from their progenitors and identify hallmarks of AP-1 injury memory retention. This study elucidates the phenomenon of maladaptive epithelial plasticity and regeneration in pulmonary fibrosis and re-contextualizes therapeutic strategies for epithelial dysfunction.
Advancing global antifungal development to combat invasive fungal infection
Xiu-Li Wang
Koon Ho Wong
Chen Ding
Chang-Bin Chen
Wen-Juan Wu
Ningning Liu
DTractor enhances cell type deconvolution in spatial transcriptomics by integrating deep neural networks, transfer learning, and matrix factorization
Yong Jin Kweon
Chenyu Liu
Gregory Fonseca
Spatial transcriptomics (ST) captures gene expression with spatial context but lacks single-cell resolution. Single-cell RNA sequencing (scRNA-seq) offers high-resolution profiles without spatial information. Accurate spot-level decomposition requires effective integration of both. We present DTractor, a deep learning-based framework that improves cell-type deconvolution in ST data through spatial constraints and transfer learning. DTractor achieves dual utilization of scRNA-seq reference data by incorporating both a cell-type-specific gene expression matrix and learned latent embeddings into a unified matrix factorization model. This joint modeling enables accurate estimation of cell-type proportions and cell-type-resolved gene expression within each spatial spot, while preserving biological and spatial coherence. DTractor further applies spatial regularization to maintain local tissue structure. Across multiple ST platforms and tissue types, DTractor demonstrates improved decomposition accuracy, robustness, and interpretability compared to existing methods. The results from DTractor support downstream applications such as spatial domain analysis and the study of spatially organized cellular behaviors.
Efficient and scalable construction of clinical variable networks for complex diseases with RAMEN
Yiwei Xiong
Jingtao Wang
Tingting Chen
Douglas D. Fraser
Gregory Fonseca
Simon Rousseau