Archer Yang

Associate Academic Member
Associate Professor, McGill University, Department of Mathematics and Statistics
Research Topics
Machine Learning in Genomics and Healthcare
Deep Learning
Graph Learning
Drug Discovery
Dimensionality Reduction Methods
Probabilistic Models
High-Dimensional Statistics
Machine Learning Theory

Biography

I am an associate professor in the Department of Mathematics and Statistics at McGill University and an associate member of the School of Computer Science and the Quantitative Life Sciences program.

My research focuses on three interconnected themes: statistical machine learning, applications in drug discovery, and computational genomics and healthcare. In statistical machine learning, I focus on developing methods inspired by causality, dimensionality reduction, and probabilistic models to address the challenges posed by complex, high-dimensional data. In drug discovery, my work involves developing machine learning models to accelerate the identification of drug candidates and improve the understanding of drug efficacy and safety. In computational genomics and healthcare, I develop techniques to analyze genomic data, identify biomarkers, and explore the genetic basis of diseases, with the goal of improving precision medicine and predicting patient outcomes. My main objective is to bridge advanced data-driven methodologies and impactful applications in pharmacology, genomics, and healthcare.

Graduate students interested in working with me are asked to apply to Mila - Quebec Artificial Intelligence Institute and to the Department of Mathematics and Statistics at McGill University. Applicants may also consider co-supervision opportunities with faculty members in McGill's computer science program.

Current Students

PhD - McGill
PhD - McGill
Co-supervisor:
Research Master's - McGill
Research Master's - McGill
Co-supervisor:
PhD - McGill
PhD - McGill

Publications

Abstract 4142894: Multimorbidity Trajectories Across the Lifespan in Patients with Congenital Heart Disease
Chao Li
Aihua Liu
Solomon Bendayan
Liming Guo
Judith Therrien
Robyn Tamblyn
Jay Brophy
Ariane Marelli
Background: Benefiting from advances in medical care, patients with congenital heart disease (CHD) now survive to adulthood but face elevated risks of both cardiac and non-cardiac complications. Understanding the trajectories of comorbidity development over a patient's lifespan is a cornerstone of optimizing care, which is expected to improve long-term health outcomes. Research Aim: This study aims to investigate the temporal sequences and evolution of comorbidities in CHD patients across their lifespan. We hypothesize that multimorbidity trajectories in CHD patients are linked to CHD lesion severity and age at onset of specific comorbidities. Methods: Using the Quebec CHD database, which comprises data on outpatient visits, hospitalization records, and vital status from 1983 to 2017, we designed a longitudinal cohort study evaluating the development of 39 comorbidities coded using ICD-9/10. Temporal sequences were mapped using median age of onset. Associations between disease pairs were quantified by hazard ratios from Cox proportional hazards models adjusting for age, sex, genetic syndrome, and competing risks of death, and taking into account the time-varying nature of the predictor diseases. Results: The cohort included 9,764 individuals with severe and 127,729 with non-severe CHD lesions. In severe CHD patients, most comorbidities developed between ages 25 and 40. Comorbidity progression began with childhood cardiovascular diseases, followed by systemic diseases such as diabetes, liver and kidney diseases, and advanced to heart failure and dementia in middle adulthood. In addition, mental disorders emerged in early adulthood and were associated with subsequent development of kidney diseases and dementia. Different trajectories were observed in non-severe CHD patients, with disease onsets 2-3 decades later and no differential onset between cardiovascular and systemic complications (Figure).
Conclusions: Distinct multimorbidity trajectories were observed in CHD patients by CHD lesion severity. In patients with severe CHD lesions, early systemic diseases significantly influenced subsequent complications. These findings highlight the need for well-timed surveillance guidelines and interventions to improve health outcomes.
Structured Learning in Time-dependent Cox Models
Guanbo Wang
Yi Lian
Robert W. Platt
Rui Wang
Sylvie Perreault
Marc Dorais
Mireille E. Schnitzer
Machine Learning Informed Diagnosis for Congenital Heart Disease in Large Claims Data Source
Ariane Marelli
Chao Li
Aihua Liu
Hanh Nguyen
Harry Moroz
James M. Brophy
Liming Guo
Privacy-preserving analysis of time-to-event data under nested case-control sampling
Lamin Juwara
Ana M Velly
Paramita Saha-Chaudhuri
Accelerating Generalized Random Forests with Fixed-Point Trees
David L. Fleischer
David A. Stephens
A Tweedie Compound Poisson Model in Reproducing Kernel Hilbert Space
Yi Lian
Boxiang Wang
Peng Shi
Robert William Platt
Tweedie models can be used to analyze nonnegative continuous data with a probability mass at zero. They have been widely applied in natural science, healthcare research, actuarial science, and other fields. The performance of existing Tweedie models can be limited on today’s complex data problems with challenging characteristics such as nonlinear effects, high-order interactions, high dimensionality, and sparsity. In this article, we propose a kernel Tweedie model, Ktweedie, and its sparse variant, SKtweedie, that can simultaneously address the above challenges. Specifically, nonlinear effects and high-order interactions can be flexibly represented through a wide range of kernel functions, which are fully learned from the data. In addition, while the Ktweedie can handle high-dimensional data, the SKtweedie with integrated variable selection can further improve interpretability. We perform extensive simulation studies to justify the prediction and variable selection accuracy of our method, and demonstrate its applications in ratemaking and loss reserving in general insurance. Overall, the Ktweedie and SKtweedie outperform existing Tweedie models when there are nonlinear effects and high-order interactions, particularly when the dimensionality is high relative to the sample size. The model is implemented in an efficient and user-friendly R package ktweedie (https://cran.r-project.org/package=ktweedie).