Archer Yang

Associate Academic Member
Associate Professor, McGill University, Department of Mathematics and Statistics
Research Topics
Machine learning in genomics and health
Deep learning
Learning on graphs
Drug discovery
Dimensionality reduction methods
Probabilistic models
High-dimensional statistics
Machine learning theory

Biography

I am an associate professor in the Department of Mathematics and Statistics at McGill University and an associate member of the School of Computer Science and the Quantitative Life Sciences program.

I work on the theory and methods of statistical machine learning and uncertainty quantification, with a particular focus on biomedicine and drug discovery:

- Statistical machine learning

- High-dimensional inference

- Uncertainty quantification and trustworthy AI

- Computational statistics and scalable algorithms

- Biomedical and biochemical data science

- AI for drug discovery

Current Students

Research Master's - McGill
Co-supervisor:
Research Master's - McGill
PhD - McGill
PhD - McGill
Undergraduate - McGill
PhD - McGill
Co-supervisor:
Research Master's - McGill
Research Master's - McGill
Co-supervisor:
Research Master's - McGill
Research Master's - McGill
PhD - McGill
Research Master's - McGill
Principal supervisor:
Undergraduate - McGill
Postdoctorate - McGill
Principal supervisor:

Publications

Multivariate Conformal Selection
Tian Bai
Yue Zhao
Xiang Yu
Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty quantification, it is limited to univariate responses and scalar criteria. To address this, we propose Multivariate Conformal Selection (mCS), a generalization of CS designed for multivariate response settings. Our method introduces regional monotonicity and employs multivariate nonconformity scores to construct conformal
Sparse Polygenic Risk Score Inference with the Spike-and-Slab LASSO
Junyi Song
Simon Gravel
Yuemei Li
Abstract 4142894: Multimorbidity Trajectories Across the Lifespan in Patients with Congenital Heart Disease
Chao Li
Aihua Liu
Solomon Bendayan
Liming Guo
Judith Therrien
Robyn Tamblyn
Jay Brophy
Yuemei Li
Ariane Marelli
Background: Benefiting from advances in medical care, patients with congenital heart disease (CHD) now survive to adulthood but face elevated risks of both cardiac and non-cardiac complications. Understanding the trajectories of comorbidity development over a patient's lifespan is a cornerstone of optimizing care, which is expected to improve long-term health outcomes. Research Aim: This study aims to investigate the temporal sequences and evolution of comorbidities in CHD patients across their lifespan. We hypothesize that multimorbidity trajectories in CHD patients are linked to CHD lesion severity and age at onset of specific comorbidities. Methods: Using the Quebec CHD database, which comprises data on outpatient visits, hospitalization records, and vital status from 1983 to 2017, we designed a longitudinal cohort study evaluating the development of 39 comorbidities coded using ICD-9/10. Temporal sequences were mapped using median age of onset. Associations between disease pairs were quantified by hazard ratios from Cox proportional hazards models adjusting for age, sex, genetic syndrome, and competing risks of death, and taking into account the time-varying nature of the predictor diseases. Results: The cohort included 9,764 individuals with severe and 127,729 with non-severe CHD lesions. In severe CHD patients, most comorbidities developed between ages 25 and 40. Comorbidity progression began with childhood cardiovascular diseases, followed by systemic diseases such as diabetes, liver and kidney diseases, and advanced to heart failure and dementia in middle adulthood. In addition, mental disorders emerged in early adulthood and were associated with subsequent development of kidney diseases and dementia. Different trajectories were observed in non-severe CHD patients, with disease onset 2-3 decades later and no difference in onset timing between cardiovascular and systemic complications (Figure).
Conclusions: Distinct multimorbidity trajectories were observed in CHD patients by CHD lesion severity. In patients with severe CHD lesions, early systemic diseases significantly influenced subsequent complications. These findings highlight the need for well-timed surveillance guidelines and interventions to improve health outcomes.
Privacy-preserving analysis of time-to-event data under nested case-control sampling
Lamin Juwara
Ana M Velly
Paramita Saha-Chaudhuri
A Tweedie Compound Poisson Model in Reproducing Kernel Hilbert Space
Yi Lian
Boxiang Wang
Peng Shi
Robert William Platt
Tweedie models can be used to analyze nonnegative continuous data with a probability mass at zero. They have been widely applied in natural science, healthcare research, actuarial science, and other fields. The performance of existing Tweedie models can be limited on today’s complex data problems with challenging characteristics such as nonlinear effects, high-order interactions, high-dimensionality and sparsity. In this article, we propose a kernel Tweedie model, Ktweedie, and its sparse variant, SKtweedie, that can simultaneously address the above challenges. Specifically, nonlinear effects and high-order interactions can be flexibly represented through a wide range of kernel functions, which is fully learned from the data. In addition, while the Ktweedie can handle high-dimensional data, the SKtweedie with integrated variable selection can further improve the interpretability. We perform extensive simulation studies to justify the prediction and variable selection accuracy of our method, and demonstrate the applications in ratemaking and loss-reserving in general insurance. Overall, the Ktweedie and SKtweedie outperform existing Tweedie models when there exist nonlinear effects and high-order interactions, particularly when the dimensionality is high relative to the sample size. The model is implemented in an efficient and user-friendly R package ktweedie (https://cran.r-project.org/package=ktweedie).
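
To illustrate the "probability mass at zero" that the abstract above refers to, here is a minimal Python sketch that simulates a compound Poisson-gamma variable, the standard construction behind Tweedie distributions with power parameter between 1 and 2. It is not the ktweedie implementation and involves no kernels; the function name `sample_compound_poisson_gamma` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_compound_poisson_gamma(n, lam, shape, scale, seed=0):
    """Draw n samples of Y = X_1 + ... + X_N, with N ~ Poisson(lam)
    and X_i ~ Gamma(shape, scale) i.i.d. (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam, size=n)  # number of gamma terms per sample
    y = np.zeros(n)                    # N = 0 yields an exact zero
    positive = counts > 0
    # A sum of k i.i.d. Gamma(shape, scale) variables is Gamma(k * shape, scale).
    y[positive] = rng.gamma(shape * counts[positive], scale)
    return y


if __name__ == "__main__":
    lam = 1.0  # illustrative value, not taken from the paper
    y = sample_compound_poisson_gamma(100_000, lam=lam, shape=2.0, scale=1.5)
    print("empirical P(Y = 0):  ", (y == 0).mean())
    print("theoretical exp(-lam):", np.exp(-lam))
```

Under these assumptions the share of exact zeros should be close to exp(-lam), while the positive part is continuous, which is the mixed discrete-continuous shape that makes Tweedie models suitable for data such as insurance claim amounts.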