Archer Yang

Associate Academic Member
Associate professor, McGill University, Department of Mathematics and Statistics
Research Topics
Deep Learning
Dimensionality Reduction Methods
Drug Discovery
High-Dimensional Statistics
Learning on Graphs
Machine Learning in Genomics and Healthcare
Machine Learning Theory
Probabilistic Models

Biography

I am an Associate Professor in the Department of Mathematics and Statistics at McGill University, with affiliations as an Associate Member of the School of Computer Science and the Quantitative Life Science program.

My research spans three interconnected themes: statistical machine learning, drug discovery, and computational genomics and healthcare. In statistical machine learning, I focus on developing causality-inspired methods, dimensionality reduction, and probabilistic models to address complex high-dimensional data challenges. In drug discovery, I develop machine learning models to accelerate drug candidate identification and improve the understanding of drug efficacy and safety. In computational genomics and healthcare, I develop techniques to analyze genomic data, identify biomarkers, and explore the genetic basis of diseases, with the aim of improving precision medicine and predicting patient outcomes. My overarching goal is to bridge advanced data-driven methodologies with impactful applications in pharmacology, genomics, and healthcare.

Prospective graduate students interested in working with me should apply to both Mila - Quebec Artificial Intelligence Institute and the Department of Mathematics and Statistics at McGill. Alternatively, applicants may consider co-supervision opportunities with advisors from the computer science program at McGill.

Current Students

PhD - McGill University
Co-supervisor :
PhD - McGill University
PhD - McGill University
Postdoctorate - McGill University

Publications

Structured Learning in Time-dependent Cox Models
Guanbo Wang, Yi Lian, Robert W. Platt, Rui Wang, Sylvie Perreault, Marc Dorais, Mireille E. Schnitzer

Machine Learning Informed Diagnosis for Congenital Heart Disease in Large Claims Data Source
Ariane Marelli, Chao Li, Aihua Liu, Hanh Nguyen, Harry Moroz, James M. Brophy, Liming Guo

Privacy-preserving analysis of time-to-event data under nested case-control sampling
Lamin Juwara, Ana M Velly, Paramita Saha-Chaudhuri

Accelerating Generalized Random Forests with Fixed-Point Trees
David L. Fleischer, David A. Stephens

A Tweedie Compound Poisson Model in Reproducing Kernel Hilbert Space
Yi Lian, Boxiang Wang, Peng Shi, Robert William Platt
Abstract: Tweedie models can be used to analyze nonnegative continuous data with a probability mass at zero. There have been wide applications in natural science, healthcare research, actuarial science, and other fields. The performance of existing Tweedie models can be limited on today’s complex data problems with challenging characteristics such as nonlinear effects, high-order interactions, high-dimensionality and sparsity. In this article, we propose a kernel Tweedie model, Ktweedie, and its sparse variant, SKtweedie, that can simultaneously address the above challenges. Specifically, nonlinear effects and high-order interactions can be flexibly represented through a wide range of kernel functions, which is fully learned from the data. In addition, while the Ktweedie can handle high-dimensional data, the SKtweedie with integrated variable selection can further improve the interpretability. We perform extensive simulation studies to justify the prediction and variable selection accuracy of our method, and demonstrate the applications in ratemaking and loss-reserving in general insurance. Overall, the Ktweedie and SKtweedie outperform existing Tweedie models when there exist nonlinear effects and high-order interactions, particularly when the dimensionality is high relative to the sample size. The model is implemented in an efficient and user-friendly R package ktweedie (https://cran.r-project.org/package=ktweedie).