David Buckeridge

Associate Academic Member
Full Professor, McGill University, Department of Epidemiology, Biostatistics and Occupational Health
Research Topics
Medical Machine Learning

Biography

David Buckeridge is a professor at the School of Population and Global Health at McGill University, as well as chief digital health officer for the McGill University Health Centre and executive scientific director of the Public Health Agency of Canada.

A Tier 1 Canada Research Chair in Health Informatics and Data Science, Buckeridge has projected health system demand for the Canadian province of Quebec, led data management and analytics for the Canadian Immunity Task Force, and supported the World Health Organization in monitoring global immunity to SARS-CoV-2. He has an MD from Queen's University, an MSc in epidemiology from the University of Toronto and a PhD in biomedical informatics from Stanford University. He is a Fellow of the Royal College of Physicians of Canada.

Current Students

Master's Research - McGill University
PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Master's Research - McGill University

Publications

The impact of statistical adjustment for assay performance on inferences from SARS-CoV-2 serological surveillance studies
Jiacheng Chen
Yuan Yu
Sheila F O’Brien
Carmen Charlton
Steven J. Drews
Jane M Heffernan
Amber M Smith
Y. Nakagama
Yasutoshi Kido
W Alton Russell
Choice of immunoassay influences population seroprevalence estimates. Post-hoc adjustments for assay performance could improve comparability of estimates across studies and enable pooled analyses. We assessed post-hoc adjustment methods using data from 2021–2023 SARS-CoV-2 serosurveillance studies in Alberta, Canada: one that tested 124,008 blood donations using Roche immunoassays (SARS-CoV-2 nucleocapsid total antibody and anti-SARS-CoV-2 S) and another that tested 214,780 patient samples using Abbott immunoassays (SARS-CoV-2 IgG and anti-SARS-CoV-2 S). Comparing datasets, seropositivity for antibodies against nucleocapsid (anti-N) diverged after May 2022 due to differential loss of sensitivity as a function of time since infection. The commonly used Rogan-Gladen adjustment did not reduce this divergence. Regression-based adjustments using the assays’ semi-quantitative results produced more similar estimates of anti-N seroprevalence and rolling incidence proportion (proportion of individuals infected in recent months). Seropositivity for antibodies targeting SARS-CoV-2 spike protein was similar without adjustment, and concordance was not improved when applying an alternative, functional threshold. These findings suggest that assay performance substantially impacted population inferences from SARS-CoV-2 serosurveillance studies in the Omicron period. Unlike methods that ignore time-varying assay sensitivity, regression-based methods using the semi-quantitative assay resulted in increased concordance in estimated anti-N seropositivity and rolling incidence between cohorts using different assays.
FedWeight: mitigating covariate shift of federated learning on electronic health records data through patients re-weighting
Mike He Zhu
Na Li
Xiaoxiao Li
Dianbo Liu
A three-state coupled Markov switching model for COVID-19 outbreaks across Quebec based on hospital admissions
Dirk Douwes-Schultz
Alexandra M. Schmidt
Characterizing co-purchased food products with soda, fresh fruits, and fresh vegetables using loyalty card purchasing data in Montréal, Canada, 2015–2017
Hiroshi Mamiya
Kody Crowell
Catherine L. Mah
Amélie Quesnel-Vallée
Aman Verma
Sociodemographic characteristics of SARS-CoV-2 serosurveillance studies with diverse recruitment strategies, Canada, 2020 to 2023
Matthew J Knight
Yuan Yu
Jiacheng Chen
Sheila F O’Brien
Carmen Charlton
W Alton Russell
Background. Serological testing was a key component of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) surveillance. Social distancing interventions, resource limitations, and the need for timely data led to serosurveillance studies using a range of recruitment strategies, which likely influenced study representativeness. Characterizing representativeness in surveillance is crucial to identify gaps in sampling coverage and to assess health inequities. Methods. We retrospectively analyzed three pre-existing longitudinal cohorts, two convenience samples using residual blood, and one de novo probabilistic survey conducted in Canada between April 2020 and November 2023. We calculated study specimen counts by age, sex, urbanicity, race/ethnicity, and neighborhood deprivation quintiles. We derived a 'representation ratio' as a simple metric to assess generalizability to a target population and various sociodemographic strata. Results. The six studies included 1,321,675 specimens. When stratifying by age group and sex, 65% of racialized minority subgroups were moderately underrepresented (representation ratio 0.75). Representation was generally higher for older Canadians, urban neighborhoods, and neighborhoods with low material deprivation. Rural representation was highest in a study that used outpatient laboratory blood specimens. Racialized minority representation was highest in a de novo probabilistic survey cohort. Conclusion. While no study had adequate representation of all subgroups, less traditional recruitment strategies were more representative of some population dimensions. Understanding demographic representativeness and barriers to recruitment are important considerations when designing population health surveillance studies.
Extrapolatable Transformer Pre-training for Ultra Long Time-Series Forecasting
Qincheng Lu
Hao Xu
Mike He Zhu
MixEHR-Nest: Identifying Subphenotypes within Electronic Health Records through Hierarchical Guided-Topic Modeling
Ruohan Wang
Zilong Wang
Automatic subphenotyping from electronic health records (EHRs) provides numerous opportunities to understand diseases with unique subgroups and enhance personalized medicine for patients. However, existing machine learning algorithms either focus on specific diseases for better interpretability or produce coarse-grained phenotype topics without considering nuanced disease patterns. In this study, we propose a guided topic model, MixEHR-Nest, to infer sub-phenotype topics from thousands of diseases using multi-modal EHR data. Specifically, MixEHR-Nest detects multiple subtopics from each phenotype topic, whose prior is guided by the expert-curated phenotype concepts such as Phenotype Codes (PheCodes) or Clinical Classification Software (CCS) codes. We evaluated MixEHR-Nest on two EHR datasets: (1) the MIMIC-III dataset consisting of over 38 thousand patients from intensive care unit (ICU) from Beth Israel Deaconess Medical Center (BIDMC) in Boston, USA; (2) the healthcare administrative database PopHR, comprising 1.3 million patients from Montreal, Canada. Experimental results demonstrate that MixEHR-Nest can identify subphenotypes with distinct patterns within each phenotype, which are predictive for disease progression and severity. Consequently, MixEHR-Nest distinguishes between type 1 and type 2 diabetes by inferring subphenotypes using CCS codes, which do not differentiate these two subtype concepts. Additionally, MixEHR-Nest not only improved the prediction accuracy of short-term mortality of ICU patients and initial insulin treatment in diabetic patients but also revealed the contributions of subphenotypes. For longitudinal analysis, MixEHR-Nest identified subphenotypes of distinct age prevalence under the same phenotypes, such as asthma, leukemia, epilepsy, and depression. The MixEHR-Nest software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-Nest.
Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning
Qincheng Lu
Mike He Zhu
Learning time-series representations for discriminative tasks, such as classification and regression, has been a long-standing challenge in the healthcare domain. Current pre-training methods are limited in either unidirectional next-token prediction or randomly masked token prediction. We propose a novel architecture called Bidirectional Timely Generative Pre-trained Transformer (BiTimelyGPT), which pre-trains on biosignals and longitudinal clinical records by both next-token and previous-token prediction in alternating transformer layers. This pre-training task preserves original distribution and data shapes of the time-series. Additionally, the full-rank forward and backward attention matrices exhibit more expressive representation capabilities. Using biosignals and longitudinal clinical records, BiTimelyGPT demonstrates superior performance in predicting neurological functionality, disease diagnosis, and physiological signs. By visualizing the attention heatmap, we observe that the pre-trained BiTimelyGPT can identify discriminative segments from biosignal time-series sequences, even more so after fine-tuning on the task.
Evaluating the effectiveness of the Smart About Meds (SAM) mobile application among patients discharged from hospital: protocol of a randomised controlled trial
Robyn Tamblyn
Bettina Habib
Daniala L Weir
Elizaveta Frolova
Rolan Alattar
Jessica Rogozinsky
Caroline Beauchamp
Rosalba Pupo
Susan J Bartlett
Emily McDonald
Comparative evaluation of methodologies for estimating the effectiveness of non-pharmaceutical interventions in the context of COVID-19: a simulation study
Iris Ganser
Juliette Paireau
Simon Cauchemez
Rodolphe Thiébaut
M. Prague