Yannan Shen

Alumni

Publications

A three-state coupled Markov switching model for COVID-19 outbreaks across Quebec based on hospital admissions (preprint)
Dirk Douwes-Schultz
Alexandra M. Schmidt
David L Buckeridge
Development of a Framework for Establishing 'Gold Standard' Outbreak Data from Submitted SARS-CoV-2 Genome Samples
Russell Steele
Philip Abdelmalik
David L Buckeridge
Submitted genomic data for respiratory viruses reflect the emergence and spread of new variants. Although delays in submission limit the utility of these data for prospective surveillance, they may be useful for evaluating other surveillance sources. However, few studies have investigated the use of these data for evaluating aberration detection in surveillance systems. Our study used a Bayesian online change point detection algorithm (BOCP) to detect increases in the number of submitted genome samples as a means of establishing 'gold standard' dates of outbreak onset in multiple countries. We compared models using different data transformations and parameter values. BOCP detected change points that were not sensitive to different parameter settings. We also found data transformations were essential prior to change point detection. Our study presents a framework for using global genomic submission data to develop 'gold standard' dates about the onset of outbreaks due to new viral variants.
BAND: Biomedical Alert News Dataset
Zihao Fu
Meiru Zhang
Zaiqiao Meng
Anya Okhmatovskaia
David L Buckeridge
Nigel Collier
BioCaster in 2021: automatic disease outbreaks detection from global news media
Zaiqiao Meng
Anya Okhmatovskaia
Maxime Polleri
Guido Powell
Zihao Fu
Iris Ganser
Meiru Zhang
Nicholas B. King
Nigel Collier
SUMMARY: BioCaster was launched in 2008 to provide an ontology-based text mining system for early disease detection from open news sources. Following a 6-year break, we have re-launched the system in 2021. Our goal is to systematically upgrade the methodology using state-of-the-art neural network language models, whilst retaining the original benefits that the system provided in terms of logical reasoning and automated early detection of infectious disease outbreaks. Here, we present recent extensions such as neural machine translation in 10 languages, neural classification of disease outbreak reports and a new cloud-based visualization dashboard. Furthermore, we discuss our vision for further improvements, including combining risk assessment with event semantics and assessing the risk of outbreaks with multi-granularity. We hope that these efforts will benefit the global public health community. AVAILABILITY AND IMPLEMENTATION: BioCaster web-portal is freely accessible at http://biocaster.org.
A Conceptual Framework for Representing Events Under Public Health Surveillance.
Anya Okhmatovskaia
Iris Ganser
Nigel Collier
Nicholas B. King
Zaiqiao Meng
David L. Buckeridge
Information integration across multiple event-based surveillance (EBS) systems has been shown to improve global disease surveillance in experimental settings. In practice, however, integration does not occur due to the lack of a common conceptual framework for encoding data within EBS systems. We aim to address this gap by proposing a candidate conceptual framework for representing events and related concepts in the domain of public health surveillance.
Monitoring non-pharmaceutical public health interventions during the COVID-19 pandemic
Guido Powell
Iris Ganser
Qulu Zheng
Chris Grundy
Anya Okhmatovskaia
David L. Buckeridge
Measuring and monitoring non-pharmaceutical interventions is important yet challenging due to the need to clearly define and encode non-pharmaceutical interventions, to collect geographically and socially representative data, and to accurately document the timing at which interventions are initiated and changed. These challenges highlight the importance of integrating and triangulating across multiple databases and the need to expand and fund the mandate for public health organizations to track interventions systematically.
The role of case importation in explaining differences in early SARS-CoV-2 transmission dynamics in Canada—A mathematical modeling study of surveillance data
Arnaud Godin
Yiqing Xia
David L Buckeridge
Sharmistha Mishra
Dirk Douwes-Schultz
Maxime Lavigne
Mélanie Drolet
Alexandra M Schmidt
Marc Brisson
Mathieu Maheu-Giroux
Global Surveillance of COVID-19 by mining news media using a multi-source dynamic embedded topic model.
Zhi Wen
Imane Chafi
Anya Okhmatovskaia
Guido Powell
David L. Buckeridge
As the COVID-19 pandemic continues to unfold, understanding the global impact of non-pharmacological interventions (NPI) is important for formulating effective intervention strategies, particularly as many countries prepare for future waves. We used a machine learning approach to distill latent topics related to NPI from large-scale international news media. We hypothesize that these topics are informative about the timing and nature of implemented NPI, dependent on the source of the information (e.g., local news versus official government announcements) and the target countries. Given a set of latent topics associated with NPI (e.g., self-quarantine, social distancing, online education, etc.), we assume that countries and media sources have different prior distributions over these topics, which are sampled to generate the news articles. To model the source-specific topic priors, we developed a semi-supervised, multi-source, dynamic, embedded topic model. Our model is able to simultaneously infer latent topics and learn a linear classifier to predict NPI labels using the topic mixtures as input for each news article. To learn these models, we developed an efficient end-to-end amortized variational inference algorithm. We applied our models to news data collected and labelled by the World Health Organization (WHO) and the Global Public Health Intelligence Network (GPHIN). Through comprehensive experiments, we observed superior topic quality and intervention prediction accuracy, compared to the baseline embedded topic models, which ignore information on media source and intervention labels. The inferred latent topics reveal distinct policies and media framing in different countries and media sources, and also characterize reaction to COVID-19 and NPI in a semantically meaningful manner. Our PyTorch code is available on GitHub (https://github.com/li-lab-mcgill/covid19_media).