Jessie Huang

Alumni

Publications

Single-cell analysis reveals inflammatory interactions driving macular degeneration
Manik Kuchroo
Marcello DiStasio
Eric Song
Eda Calapkulu
Maryam Ige
Amar H. Sheth
Abdelilah Majdoubi
Madhvi Menon
Abhinav Godavarthi
Yu Xing
Scott Gigante
Holly Steach
Janhavi Narain
Kisung You
George Mourgkos … (see 6 more)
Rahul M. Dhodapkar
Matthew J. Hirn
Bastian Rieck
Brian P. Hafler
Due to commonalities in pathophysiology, age-related macular degeneration (AMD) represents a uniquely accessible model to investigate therapies for neurodegenerative diseases, leading us to examine whether pathways of disease progression are shared across neurodegenerative conditions. Here we use single-nucleus RNA sequencing to profile lesions from 11 postmortem human retinas with age-related macular degeneration and 6 control retinas with no history of retinal disease. We create a machine-learning pipeline based on recent advances in data geometry and topology and identify activated glial populations enriched in the early phase of disease. Examining single-cell data from Alzheimer’s disease and progressive multiple sclerosis with our pipeline, we find a similar glial activation profile enriched in the early phase of these neurodegenerative diseases. In late-stage age-related macular degeneration, we identify a microglia-to-astrocyte signaling axis mediated by interleukin-1β which drives angiogenesis characteristic of disease pathogenesis. We validated this mechanism using in vitro and in vivo assays in mouse, identifying a possible new therapeutic target for AMD and possibly other neurodegenerative conditions. Thus, due to shared glial states, the retina provides a potential system for investigating therapeutic approaches in neurodegenerative diseases.
Multi-view manifold learning of human brain-state trajectories.
Erica L. Busch
Andrew Benz
Tom Wallenstein
Nicholas B. Turk-Browne
The complexity of the human brain gives the illusion that brain activity is intrinsically high-dimensional. Nonlinear dimensionality-reduction methods such as uniform manifold approximation and t-distributed stochastic neighbor embedding have been used for high-throughput biomedical data. However, they have not been used extensively for brain activity data such as those from functional magnetic resonance imaging (fMRI), primarily due to their inability to maintain dynamic structure. Here we introduce a nonlinear manifold learning method for time-series data, including those from fMRI, called temporal potential of heat-diffusion for affinity-based transition embedding (T-PHATE). In addition to recovering a low-dimensional intrinsic manifold geometry from time-series data, T-PHATE exploits the data’s autocorrelative structure to faithfully denoise and unveil dynamic trajectories. We empirically validate T-PHATE on three fMRI datasets, showing that it greatly improves data visualization, classification, and segmentation of the data relative to several other state-of-the-art dimensionality-reduction benchmarks. These improvements suggest many potential applications of T-PHATE to other high-dimensional datasets of temporally diffuse processes.
Learning Shared Neural Manifolds from Multi-Subject fMRI Data
Erica L. Busch
Tom Wallenstein
Michal Gerasimiuk
Andrew Benz
Nicholas B. Turk-Browne
Functional magnetic resonance imaging (fMRI) is a notoriously noisy measurement of brain activity because of the large variations between individuals, signals marred by environmental differences during collection, and spatiotemporal averaging required by the measurement resolution. In addition, the data is extremely high dimensional, with the space of the activity typically having much lower intrinsic dimension. In order to understand the connection between stimuli of interest and brain activity, and analyze differences and commonalities between subjects, it becomes important to learn a meaningful embedding of the data that denoises, and reveals its intrinsic structure. Specifically, we assume that while noise varies significantly between individuals, true responses to stimuli will share common, low-dimensional features between subjects which are jointly discoverable. Similar approaches have been exploited previously but they have mainly used linear methods such as PCA and shared response modeling (SRM). In contrast, we propose a neural network called MRMD-AE (manifold-regularized multiple decoder, autoencoder), that learns a common embedding from multiple subjects in an experiment while retaining the ability to decode to individual raw fMRI signals. We show that our learned common space represents an extensible manifold (where new points not seen during training can be mapped), improves the classification accuracy of stimulus features of unseen timepoints, as well as improves cross-subject translation of fMRI signals. We believe this framework can be used for many downstream applications such as guided brain-computer interface (BCI) training in the future.
Population Genomics Approaches for Genetic Characterization of SARS-CoV-2 Lineages
Isabel Gamache
Arnaud N'Guessan
Justin Pelletier
Carmen Lia Murall
Vanda Gaonac’h-Lovejoy
David J. Hamelin
Raphaël Poujol
Jean-Christophe Grenier
Martin Smith
Etienne Caron
Morgan Craig
B. Jesse Shapiro
Julie G. Hussin
The genome of the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), the pathogen that causes coronavirus disease 2019 (COVID-19), has been sequenced at an unprecedented scale leading to a tremendous amount of viral genome sequencing data. To assist in tracing infection pathways and designing preventive strategies, a deep understanding of the viral genetic diversity landscape is needed. We present here a set of genomic surveillance tools from population genetics which can be used to better understand the evolution of this virus in humans. To illustrate the utility of this toolbox, we detail an in-depth analysis of the genetic diversity of SARS-CoV-2 in the first year of the COVID-19 pandemic. We analyzed 329,854 high-quality consensus sequences published in the GISAID database during the pre-vaccination phase. We demonstrate that, compared to standard phylogenetic approaches, haplotype networks can be computed efficiently on much larger datasets. This approach enables real-time lineage identification, a clear description of the relationship between variants of concern, and efficient detection of recurrent mutations. Furthermore, time series change of Tajima's D by haplotype provides a powerful metric of lineage expansion. Finally, principal component analysis (PCA) highlights key steps in variant emergence and facilitates the visualization of genomic variation in the context of SARS-CoV-2 diversity. The computational framework presented here is simple to implement and insightful for real-time genomic surveillance of SARS-CoV-2 and could be applied to any pathogen that threatens the health of populations of humans and other organisms.
Data-driven approaches for genetic characterization of SARS-CoV-2 lineages
Isabel Gamache
Arnaud N’Guessan
Justin Pelletier
Carmen Lia Murall
Raphaël Poujol
Jean-Christophe Grenier
Martin Smith
Etienne Caron
Morgan Craig
Jesse Shapiro
Julie G. Hussin
The genome of the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), the pathogen that causes coronavirus disease 2019 (COVID-19), has been sequenced at an unprecedented scale, leading to a tremendous amount of viral genome sequencing data. To understand the evolution of this virus in humans, and to assist in tracing infection pathways and designing preventive strategies, we present a set of computational tools that span phylogenomics, population genetics and machine learning approaches. To illustrate the utility of this toolbox, we detail an in-depth analysis of the genetic diversity of SARS-CoV-2 in the first year of the COVID-19 pandemic, using 329,854 high-quality consensus sequences published in the GISAID database during the pre-vaccination phase. We demonstrate that, compared to standard phylogenetic approaches, haplotype networks can be computed efficiently on much larger datasets, enabling real-time analyses. Furthermore, time series change of Tajima’s D provides a powerful metric of population expansion. Unsupervised learning techniques further highlight key steps in variant detection and facilitate the study of the role of this genomic variation in the context of SARS-CoV-2 infection, with Multiscale PHATE methodology identifying fine-scale structure in the SARS-CoV-2 genetic data that underlies the emergence of key lineages. The computational framework presented here is useful for real-time genomic surveillance of SARS-CoV-2 and could be applied to any pathogen that threatens the health of worldwide populations of humans and other organisms.
Multiscale PHATE Exploration of SARS-CoV-2 Data Reveals Multimodal Signatures of Disease
Manik Kuchroo
Patrick Wong
Jean-Christophe Grenier
Dennis Shung
Carolina Lucas
Jon Klein
Daniel B. Burkhardt
Scott Gigante
Abhinav Godavarthi
Benjamin Israelow
Tianyang Mao
Ji Eun Oh
Julio Silva
Takehiro Takahashi
Camila D. Odio
Arnau Casanovas-Massana
John Fournier
Shelli Farhadian … (see 7 more)
Charles S. Dela Cruz
Albert I. Ko
F. Perry Wilson
Akiko Iwasaki
The biomedical community is producing increasingly high dimensional datasets, integrated from hundreds of patient samples, which current computational techniques struggle to explore. To uncover biological meaning from these complex datasets, we present an approach called Multiscale PHATE, which learns abstracted biological features from data that can be directly predictive of disease. Built on a coarse graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse levels for high level summarizations of data, as well as at fine levels for detailed representations on subsets. We apply Multiscale PHATE to study the immune response to COVID-19 in 54 million cells from 168 hospitalized patients. Through our analysis of patient samples, we identify CD16-hi,CD66b-lo neutrophil and IFNγ+,GranzymeB+ Th17 cell responses enriched in patients who die. Furthermore, we show that population groupings Multiscale PHATE discovers can be directly fed into a classifier to predict disease outcome. We also use Multiscale PHATE-derived features to construct two different manifolds of patients, one from abstracted flow cytometry features and another directly on patient clinical features, both associating immune subsets and clinical markers with outcome.

Learning Safe Policies with Expert Guidance
Je-chun Huang
Fa Wu
Yang Cai
We propose a framework for ensuring safe behavior of a reinforcement learning agent when the reward function may be difficult to specify. In order to do this, we rely on the existence of demonstrations from expert policies, and we provide a theoretical framework for the agent to optimize in the space of rewards consistent with its existing knowledge. We propose two methods to solve the resulting optimization: an exact ellipsoid-based method and a method in the spirit of the "follow-the-perturbed-leader" algorithm. Our experiments demonstrate the behavior of our algorithm in both discrete and continuous problems. The trained agent safely avoids states with potential negative effects while imitating the behavior of the expert in the other states.