
Smita Krishnaswamy

Affiliate Member
Associate Professor, Yale University
Université de Montréal
Research Topics
AI in Health
Brain-computer Interfaces
Cognitive Science
Computational Biology
Computational Neuroscience
Data Geometry
Data Science
Data Sparsity
Deep Learning
Dynamical Systems
Generative Models
Geometric Deep Learning
Graph Neural Networks
Information Theory
Manifold Learning
Molecular Modeling
Representation Learning
Spectral Learning

Biography

Our lab develops foundational mathematical machine learning and deep learning methods that incorporate graph-based learning, signal processing, information theory, data geometry and topology, optimal transport, and dynamics modeling. These methods enable exploratory analysis, scientific inference, interpretation, and hypothesis generation on big biomedical datasets, ranging from single-cell data to brain imaging to molecular structural datasets, arising from neuroscience, psychology, stem cell biology, cancer biology, healthcare, and biochemistry. Our work has been instrumental in dynamic trajectory learning from static snapshot data, data denoising, visualization, network inference, molecular structure modeling, and more.

Publications

Beta cells are essential drivers of pancreatic ductal adenocarcinoma development
Cathy C. Garcia
Aarthi Venkat
Daniel C. McQuaid
Sherry Agabiti
Rebecca L. Cardone
Rebecca Starble
Akin Sogunro
Jeremy B. Jacox
Christian F. Ruiz
Richard G. Kibbey
Mandar Deepak Muzumdar
Pancreatic endocrine-exocrine crosstalk plays a key role in normal physiology and disease. For instance, endocrine islet beta (β) cell secretion of insulin or cholecystokinin (CCK) promotes progression of pancreatic ductal adenocarcinoma (PDAC), an exocrine cell-derived tumor. However, the cellular and molecular mechanisms that govern endocrine-exocrine signaling in tumorigenesis remain incompletely understood. We find that β cell ablation impedes PDAC development in mice, arguing that the endocrine pancreas is critical for exocrine tumorigenesis. Conversely, obesity induces β cell hormone dysregulation, alters CCK-dependent peri-islet exocrine cell transcriptional states, and enhances islet proximal tumor formation. Single-cell RNA-sequencing, in silico latent-space archetypal and trajectory analysis, and genetic lineage tracing in vivo reveal that obesity stimulates postnatal immature β cell expansion and adaptation towards a pro-tumorigenic CCK+ state via JNK/cJun stress-responsive signaling. These results define endocrine-exocrine signaling as a driver of PDAC development and uncover new avenues to target the endocrine pancreas to subvert exocrine tumorigenesis.
Exploring the Manifold of Neural Networks Using Diffusion Geometry
Elliott Abel
Peyton Crevasse
Yvan Grinspan
Selma Mazioud
Folu Ogundipe
Kristof Reimann
Ellie Schueler
Andrew J. Steindl
Ellen Zhang
Dhananjay Bhaskar
Yanlei Zhang
Tim G. J. Rudner
Ian Adelstein
Drawing motivation from the manifold hypothesis, which posits that most high-dimensional data lies on or near low-dimensional manifolds, we apply manifold learning to the space of neural networks. We learn manifolds where datapoints are neural networks by introducing a distance between the hidden layer representations of the neural networks. These distances are then fed to the non-linear dimensionality reduction algorithm PHATE to create a manifold of neural networks. We characterize this manifold using features of the representation, including class separation, hierarchical cluster structure, spectral entropy, and topological structure. Our analysis reveals that high-performing networks cluster together in the manifold, displaying consistent embedding patterns across all these features. Finally, we demonstrate the utility of this approach for guiding hyperparameter optimization and neural architecture search by sampling from the manifold.
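The pipeline above, reduced to its essentials, is: define a distance between networks via their hidden representations, then embed the resulting distance matrix. The sketch below is a minimal stand-in, not the paper's method: the Gram-matrix distance and the use of classical MDS in place of PHATE are illustrative assumptions (the actual work uses the PHATE algorithm).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: hidden-layer representations of 6 "networks",
# each summarized as a 10-sample x 4-unit activation matrix.
reps = [rng.normal(size=(10, 4)) for _ in range(6)]

def rep_distance(a, b):
    """Toy distance between two networks' hidden representations:
    Frobenius distance between their Gram (sample-similarity) matrices,
    which is invariant to rotations of the hidden units."""
    return np.linalg.norm(a @ a.T - b @ b.T)

n = len(reps)
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        D[i, j] = rep_distance(reps[i], reps[j])

# Classical MDS as a minimal stand-in for PHATE: double-center the
# squared distances and embed via the top eigenvectors.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
w, V = np.linalg.eigh(B)
order = np.argsort(w)[::-1]
coords = V[:, order[:2]] * np.sqrt(np.maximum(w[order[:2]], 0))

print(coords.shape)  # one 2-D point per network
```

In practice one would swap the MDS step for the `phate` package's embedding, which is designed for exactly this kind of precomputed-distance input.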
Deep Learning Unlocks the True Potential of Organ Donation after Circulatory Death with Accurate Prediction of Time-to-Death
Xingzhi Sun
Edward De Brouwer
Chen Liu
Ramesh Batra
Increasing the number of organ donations after circulatory death (DCD) has been identified as one of the most important ways of addressing the ongoing organ shortage. While recent technological advances in organ transplantation have increased their success rate, a substantial challenge in increasing the number of DCD donations resides in the uncertainty regarding the timing of cardiac death after terminal extubation, which impacts the risk of prolonged ischemic organ injury and negatively affects post-transplant outcomes. In this study, we trained and externally validated an ODE-RNN model, which combines a recurrent neural network with neural ordinary differential equations and excels at processing irregularly sampled time series data. The model is designed to predict time-to-death following terminal extubation in the intensive care unit (ICU) using the last 24 hours of clinical observations. Our model was trained on a cohort of 3,238 patients from Yale New Haven Hospital, and validated on an external cohort of 1,908 patients from six hospitals across Connecticut. The model achieved accuracies of 95.3 ± 1.0% and 95.4 ± 0.7% for predicting whether death would occur in the first 30 and 60 minutes, respectively, with a calibration error of 0.024 ± 0.009. Heart rate, respiratory rate, mean arterial blood pressure (MAP), oxygen saturation (SpO2), and Glasgow Coma Scale (GCS) scores were identified as the most important predictors. Surpassing existing clinical scores, our model sets the stage for reduced organ acquisition costs and improved post-transplant outcomes.
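The core idea of an ODE-RNN is to evolve the hidden state continuously between irregularly spaced observations and apply a discrete RNN update at each observation. The toy cell below is a hedged sketch of that mechanism only: the dynamics function, weight shapes, and fixed-step Euler solver are illustrative assumptions, not the trained clinical model.

```python
import numpy as np

rng = np.random.default_rng(1)
H, X = 8, 3  # hidden size, observation size

# Hypothetical toy parameters (a real ODE-RNN would learn these).
Wf = rng.normal(scale=0.1, size=(H, H))   # dynamics of the hidden-state ODE
Wh = rng.normal(scale=0.1, size=(H, H))   # RNN update: hidden -> hidden
Wx = rng.normal(scale=0.1, size=(H, X))   # RNN update: observation -> hidden

def ode_rnn(times, obs, n_euler=10):
    """Minimal ODE-RNN sketch: between irregularly spaced observations,
    evolve the hidden state with fixed-step Euler on dh/dt = tanh(Wf h);
    at each observation, apply a vanilla RNN update."""
    h = np.zeros(H)
    t_prev = times[0]
    for t, x in zip(times, obs):
        dt = (t - t_prev) / n_euler
        for _ in range(n_euler):          # ODE solve between observations
            h = h + dt * np.tanh(Wf @ h)
        h = np.tanh(Wh @ h + Wx @ x)      # RNN update at the observation
        t_prev = t
    return h

# Irregularly sampled series: 5 observations at uneven times.
times = np.array([0.0, 0.4, 1.1, 1.3, 2.0])
obs = rng.normal(size=(5, X))
h_final = ode_rnn(times, obs)
print(h_final.shape)  # (8,)
```

A production version would use an adaptive ODE solver and a gated cell (e.g. GRU) for the observation update; the final hidden state would feed a prediction head for time-to-death.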
ImmunoStruct: Integration of protein sequence, structure, and biochemical properties for immunogenicity prediction and interpretation
Kevin B. Givechian
João F. Rocha
Edward Yang
Chen Liu
Kerrie Greene
Rex Ying
Etienne Caron
Akiko Iwasaki
Epitope-based vaccines are promising therapeutic modalities for infectious diseases and cancer, but identifying immunogenic epitopes is challenging. The vast majority of prediction methods are sequence-based and do not incorporate wide-scale structure data and biochemical properties across each peptide-MHC (pMHC) complex. We present ImmunoStruct, a deep-learning model that integrates sequence, structural, and biochemical information to predict multi-allele class-I pMHC immunogenicity. By leveraging a multimodal dataset of ∼27,000 peptide-MHC complexes that we generated with AlphaFold, we demonstrate that ImmunoStruct improves immunogenicity prediction performance and interpretability beyond existing methods, across infectious disease epitopes and cancer neoepitopes. We further show strong alignment with in vitro assay results for a set of SARS-CoV-2 epitopes. This work also presents a new architecture that incorporates equivariant graph processing and multimodal data integration for this long-standing task in immunotherapy.
ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics
Dhananjay Bhaskar
David R. Johnson
João F. Rocha
Egbert Castro
Jackson Grady
Alex T. Grigas
Michael Perlmutter
Corey S. O'Hern
Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.
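The geometric scattering transform referenced above builds multi-scale features by applying diffusion wavelets to signals on a graph, taking a modulus, and aggregating with statistical moments. The sketch below illustrates only that first-order construction on a toy graph; the cycle graph, single node signal, and choice of moments are illustrative assumptions, not the ProtSCAPE pipeline.

```python
import numpy as np

# Toy graph: a 6-node cycle (stand-in for a residue-contact graph).
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

# Lazy random-walk diffusion operator P = (I + A D^-1) / 2.
P = 0.5 * (np.eye(n) + A / A.sum(axis=0))

def scattering_features(x, scales=(1, 2, 3), moments=(1, 2)):
    """First-order geometric scattering sketch: diffusion wavelets
    Psi_j = P^(2^(j-1)) - P^(2^j), followed by modulus and moments."""
    feats = []
    for j in scales:
        Psi_j = (np.linalg.matrix_power(P, 2 ** (j - 1))
                 - np.linalg.matrix_power(P, 2 ** j))
        w = np.abs(Psi_j @ x)                          # wavelet coefficients, modulus
        feats.extend(np.sum(w ** q) for q in moments)  # statistical moments
    return np.array(feats)

x = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # a signal on the nodes
f = scattering_features(x)
print(f.shape)  # (6,) = 3 scales x 2 moments
```

In ProtSCAPE these scattering features (computed per MD frame) would feed the transformer attention modules rather than being used directly.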
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce A. Chew
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
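The implementation strategy described above, approximating the manifold by a sparse graph and then applying filter-combine layers, can be sketched minimally as follows. Everything here is an illustrative assumption rather than the paper's construction: the circle point cloud, the epsilon-graph weights, the specific polynomial filter, and the random combine matrix are all placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

# Point cloud sampled from a circle: a 1-D manifold in R^2.
m = 60
theta = rng.uniform(0, 2 * np.pi, size=m)
pts = np.c_[np.cos(theta), np.sin(theta)]

# Sparse epsilon-graph approximating the manifold.
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
eps = 0.15
W = np.exp(-d2 / eps) * (d2 < 4 * eps)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W                # graph Laplacian ~ manifold Laplacian

# One filter-combine layer (MFCN sketch): filter each channel with a
# polynomial spectral filter h(L) = I - 0.5 L, combine channels with a
# hypothetical weight matrix, then apply a pointwise nonlinearity.
F = pts.copy()                            # 2 input channels (the coordinates)
h_L = np.eye(m) - 0.5 * L                 # polynomial filter in L
Theta = rng.normal(size=(2, 4))           # combine: 2 channels -> 4 channels
out = np.maximum(h_L @ F @ Theta, 0.0)    # filter, combine, ReLU

print(out.shape)  # (60, 4)
```

The convergence result cited above concerns exactly this discretization: as the number of sampled points grows, the graph filter's action approaches that of the corresponding filter on the underlying manifold.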
Neuro-GSTH: A Geometric Scattering and Persistent Homology Framework for Uncovering Spatiotemporal Signatures in Neural Activity
Dhananjay Bhaskar
Jessica Moore
Yanlei Zhang
Feng Gao
Bastian Rieck
Helen Pushkarskaya
Firas Khasawneh
Elizabeth Munch
Valentina Greco
Christopher Pittenger
Understanding how neurons communicate and coordinate their activity is essential for unraveling the brain’s complex functionality. To analyze the intricate spatiotemporal dynamics of neural signaling, we developed Geometric Scattering Trajectory Homology (neuro-GSTH), a novel framework that captures time-evolving neural signals and encodes them into low-dimensional representations. Neuro-GSTH integrates geometric scattering transforms, which extract multiscale features from brain signals modeled on anatomical graphs, with t-PHATE, a manifold learning method that maps the temporal evolution of neural activity. Topological descriptors from computational homology are then applied to characterize the global structure of these neural trajectories, enabling the quantification and differentiation of spatiotemporal brain dynamics. We demonstrate the power of neuro-GSTH in neuroscience by applying it to both simulated and biological neural datasets. First, we used neuro-GSTH to analyze neural oscillatory behavior in the Kuramoto model, revealing its capacity to track the synchronization of neural circuits as coupling strength increases. Next, we applied neuro-GSTH to neural recordings from the visual cortex of mice, where it accurately reconstructed visual stimulus patterns such as sinusoidal gratings. Neuro-GSTH-derived neural trajectories enabled precise classification of stimulus properties like spatial frequency and orientation, significantly outperforming traditional methods in capturing the underlying neural dynamics. These findings demonstrate that neuro-GSTH effectively identifies neural motifs—distinct patterns of spatiotemporal activity—providing a powerful tool for decoding brain activity across diverse tasks, sensory inputs, and neurological disorders. Neuro-GSTH thus offers new insights into neural communication and dynamics, advancing our ability to map and understand complex brain functions.