Smita Krishnaswamy

Publications

InfoGain Wavelets: Furthering the Design of Diffusion Wavelets for Graph-Structured Data

David R. Johnson

Michael Perlmutter

2025-04-08

ArXiv (preprint)

Latent Representation Learning for Multimodal Brain Activity Translation

Arman Afrasiyabi

Dhananjay Bhaskar

Erica Lindsey Busch

Laurent Caplette

Rahul Singh

Guillaume Lajoie

Nicholas B Turk-Browne

Neuroscience employs diverse neuroimaging techniques, each offering distinct insights into brain activity, from electrophysiological recordings such as EEG, which have high temporal resolution, to hemodynamic modalities such as fMRI, which have increased spatial precision. However, integrating these heterogeneous data sources remains a challenge, which limits a comprehensive understanding of brain function. We present the Spatiotemporal Alignment of Multimodal Brain Activity (SAMBA) framework, which bridges the spatial and temporal resolution gaps across modalities by learning a unified latent space free of modality-specific biases. SAMBA introduces a novel attention-based wavelet decomposition for spectral filtering of electrophysiological recordings, graph attention networks to model functional connectivity between functional brain units, and recurrent layers to capture temporal autocorrelations in brain signals. We show that training SAMBA, aside from achieving translation, also yields a rich representation of brain information processing. We showcase this by classifying external stimuli driving brain activity from the representation learned in hidden layers of SAMBA, paving the way for broad downstream applications in neuroscience research and clinical contexts.

2025-04-06

ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

Accelerated learning of a noninvasive human brain-computer interface via manifold geometry

Erica Lindsey Busch

E. Chandra Fincke

Guillaume Lajoie

Nicholas B Turk-Browne

2025-04-03

bioRxiv (preprint)

InfoGain Wavelets: Furthering the Design of Diffusion Wavelets for Graph-Structured Data

David R. Johnson

Michael Perlmutter

Diffusion wavelets extract information from graph signals at different scales of resolution by utilizing graph diffusion operators raised to various powers, known as diffusion scales. Traditionally, the diffusion scales are chosen to be dyadic integers,

2025-04-01

arXiv (published)
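
The abstract above mentions that diffusion scales are traditionally dyadic. As a rough illustration of that traditional construction only (not of the InfoGain method, which the truncated abstract does not describe), the sketch below builds wavelets as differences of dyadic powers of a diffusion operator. The lazy random walk operator, the function names `lazy_random_walk` and `dyadic_diffusion_wavelets`, and the toy cycle graph are all assumptions made for this example.

```python
import numpy as np

def lazy_random_walk(adjacency):
    """Assumed diffusion operator: P = (I + A D^{-1}) / 2, column-stochastic."""
    adjacency = np.asarray(adjacency, dtype=float)
    degrees = adjacency.sum(axis=0)
    # Dividing by a 1-D array scales column j by 1 / degrees[j], i.e. A D^{-1}.
    return 0.5 * (np.eye(len(adjacency)) + adjacency / degrees)

def dyadic_diffusion_wavelets(P, num_scales):
    """Return the wavelets [I - P, P - P^2, P^2 - P^4, ...] and the low-pass P^(2^J)."""
    n = P.shape[0]
    wavelets = [np.eye(n) - P]
    current = P  # holds P^(2^j), starting at j = 0
    for _ in range(num_scales):
        squared = current @ current  # P^(2^(j+1))
        wavelets.append(current - squared)
        current = squared
    return wavelets, current

# Toy example: a 6-node cycle graph with a delta signal at node 0.
A = np.zeros((6, 6))
for i in range(6):
    A[i, (i + 1) % 6] = 1.0
    A[(i + 1) % 6, i] = 1.0

P = lazy_random_walk(A)
wavelets, low_pass = dyadic_diffusion_wavelets(P, num_scales=3)

signal = np.zeros(6)
signal[0] = 1.0
# One row of coefficients per wavelet scale.
coefficients = np.stack([W @ signal for W in wavelets])
```

Because consecutive powers cancel, the wavelets and the low-pass operator sum to the identity, so in this sketch the coefficients at all scales together with the low-pass output reconstruct the original signal.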

HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

Siddharth Viswanath

Hiren Madhu

Dhananjay Bhaskar

Jake Kovalic

David R. Johnson

Christopher Tape

Ian Adelstein

Rex Ying

Michael Perlmutter

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data, which can have very high dimensionality, exceeding the capabilities of existing methods for point clouds, which are mostly tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one dataset for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point cloud as a set of higher-order simplicial complexes, with each particular complex being created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically. We showcase the utility of HiPoNet on point-cloud-level tasks, involving classification and regression of entire point clouds in data cohorts. Experimentally, we find that HiPoNet outperforms other point-cloud and graph-based models on single-cell data. We also apply HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.

2025-02-11

ArXiv (preprint)

HiPoNet: A Topology-Preserving Multi-View Neural Network For High Dimensional Point Cloud and Single-Cell Data

Siddharth Viswanath

Hiren Madhu

Dhananjay Bhaskar

Jake Kovalic

David R. Johnson

Rex Ying

Christopher Tape

Ian Adelstein

Michael Perlmutter

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Single-cell data can have high dimensionality, exceeding the capabilities of existing point-cloud methods tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric information. In contrast, HiPoNet forms higher-order simplicial complexes through learnable feature reweighting, generating multiple data views that disentangle distinct biological processes. It then employs simplicial wavelet transforms to extract multi-scale features, capturing both local and global topology. We empirically show that these components preserve topological information in the learned representations, and that HiPoNet significantly outperforms state-of-the-art point-cloud and graph-based models on single-cell data. We also show an application of HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.

2025-02-11

ArXiv (preprint)

Principal Curvatures Estimation with Applications to Single Cell Data

Yanlei Zhang

Lydia Mezrag

Xingzhi Sun

Charles Xu

Kincaid MacDonald

Dhananjay Bhaskar

Bastian Rieck

The rapidly growing field of single-cell transcriptomic sequencing (scRNAseq) presents challenges for data analysis due to its massive datasets. A common approach in manifold learning is to hypothesize that datasets lie on a lower-dimensional manifold. This makes it possible to study the geometry of point clouds by extracting meaningful descriptors such as curvature. In this work, we present Adaptive Local PCA (AdaL-PCA), a data-driven method for accurately estimating various notions of intrinsic curvature on data manifolds, in particular principal curvatures for surfaces. The model relies on local PCA to estimate the tangent spaces. The evaluation of AdaL-PCA on sampled surfaces shows state-of-the-art results. Combined with a PHATE embedding, the model applied to single-cell RNA sequencing data allows us to identify key variations in cellular differentiation.

2025-02-06

ArXiv (preprint)
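
The abstract above states that the model relies on local PCA to estimate tangent spaces. The sketch below illustrates only that generic step, using a fixed number of neighbors on a surface in 3D; it is not AdaL-PCA itself, whose adaptive procedure and curvature estimation are not described in the abstract. The function name `local_pca_tangent`, the parameter `k`, and the sampled unit sphere are assumptions made for this example.

```python
import numpy as np

def local_pca_tangent(points, index, k=15):
    """Estimate a tangent basis and a unit normal at points[index] via plain local PCA."""
    center = points[index]
    distances = np.linalg.norm(points - center, axis=1)
    # Position 0 of the sorted order is the query point itself (distance 0), so skip it.
    neighbor_ids = np.argsort(distances)[1:k + 1]
    offsets = points[neighbor_ids] - center
    # Local second-moment matrix around the query point.
    covariance = offsets.T @ offsets / k
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    normal = eigenvectors[:, 0]          # direction of least local variance
    tangent_basis = eigenvectors[:, 1:]  # the other two directions span the tangent plane
    return tangent_basis, normal

# Toy example: points sampled uniformly on the unit sphere.
rng = np.random.default_rng(0)
points = rng.normal(size=(2000, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)

tangent_basis, normal = local_pca_tangent(points, index=0)
# On a unit sphere the true normal at a point is the point itself (up to sign).
alignment = abs(normal @ points[0])
```

On the sphere, `alignment` should be close to 1 when the neighborhood is small relative to the radius; a fixed `k` is the simplest choice here and is exactly the part that an adaptive method would replace.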

Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Xingzhi Sun

Danqi Liao

Kincaid MacDonald

Yanlei Zhang

Chen Liu

Guillaume Huguet

Ian Adelstein

Tim G. J. Rudner

2025-01-22

aistats.org/AISTATS/2025/Conference (poster)

openreview.net

Geometry-Aware Generative Autoencoder for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Xingzhi Sun

Danqi Liao

Kincaid MacDonald

Yanlei Zhang

Guillaume Huguet

Ian Adelstein

Tim G. J. Rudner

Rapid growth of high-dimensional datasets in fields such as single-cell RNA sequencing and spatial genomics has led to unprecedented opportunities for scientific discovery, but it also presents unique computational and statistical challenges. Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. GAGA constructs a neural network embedding space that respects the intrinsic geometries discovered by manifold learning and learns a novel warped Riemannian metric on the data space. This warped metric is derived from both the points on the data manifold and negative samples off the manifold, allowing it to characterize a meaningful geometry across the entire latent space. Using this metric, GAGA can uniformly sample points on the manifold, generate points along geodesics, and interpolate between populations across the learned manifold. GAGA shows competitive performance in simulated and real-world datasets, including a 30% improvement over SOTA in single-cell population-level trajectory inference.

2025-01-22

aistats.org/AISTATS/2025/Conference (poster)

openreview.net

Mapping the gene space at single-cell resolution with gene signal pattern analysis

Aarthi Venkat

Martina Damo

Samuel Leone

Scott E. Youlten

Nikhil S. Joshi

Eric Fagerberg

John Attanasio

Michael Perlmutter

In single-cell sequencing analysis, several computational methods have been developed to map the cellular state space, but little has been done to map or create embeddings of the gene space. Here, we formulate the gene embedding problem, design tasks with simulated single-cell data to evaluate representations, and establish ten relevant baselines. We then present a graph signal processing approach we call gene signal pattern analysis (GSPA) that learns rich gene representations from single-cell data using a dictionary of diffusion wavelets on the cell-cell graph. GSPA enables characterization of genes based on their patterning on the cellular manifold. It also captures how localized or diffuse the expression of a gene is, for which we present a score called the gene localization score. We motivate and demonstrate the efficacy of GSPA as a framework for a range of biological tasks, such as capturing gene coexpression modules, condition-specific enrichment, and perturbation-specific gene-gene interactions. Then, we showcase the broad utility of gene representations derived from GSPA, including for cell-cell communication (GSPA-LR), spatial transcriptomics (GSPA-multimodal), and patient response (GSPA-Pt) analysis.

2024-12-20

Nature Computational Science (published)