
Smita Krishnaswamy

Affiliate Member
Associate Professor, Yale University
Université de Montréal
Yale
Research Topics
Representation Learning
Deep Learning
Geometric Deep Learning
Spectral Learning
Manifold Learning
Computational Biology
Data Geometry
AI in Health
Brain-Computer Interfaces
Generative Models
Molecular Modeling
Computational Neuroscience
Data Sparsity
Graph Neural Networks
Cognitive Science
Data Science
Dynamical Systems
Information Theory

Biography

Our lab develops foundational mathematical methods for machine learning and deep learning that incorporate graph-based learning, signal processing, information theory, data geometry and topology, optimal transport, and dynamics modeling. These methods support exploratory analysis, scientific inference, interpretation, and hypothesis generation on large biomedical datasets, ranging from single-cell data to brain imaging to molecular structural datasets, drawn from neuroscience, psychology, stem cell biology, cancer biology, healthcare, and biochemistry. Our work has been instrumental in learning dynamic trajectories from static snapshot data, data denoising, visualization, network inference, molecular structure modeling, and more.

Publications

Neurospectrum: A Geometric and Topological Deep Learning Framework for Uncovering Spatiotemporal Signatures in Neural Activity
Dhananjay Bhaskar
Jessica Moore
Feng Gao
Bastian Rieck
Firas Khasawneh
Elizabeth Munch
Valentina Greco
Neural signals are high-dimensional, noisy, and dynamic, making it challenging to extract interpretable features linked to behavior or disease. We introduce Neurospectrum, a framework that encodes neural activity as latent trajectories shaped by spatial and temporal structure. At each timepoint, signals are represented on a graph capturing spatial relationships, with a learnable attention mechanism highlighting important regions. These are embedded using graph wavelets and passed through a manifold-regularized autoencoder that preserves temporal geometry. The resulting latent trajectory is summarized using a principled set of descriptors (including curvature, path signatures, persistent homology, and recurrent networks) that capture multiscale geometric, topological, and dynamical features. These features drive downstream prediction in a modular, interpretable, and end-to-end trainable framework. We evaluate Neurospectrum on simulated and experimental datasets. It tracks phase synchronization in Kuramoto simulations, reconstructs visual stimuli from calcium imaging, and identifies biomarkers of obsessive-compulsive disorder in fMRI. Across tasks, Neurospectrum uncovers meaningful neural dynamics and outperforms traditional analysis methods.
ImmunoStruct: a multimodal neural network framework for immunogenicity prediction from peptide-MHC sequence, structure, and biochemical properties
Kevin Bijan Givechian
João Felipe Rocha
Edward Yang
Chen Liu
Kerrie Greene
Rex Ying
Etienne Caron
Akiko Iwasaki
InfoGain Wavelets: Furthering the Design of Diffusion Wavelets for Graph-Structured Data
David R. Johnson
Michael Perlmutter
Accelerated learning of a noninvasive human brain-computer interface via manifold geometry
Erica Lindsey Busch
E. Chandra Fincke
Nicholas B Turk-Browne
HiPoNet: A Topology-Preserving Multi-View Neural Network For High Dimensional Point Cloud and Single-Cell Data
Siddharth Viswanath
Hiren Madhu
Dhananjay Bhaskar
Jake Kovalic
Dave Johnson
Rex Ying
Christopher Tape
Ian Adelstein
Michael Perlmutter
In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Single-cell data can have high dimensionality, exceeding the capabilities of existing point cloud methods tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric information. In contrast, HiPoNet forms higher-order simplicial complexes through learnable feature reweighting, generating multiple data views that disentangle distinct biological processes. It then employs simplicial wavelet transforms to extract multi-scale features, capturing both local and global topology. We empirically show that these components preserve topological information in the learned representations, and that HiPoNet significantly outperforms state-of-the-art point-cloud and graph-based models on single-cell data. We also show an application of HiPoNet to spatial transcriptomics datasets, using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.
Principal Curvatures Estimation with Applications to Single Cell Data
Yanlei Zhang
Lydia Mezrag
Xingzhi Sun
Charles Xu
Kincaid MacDonald
Dhananjay Bhaskar
Bastian Rieck
The rapidly growing field of single-cell transcriptomic sequencing (scRNAseq) presents challenges for data analysis due to its massive datasets. A common approach in manifold learning is to hypothesize that datasets lie on a lower-dimensional manifold. This makes it possible to study the geometry of point clouds by extracting meaningful descriptors such as curvature. In this work, we present Adaptive Local PCA (AdaL-PCA), a data-driven method for accurately estimating various notions of intrinsic curvature on data manifolds, in particular principal curvatures for surfaces. The model relies on local PCA to estimate the tangent spaces. The evaluation of AdaL-PCA on sampled surfaces shows state-of-the-art results. Combined with a PHATE embedding, the model applied to single-cell RNA sequencing data allows us to identify key variations in cellular differentiation.
Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Chen Liu
Guillaume Huguet
Ian Adelstein
Tim G. J. Rudner
Geometry-Aware Generative Autoencoder for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Guillaume Huguet
Ian Adelstein
Tim G. J. Rudner
Rapid growth of high-dimensional datasets in fields such as single-cell RNA sequencing and spatial genomics has led to unprecedented opportunities for scientific discovery, but it also presents unique computational and statistical challenges. Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. GAGA constructs a neural network embedding space that respects the intrinsic geometries discovered by manifold learning and learns a novel warped Riemannian metric on the data space. This warped metric is derived from both the points on the data manifold and negative samples off the manifold, allowing it to characterize a meaningful geometry across the entire latent space. Using this metric, GAGA can uniformly sample points on the manifold, generate points along geodesics, and interpolate between populations across the learned manifold. GAGA shows competitive performance in simulated and real-world datasets, including a 30% improvement over SOTA in single-cell population-level trajectory inference.
Mapping the gene space at single-cell resolution with gene signal pattern analysis
Aarthi Venkat
Martina Damo
Samuel Leone
Scott E. Youlten
Nikhil S. Joshi
Eric Fagerberg
John Attanasio
Michael Perlmutter
In single-cell sequencing analysis, several computational methods have been developed to map the cellular state space, but little has been done to map or create embeddings of the gene space. Here, we formulate the gene embedding problem, design tasks with simulated single-cell data to evaluate representations, and establish ten relevant baselines. We then present a graph signal processing approach we call gene signal pattern analysis (GSPA) that learns rich gene representations from single-cell data using a dictionary of diffusion wavelets on the cell-cell graph. GSPA enables characterization of genes based on their patterning on the cellular manifold. It also captures how localized or diffuse the expression of a gene is, for which we present a score called the gene localization score. We motivate and demonstrate the efficacy of GSPA as a framework for a range of biological tasks, such as capturing gene coexpression modules, condition-specific enrichment, and perturbation-specific gene-gene interactions. Then, we showcase the broad utility of gene representations derived from GSPA, including for cell-cell communication (GSPA-LR), spatial transcriptomics (GSPA-multimodal), and patient response (GSPA-Pt) analysis.