Smita Krishnaswamy

Affiliate Member
Associate Professor, Yale University
Université de Montréal
Yale
Research Topics
Representation Learning
Deep Learning
Geometric Deep Learning
Spectral Learning
Manifold Learning
Computational Biology
Data Geometry
AI in Health
Brain-Computer Interfaces
Generative Models
Molecular Modeling
Computational Neuroscience
Data Sparsity
Graph Neural Networks
Cognitive Science
Data Science
Dynamical Systems
Information Theory

Biography

Our lab develops foundational mathematical machine learning and deep learning methods that incorporate graph-based learning, signal processing, information theory, data geometry and topology, optimal transport, and dynamics modeling. These methods support exploratory analysis, scientific inference, interpretation, and hypothesis generation on large biomedical datasets, ranging from single-cell data to brain imaging to molecular structural datasets, drawn from neuroscience, psychology, stem cell biology, cancer biology, healthcare, and biochemistry. Our work has been instrumental in learning dynamic trajectories from static snapshot data, data denoising, visualization, network inference, molecular structure modeling, and more.

Current Students

Research Collaborator - Yale University
Principal supervisor:

Publications

ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics
Dhananjay Bhaskar
David R. Johnson
João Felipe Rocha
Egbert Castro
Jackson Grady
Alex T. Grigas
Michael Perlmutter
Corey S. O'Hern
Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.
Abstract PR-05: Endocrine beta-cell stress promotes pancreatic ductal adenocarcinoma through endocrine-exocrine cell crosstalk
Cathy C. Garcia
Aarthi Venkat
Daniel C. McQuaid
Sherry Agabiti
Rebecca Cardone
Richard G. Kibbey
Mandar Deepak Muzumdar
For a long time, the pancreas was thought to have separate cellular compartments that functioned distinctly from one another. The endocrine pancreas (islets of Langerhans) regulates glucose homeostasis, while the exocrine pancreas (acini and ducts) produces and secretes digestive enzymes. However, it has recently become clear that the endocrine and exocrine compartments communicate with one another, and dysfunction in one leads to dysfunction in the other, resulting in diabetes or pancreatitis. However, whether and how the endocrine pancreas drives the development of pancreatic ductal adenocarcinoma (PDAC), an exocrine tumor, remains unresolved. Strikingly, we found that genetic ablation of insulin-producing islet beta (β) cells (Akita) in a faithful Kras/Trp53-driven PDAC model (KPC: Kras LSL-G12D/+; Trp53 R172H/+; Pdx1-Cre) suppressed PDAC progression. Conversely, obesity-induced β cell hormone dysregulation promoted Kras-driven PDAC development. Single-cell RNA sequencing (scRNA-seq) analysis of wild-type and obese mice (high-fat diet-fed and leptin-deficient (Lep ob/ob)) revealed increased expression of the peptide hormone cholecystokinin (CCK) in a subset of β cells concordant with increasing obesity, and transgenic β cell overexpression of CCK was sufficient to promote exocrine tumorigenesis in KC mice. Combined in silico (pseudotime (TrajectoryNet) and archetypal (AAnet) analysis) and experimental (CreER) lineage tracing demonstrated that CCK-expressing β cells originated from a pre-existing immature β cell population (virgin β cells). Granger causality analysis of transcriptional networks uncovered a stress-induced JNK-cJun pathway that promotes CCK expression in β cells, which we confirmed using JNK inhibitors in β cell models. Together, our findings identify cellular and molecular mechanisms of β cell adaptation to obesity that contribute to obesity-driven pancreatic cancer. Furthermore, we define a critical role for endocrine-exocrine signaling in PDAC progression and stress-induced β cell pathways which could be leveraged to target the endocrine pancreas to subvert exocrine tumorigenesis.
Citation Format: Cathy Garcia, Aarthi Venkat, Daniel McQuaid, Sherry Agabiti, Alex Tong, Rebecca Cardone, Richard Kibbey, Smita Krishnaswamy, Mandar Muzumdar. Endocrine beta-cell stress promotes pancreatic ductal adenocarcinoma through endocrine-exocrine cell crosstalk [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Advances in Pancreatic Cancer Research; 2024 Sep 15-18; Boston, MA. Philadelphia (PA): AACR; Cancer Res 2024;84(17 Suppl_2):Abstract nr PR-05.
ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
Chen Liu
Ke Xu
Liangbo L. Shen
Jay Stewart
Jay C. Wang
Lucian V. Del Priore
Advances in medical imaging technologies have enabled the collection of longitudinal images, which involve repeated scanning of the same patients over time, to monitor disease progression. However, predictive modeling of such data remains challenging due to high dimensionality, irregular sampling, and data sparsity. To address these issues, we propose ImageFlowNet, a novel model designed to forecast disease trajectories from initial images while preserving spatial details. ImageFlowNet first learns multiscale joint representation spaces across patients and time points, then optimizes deterministic or stochastic flow fields within these spaces using a position-parameterized neural ODE/SDE framework. The model leverages a UNet architecture to create robust multiscale representations and mitigates data scarcity by combining knowledge from all patients. We provide theoretical insights that support our formulation of ODEs, and motivate our regularizations involving high-level visual features, latent space organization, and trajectory smoothness. We validate ImageFlowNet on three longitudinal medical image datasets depicting progression in geographic atrophy, multiple sclerosis, and glioblastoma, demonstrating its ability to effectively forecast disease progression and outperform existing methods. Our contributions include the development of ImageFlowNet, its theoretical underpinnings, and empirical validation on real-world datasets. The official implementation is available at https://github.com/KrishnaswamyLab/ImageFlowNet.
Geometry-Aware Generative Autoencoders for Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Ian Adelstein
Tim G. J. Rudner
Non-linear dimensionality reduction methods have proven successful at learning low-dimensional representations of high-dimensional point clouds on or near data manifolds. However, existing methods are not easily extensible; that is, for large datasets, it is prohibitively expensive to add new points to these embeddings. As a result, it is very difficult to use existing embeddings generatively, to sample new points on and along these manifolds. In this paper, we propose GAGA (geometry-aware generative autoencoders), a framework which merges the power of generative deep learning with non-linear manifold learning by: 1) learning generalizable geometry-aware neural network embeddings based on non-linear dimensionality reduction methods like PHATE and diffusion maps, 2) deriving a non-Euclidean pullback metric on the embedded space to generate points faithfully along manifold geodesics, and 3) learning a flow on the manifold that allows us to transport populations. We provide illustration on easily-interpretable synthetic datasets and showcase results on simulated and real single-cell datasets. In particular, we show that the geodesic-based generation can be especially important for scientific datasets where the manifold represents a state space and geodesics can represent dynamics of entities over this space.
Inferring Metabolic States from Single Cell Transcriptomic Data via Geometric Deep Learning
Holly Steach
Yixuan He
Xitong Zhang
Natalia Ivanova
Matthew Hirn
Michael Perlmutter
Supervised latent factor modeling isolates cell-type-specific transcriptomic modules that underlie Alzheimer’s disease progression
Yasser Iturria-Medina
Jo Anne Stratton
David A. Bennett
Late-onset Alzheimer’s disease (AD) is a progressive neurodegenerative disease, with brain changes beginning years before symptoms surface. AD is characterized by neuronal loss, the classic feature of the disease that underlies brain atrophy. However, GWAS reports and recent single-nucleus RNA sequencing (snRNA-seq) efforts have highlighted that glial cells, particularly microglia, claim a central role in AD pathophysiology. Here, we tailor pattern-learning algorithms to explore distinct gene programs by integrating the entire transcriptome, yielding distributed AD-predictive modules within the brain’s major cell types. We show that these learned modules are biologically meaningful through the identification of new and relevant enriched signaling cascades. The predictive nature of our modules, especially in microglia, allows us to infer each subject’s progression along a disease pseudo-trajectory, confirmed by post-mortem pathological brain tissue markers. Additionally, we quantify the interplay between pairs of cell-type modules in the AD brain, and localize known AD risk genes to enriched module gene programs. Our collective findings advocate for a transition from cell-type specificity to gene-module specificity to unlock the potential of unique gene programs, recasting the roles of recently reported genome-wide AD risk loci. Designing a supervised latent factor framework for snRNA-seq data from the human brain, the authors find distinct Alzheimer’s-predictive gene modules across cell types, suggesting sub-cell-type disease progression trajectories.
Novel cell states arise in embryonic cells devoid of key reprogramming factors
Scott E. Youlten
Liyun Miao
Caroline Hoppe
Curtis W. Boswell
Damir Musaev
Mario Abdelmessih
Valerie A. Tornini
Antonio J. Giraldez
The capacity for embryonic cells to differentiate relies on a large-scale reprogramming of the oocyte and sperm nucleus into a transient totipotent state. In zebrafish, this reprogramming step is achieved by the pioneer factors Nanog, Pou5f3, and Sox19b (NPS). Yet, it remains unclear whether cells lacking this reprogramming step are directed towards wild-type states or towards novel developmental canals in the Waddington landscape of embryonic development. Here we investigate the developmental fate of embryonic cells mutant for NPS by analyzing their single-cell gene expression profiles. We find that cells lacking the first developmental reprogramming steps can acquire distinct cell states. These states are manifested by gene expression modules that result from a failure of nuclear reprogramming, the persistence of the maternal program, and the activation of somatic compensatory programs. As a result, most mutant cells follow new developmental canals and acquire new mixed cell states in development. In contrast, a group of mutant cells acquire primordial germ cell-like states, suggesting that NPS-dependent reprogramming is dispensable for these cell states. Together, these results demonstrate that developmental reprogramming after fertilization is required to differentiate most canonical developmental programs, and loss of the transient totipotent state canalizes embryonic cells into new developmental states in vivo.
AAnet resolves a continuum of spatially-localized cell states to unveil tumor complexity
Aarthi Venkat
Scott E. Youlten
Beatriz P. San Juan
Carley Purcell
Matthew Amodio
Daniel B. Burkhardt
Andrew Benz
Jeff Holst
Cerys McCool
Annelie Mollbrink
Joakim Lundeberg
David van Dijk
Leonard D. Goldstein
Sarah Kummerfeld
Christine L. Chaffer
Identifying functionally important cell states and structure within a heterogeneous tumor remains a significant biological and computational challenge. Moreover, current clustering or trajectory-based computational models are ill-equipped to address the notion that cancer cells reside along a phenotypic continuum. To address this, we present Archetypal Analysis network (AAnet), a neural network that learns key archetypal cell states within a phenotypic continuum of cell states in single-cell data. Applied to single-cell RNA sequencing data from pre-clinical models and a cohort of 34 clinical breast cancers, AAnet identifies archetypes that resolve distinct biological cell states and processes, including cell proliferation, hypoxia, metabolism and immune interactions. Notably, archetypes identified in primary tumors are recapitulated in matched liver, lung and lymph node metastases, demonstrating that a significant component of intratumoral heterogeneity is driven by cell intrinsic properties. Using spatial transcriptomics as orthogonal validation, AAnet-derived archetypes show discrete spatial organization within tumors, supporting their distinct archetypal biology. We further reveal that ligand:receptor cross-talk between cancer and adjacent stromal cells contributes to intra-archetypal biological mimicry. Finally, we use AAnet archetype identifiers to validate GLUT3 as a critical mediator of a hypoxic cell archetype harboring a cancer stem cell population, which we validate in human triple-negative breast cancer specimens. AAnet is a powerful tool to reveal functional cell states within complex samples from multimodal single-cell data.
BLIS-Net: Classifying and Analyzing Signals on Graphs
Charles Xu
Laney Goldman
Valentina Guo
Benjamin Hollander-Bodie
Maedee Trank-Greene
Ian Adelstein
Edward De Brouwer
Rex Ying
Michael Perlmutter
Graph neural networks (GNNs) have emerged as a powerful tool for tasks such as node classification and graph classification. However, much less work has been done on signal classification, where the data consists of many functions (referred to as signals) defined on the vertices of a single graph. These tasks require networks designed differently from those designed for traditional GNN tasks. Indeed, traditional GNNs rely on localized low-pass filters, and signals of interest may have intricate multi-frequency behavior and exhibit long range interactions. This motivates us to introduce the BLIS-Net (Bi-Lipschitz Scattering Net), a novel GNN that builds on the previously introduced geometric scattering transform. Our network is able to capture both local and global signal structure and is able to capture both low-frequency and high-frequency information. We make several crucial changes to the original geometric scattering architecture which we prove increase the ability of our network to capture information about the input signal and show that BLIS-Net achieves superior performance on both synthetic and real-world data sets based on traffic flow and fMRI data.
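As a rough illustration of the geometric scattering transform that this abstract builds on, the sketch below constructs dyadic diffusion wavelets from a lazy random walk on a graph and computes first-order scattering moments of a signal. It follows the commonly described form of the original transform, not the modified BLIS-Net architecture; the function names, the cycle-graph example, and the choice of moments are assumptions made for illustration.

```python
import numpy as np

def lazy_random_walk(A):
    """Lazy random walk matrix P = (I + A D^{-1}) / 2 for adjacency matrix A."""
    deg = A.sum(axis=0)
    # A / deg divides column j by deg[j], i.e. computes A D^{-1}.
    return 0.5 * (np.eye(A.shape[0]) + A / deg)

def diffusion_wavelets(P, J):
    """Return wavelets [I - P, P - P^2, P^2 - P^4, ...] and the low-pass P^(2^J)."""
    n = P.shape[0]
    wavelets = [np.eye(n) - P]
    P_pow = P  # holds P^(2^(j-1)) at the start of iteration j
    for _ in range(1, J + 1):
        P_next = P_pow @ P_pow
        wavelets.append(P_pow - P_next)
        P_pow = P_next
    return wavelets, P_pow

def first_order_scattering(x, wavelets, max_moment=2):
    """Moments sum_v |Psi_j x|(v)^q for each wavelet j and each q = 1..max_moment."""
    feats = [
        [np.sum(np.abs(W @ x) ** q) for q in range(1, max_moment + 1)]
        for W in wavelets
    ]
    return np.array(feats)

# Hypothetical example: a cycle graph on 8 nodes with a spike signal.
n, J = 8, 3
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
P = lazy_random_walk(A)
wavelets, lowpass = diffusion_wavelets(P, J)
x = np.zeros(n)
x[0] = 1.0
features = first_order_scattering(x, wavelets)
```

The wavelets and the final low-pass operator telescope to the identity, so together they retain all information in the signal across scales; each wavelet acts as a band-pass filter at a different diffusion scale.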
Directed Scattering for Knowledge Graph-Based Cellular Signaling Analysis
Aarthi Venkat
Joyce Chew
Ferran Cardoso Rodriguez
Christopher J. Tape
Michael Perlmutter
Directed graphs are a natural model for many phenomena, in particular scientific knowledge graphs such as molecular interaction or chemical reaction networks that define cellular signaling relationships. In these situations, source nodes typically have distinct biophysical properties from sinks. Due to their ordered and unidirectional relationships, many such networks also have hierarchical and multiscale structure. However, the majority of methods performing node- and edge-level tasks in machine learning do not take these properties into account, and thus have not been leveraged effectively for scientific tasks such as cellular signaling network inference. We propose a new framework called Directed Scattering Autoencoder (DSAE) which uses a directed version of a geometric scattering transform, combined with the non-linear dimensionality reduction properties of an autoencoder and the geometric properties of the hyperbolic space to learn latent hierarchies. We show this method outperforms numerous others on tasks such as embedding directed graphs and learning cellular signaling networks.
Bayesian Spectral Graph Denoising with Smoothness Prior
Samuel Leone
Xingzhi Sun
Michael Perlmutter
Here we consider the problem of denoising features associated with complex data, modeled as signals on a graph, via a smoothness prior. This is motivated in part by settings such as single-cell RNA where the data is very high-dimensional, but its structure can be captured via an affinity graph. This allows us to utilize ideas from graph signal processing. In particular, we present algorithms for the cases where the signal is perturbed by Gaussian noise, dropout, and uniformly distributed noise. The signals are assumed to follow a prior distribution defined in the frequency domain which favors signals which are smooth across the edges of the graph. By pairing this prior distribution with our three models of noise generation, we propose Maximum A Posteriori (M.A.P.) estimates of the true signal in the presence of noisy data and provide algorithms for computing the M.A.P. Finally, we demonstrate the algorithms’ ability to effectively restore signals from white noise on image data and from severe dropout in single-cell RNA sequence data.
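To make the Gaussian-noise case concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: the prior is assumed to be proportional to exp(-(lam/2) f^T L f) for a graph Laplacian L, and the noise is assumed to be i.i.d. Gaussian with unit variance. Under those assumptions the M.A.P. estimate minimizes ||f - y||^2 + lam f^T L f and has the closed form (I + lam L)^{-1} y. The function names, the path-graph example, and the value of `lam` are illustrative only; the paper's actual prior and algorithms may differ.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph on n nodes."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def map_denoise_gaussian(y, L, lam):
    """Minimize ||f - y||^2 + lam * f^T L f by solving (I + lam L) f = y."""
    n = L.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, y)

def dirichlet_energy(f, L):
    """Smoothness of f across the graph edges: f^T L f."""
    return float(f @ L @ f)

# Hypothetical example: a smooth signal on a path graph, corrupted by Gaussian noise.
rng = np.random.default_rng(0)
n = 50
L = path_graph_laplacian(n)
clean = np.sin(np.linspace(0.0, np.pi, n))
noisy = clean + 0.3 * rng.standard_normal(n)
denoised = map_denoise_gaussian(noisy, L, lam=5.0)
```

Larger values of `lam` weight the smoothness prior more heavily and produce smoother estimates, while `lam = 0` returns the noisy signal unchanged. The dropout and uniform-noise cases described in the abstract need different likelihoods and are not covered by this closed form.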
Abstract B049: Pancreatic beta cell stress pathways drive pancreatic ductal adenocarcinoma development in obesity
Cathy C. Garcia
Aarthi Venkat
Sherry Agabiti
Lauren Lawres
Rebecca Cardone
Richard G. Kibbey
Mandar Deepak Muzumdar