Smita Krishnaswamy

Affiliate Member
Associate Professor, Yale University
Université de Montréal
Yale
Research Topics
AI in Health
Brain-computer Interfaces
Cognitive Science
Computational Biology
Computational Neuroscience
Data Geometry
Data Science
Data Sparsity
Deep Learning
Dynamical Systems
Generative Models
Geometric Deep Learning
Graph Neural Networks
Information Theory
Manifold Learning
Molecular Modeling
Representation Learning
Spectral Learning

Biography

Our lab develops foundational mathematical machine learning and deep learning methods that incorporate graph-based learning, signal processing, information theory, data geometry and topology, optimal transport, and dynamics modeling. These methods support exploratory analysis, scientific inference, interpretation, and hypothesis generation on large biomedical datasets, ranging from single-cell data to brain imaging to molecular structural datasets, arising from neuroscience, psychology, stem cell biology, cancer biology, healthcare, and biochemistry. Our work has been instrumental in dynamic trajectory learning from static snapshot data, data denoising, visualization, network inference, molecular structure modeling, and more.

Current Students

Collaborating researcher - Yale University
Principal supervisor :

Publications

HypRAG: Hyperbolic Dense Retrieval for Retrieval Augmented Generation
Hiren Madhu
Ngoc Bui
Ali Maatouk
Leandros Tassiulas
Menglin Yang
Sukanta Ganguly
Kiran Srinivasan
Rex Ying
Embedding geometry plays a fundamental role in retrieval quality, yet dense retrievers for retrieval-augmented generation (RAG) remain largely confined to Euclidean space. However, natural language exhibits hierarchical structure from broad topics to specific entities that Euclidean embeddings fail to preserve, causing semantically distant documents to appear spuriously similar and increasing hallucination risk. To address these limitations, we introduce hyperbolic dense retrieval, developing two model variants in the Lorentz model of hyperbolic space: HyTE-FH, a fully hyperbolic transformer, and HyTE-H, a hybrid architecture projecting pre-trained Euclidean embeddings into hyperbolic space. To prevent representational collapse during sequence aggregation, we introduce the Outward Einstein Midpoint, a geometry-aware pooling operator that provably preserves hierarchical structure. On MTEB, HyTE-FH outperforms equivalent Euclidean baselines, while on RAGBench, HyTE-H achieves up to 29% gains over Euclidean baselines in context relevance and answer relevance using substantially smaller models than current state-of-the-art retrievers. Our analysis also reveals that hyperbolic representations encode document specificity through norm-based separation, with over 20% radial increase from general to specific concepts, a property absent in Euclidean embeddings, underscoring the critical role of geometric inductive bias in faithful RAG systems.
Dispersion Loss Counteracts Embedding Condensation and Improves Generalization in Small Language Models
Chen Liu
Xingzhi Sun
Xi Xiao
Alexandre Van Tassel
Ke Xu
Kristof Reimann
Danqi Liao
Mark B. Gerstein
Tianyang Wang
Xiao Wang
Large language models (LLMs) achieve remarkable performance through ever-increasing parameter counts, but scaling incurs steep computational costs. To better understand LLM scaling, we study representational differences between LLMs and their smaller counterparts, with the goal of replicating the representational qualities of larger models in the smaller models. We observe a geometric phenomenon which we term
Self-Supervised Visual Prompting for Cross-Domain Road Damage Detection
Xi Xiao
Zhuxuanzi Wang
Mingqiao Mo
Chen Liu
Chenrui Ma
Yanshu Li
Xiao Wang
Tianyang Wang
The deployment of automated pavement defect detection is often hindered by poor cross-domain generalization. Supervised detectors achieve strong in-domain accuracy but require costly re-annotation for new environments, while standard self-supervised methods capture generic features and remain vulnerable to domain shift. We propose a self-supervised framework that visually probes target domains without labels. The framework introduces a Self-supervised Prompt Enhancement Module (SPEM), which derives defect-aware prompts from unlabeled target data to guide a frozen ViT backbone, and a Domain-Aware Prompt Alignment (DAPA) objective, which aligns prompt-conditioned source and target representations. Experiments on four challenging benchmarks show that the framework consistently outperforms strong supervised, self-supervised, and adaptation baselines, achieving robust zero-shot transfer, improved resilience to domain variations, and high data efficiency in few-shot adaptation. These results highlight self-supervised prompting as a practical direction for building scalable and adaptive visual inspection systems. Source code is publicly available: https://github.com/xixiaouab/PROBE/tree/main
Graph topological property recovery with heat and wave dynamics-based features on graphs
Dhananjay Bhaskar
Yanlei Zhang
Charles Xu
Xingzhi Sun
Oluwadamilola Fasina
Maximilian Nickel
Michael Perlmutter
CTR-LoRA: Curvature-Aware and Trust-Region Guided Low-Rank Adaptation for Large Language Models
Zhuxuanzi Wang
Mingqiao Mo
Xi Xiao
Chen Liu
Chenrui Ma
Yunbei Zhang
Xiao Wang
Tianyang Wang
Parameter-efficient fine-tuning (PEFT) has become the standard approach for adapting large language models under limited compute and memory budgets. Although previous methods improve efficiency through low-rank updates, quantization, or heuristic budget reallocation, they often decouple the allocation of capacity from the way updates evolve during training. In this work, we introduce CTR-LoRA, a framework guided by curvature trust region that integrates rank scheduling with stability-aware optimization. CTR-LoRA allocates parameters based on marginal utility derived from lightweight second-order proxies and constrains updates using a Fisher/Hessian-metric trust region. Experiments on multiple open-source backbones (7B-13B), evaluated on both in-distribution and out-of-distribution benchmarks, show consistent improvements over strong PEFT baselines. In addition to increased accuracy, CTR-LoRA enhances training stability, reduces memory requirements, and achieves higher throughput, positioning it on the Pareto frontier of performance and efficiency. These results highlight a principled path toward more robust and deployable PEFT.
Equivariant Geometric Scattering Networks via Vector Diffusion Wavelets
David R. Johnson
Rishabh Anand
Michael Perlmutter
We introduce a novel version of the geometric scattering transform for geometric graphs containing scalar and vector node features. This new scattering transform has desirable symmetries with respect to rigid-body roto-translations (i.e.,
HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data
Hiren Madhu
João Felipe Rocha
Tinglin Huang
Rex Ying
A Graph Laplacian Eigenvector-based Pre-training Method for Graph Neural Networks
Howard Dai
Nyambura Njenga
Hiren Madhu
Ryan Pellico
Ian Adelstein
The development of self-supervised graph pre-training methods is a crucial ingredient in recent efforts to design robust graph foundation models (GFMs). Structure-based pre-training methods are under-explored yet crucial for downstream applications which rely on underlying graph structure. In addition, pre-training traditional message passing GNNs to capture global and regional structure is often challenging due to the risk of oversmoothing as network depth increases. We address these gaps by proposing the Laplacian Eigenvector Learning Module (LELM), a novel pre-training module for graph neural networks (GNNs) based on predicting the low-frequency eigenvectors of the graph Laplacian. Moreover, LELM introduces a novel architecture that overcomes oversmoothing, allowing the GNN model to learn long-range interdependencies. Empirically, we show that models pre-trained via our framework outperform baseline models on downstream molecular property prediction tasks.
Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks
Howard Dai
Nyambura Njenga
Benjamin Whitsett
Catherine Ma
Darwin Deng
Sara de Ángel
Alexandre Van Tassel
Ryan Pellico
Ian Adelstein
Low-dimensional embeddings of high-dimensional data
Cyril de Bodt
Alex Diaz-Papkovich
Michael Bleher
Kerstin Bunte
Corinna Coupette
Sebastian Damrich
Fred A. Hamprecht
Emőke-Ágnes Horvát
Dhruv Kohli
John A. Lee
Boudewijn P. F. Lelieveldt
Leland McInnes
Ian T. Nabney
Maximilian Noichl
Pavlin G. Poličar
Bastian Rieck
Gal Mishne … (see 1 more)
Dmitry Kobak
Large collections of high-dimensional data have become nearly ubiquitous across many academic fields and application domains, ranging from biology to the humanities. Since working directly with high-dimensional data poses challenges, the demand for algorithms that create low-dimensional representations, or embeddings, for data visualization, exploration, and analysis is now greater than ever. In recent years, numerous embedding algorithms have been developed, and their usage has become widespread in research and industry. This surge of interest has resulted in a large and fragmented research field that faces technical challenges alongside fundamental debates, and it has left practitioners without clear guidance on how to effectively employ existing methods. Aiming to increase coherence and facilitate future work, in this review we provide a detailed and critical overview of recent developments, derive a list of best practices for creating and using low-dimensional embeddings, evaluate popular approaches on a variety of datasets, and discuss the remaining challenges and open problems in the field.