César Miguel Valdez Cordova

Lab Representative
PhD - McGill University
Supervisor
Research Topics
Computational Biology
Deep Learning

Biography

I am enthusiastic about harnessing the lessons of what I believe is our greatest teacher, nature, to advance the field of machine learning. As such, I seek to develop algorithms inspired by the intricate challenges of understanding natural phenomena across scales, from molecules to populations, aiming to push the boundaries of both computational methods and fundamental scientific knowledge. Originally from Mexicali, the vibrant capital of the beautiful peninsula of Baja California, Mexico, my scientific journey has been one of diverse perspectives: from hands-on bench work during my Nanotechnology and Chemistry Engineering days at Tec de Monterrey to my current research role at the interface of the computational and natural sciences, enabled by a Master’s in Computer Science at the Ensenada Center for Scientific Research and Higher Education in Ensenada, Mexico, and a subsequent Master’s in Artificial Intelligence at JKU Linz, Austria. Confident about the potential of Fair AI to benefit everyone, I seek to be a bridge between people and machines. When not trying to get models to work, you might find me exploring the art of Mexican cuisine, delving into postmodernist literature, dabbling in music production, or marveling at and wrangling the magic of fermentation in all its forms and scales.

Publications

scShapeBench: Discovering geometry from high dimensional scRNAseq data
Andrew J. Steindl
João Felipe Rocha
Brian Tshilengi Di Bassinga
Zachary Warren
Shabarni Gupta
Leire Torices
Daniel Neumann
Timothy J. Mann
Ihuan Gunawan
Dhananjay Bhaskar
John G. Lock
Christine L. Chaffer
High-dimensional point cloud data arise across many scientific domains, especially single-cell biology. The shapes or topologies of these datasets determine the types of information that can be extracted. For example, clustered data supports cell-type identification, trajectory structures support transition analysis, and archetypal structures capture continua of cellular behaviors. Existing analysis pipelines often assume a specific shape. The standard Seurat pipeline combines UMAP visualization with Louvain clustering and therefore assumes clustered data, while tools such as Monocle and SPADE assume tree-like structures, and flow-based models such as MIOFlow and Conditional Flow Matching target trajectories. Choosing which pipeline to apply is therefore often left to bioinformaticians who visually inspect datasets before selecting an analysis strategy. With the rise of agentic AI scientists, automating shape detection is increasingly important for selecting downstream analysis pipelines. To address this problem, we introduce scShapeBench, a benchmark dataset for shape detection containing both synthetic and expert-annotated single-cell datasets. Synthetic datasets are sampled from ground-truth skeleton graphs with controlled variance. Real single-cell datasets are curated from diverse sources and annotated by experts into four categories: clusters, single trajectory, multi-branching, and archetypal. We additionally introduce scReebTower, a baseline method that uses diffusion geometry to extract Reeb graphs and connect visualization with pipeline selection. We provide topology-aware evaluation metrics and compare scReebTower against PAGA and Mapper on synthetic and real data. Our results indicate that scReebTower outperforms existing baselines. Overall, our contributions span benchmarks, evaluation metrics, and a baseline for automated shape detection in single-cell data.
Measure Before You Look: Grounding Embeddings Through Manifold Metrics