
Jack Stanley

PhD - McGill University
Supervisor
Research Topics
Computational Biology
Deep Learning
Medical Machine Learning
Representation Learning

Publications

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
As generative AI systems become competent and democratized in science, business, and government, deeper insight into their failure modes now poses an acute need. The occasional volatility in their behavior, such as the propensity of transformer models to hallucinate, impedes trust and adoption of emerging AI solutions in high-stakes areas. In the present work, we establish how and when hallucinations arise in pre-trained transformer models through concept representations captured by sparse autoencoders, under scenarios with experimentally controlled uncertainty in the input space. Our systematic experiments reveal that the number of semantic concepts used by the transformer model grows as the input information becomes increasingly unstructured. In the face of growing uncertainty in the input space, the transformer model becomes prone to activate coherent yet input-insensitive semantic features, leading to hallucinated output. At its extreme, for pure-noise inputs, we identify a wide variety of robustly triggered and meaningful concepts in the intermediate activations of pre-trained transformer models, whose functional integrity we confirm through targeted steering. We also show that hallucinations in the output of a transformer model can be reliably predicted from the concept patterns embedded in transformer layer activations. This collection of insights on transformer internal processing mechanics has immediate consequences for aligning AI models with human values, AI safety, opening the attack surface for potential adversarial attacks, and providing a basis for automatic quantification of a model's hallucination risk.
Mitochondria‐nucleus crosstalk characterizes Alzheimer's disease across 1.5 million brain cells
Emerging insight from stem cell research reinforces that Alzheimer's disease (AD) affects mitochondrial protein expression. Compelling new evidence points to mitochondrial reactive oxygen species (ROS) as a potential driving player in Aβ toxicity, mediated through glial cells and ultimately impacting neuronal health. A comprehensive understanding of how oxidative phosphorylation variations relate to cell function remains largely unexplored, especially through a cell type lens. Leveraging today's largest single‐nucleus RNA sequencing dataset of AD, we unveil how cell‐type‐specific mitochondrial alterations reverberate in the nuclear transcriptome, in 424 AD patients and healthy controls from ROSMAP. By adopting a supervised latent factor modelling approach, we identified distinct gene modules capturing unique aspects of the mitochondrial crosstalk in 6 major brain cell types across 5,427 nuclear and 13 mitochondrial genes. We found that nuclear‐mitochondrial crosstalk varies distinctly with cell identity, reflecting metabolic demands and functional specialization. In neurons and oligodendrocytes, ATP synthase (complex V) takes a central role, whereas type 1 NADH dehydrogenase (complex I) is more prominent in astrocytes, microglia, and OPCs. Screening across >1 million gene expression profiles from ∼20,000 drug perturbations identified mitochondrial‐nuclear signatures that resemble those activated by parthenolide and niclosamide—two chemical compounds previously associated with oxidative stress and cytotoxicity via ubiquitination—as most predictive of AD. Microglia and OPCs achieved the highest overall classification accuracy, with stronger predictive performance observed in males than in females. Mapping gene module expressions to the Allen Human Brain Atlas revealed shared whole‐brain patterns highlighting the precuneus, which we implicated in ubiquitin‐cascade‐enriched modules.
Clinical phenotyping revealed that males with higher AD risk, as indicated by their mitochondrial‐nuclear scores on glial gene modules, exhibited a greater pathological burden, including higher amyloid load, Parkinson's‐like symptoms, and neuroticism‐related traits. Finally, by comparing our findings with 2.5 million CRISPRi‐based perturbations, we identified neural signatures associated with female‐biased transcription factors and fatty acid biosynthesis, while glial signatures were linked to DNA damage and oxidative stress. By integrating multiple layers of biological data from established reference atlases, our analysis of mitochondria‐nuclear crosstalk revealed distinct transcriptional signatures associated with AD risk in glial and neural cells, with these associations exhibiting sex‐biased patterns.
Recovering undersampled single-cell transcriptomes with HyperCell
Single-cell transcriptomic technology has now matured, allowing quantification of mRNA transcripts corresponding to tens of thousands of genes within a cell. However, still only a small fraction of these mRNA is captured and measured by today’s single-cell assays. There are likely hundreds of thousands of mRNA copies present within a typical human cell, yet these assays omit a majority of the transcripts that are actually present. This introduces technical noise, especially non-biological variability and excessive sparsity, which frustrates downstream analysis and potentially skews biological conclusions. To overcome these challenges, we here develop HyperCell, a probabilistic deep learning approach that explicitly models this undersampling to produce estimates of each cell’s original gene transcript abundances across the whole transcriptome. We demonstrate that our framework offers benefits in various mRNA modeling settings, by i) correctly differentiating between spurious sampling-induced and real biological zeros, outperforming existing approaches, ii) estimating the total mRNA content of cells across states to reduce contamination due to background transcripts, iii) reducing contamination due to background transcripts, and iv) helping to counteract biases that may appear during typical differential gene expression analyses using widespread normalization approaches. Our approach to correcting for the technical noise introduced by the single-cell experimental process brings us closer to studying biology, starting from the true transcriptome of cells.

Large language models deconstruct the clinical intuition behind diagnosing autism
Emmett Rabot
Laurent Mottron
Cell type transcriptomics reveal shared genetic mechanisms in Alzheimer’s and Parkinson’s disease
Edward A. Fon
Alain Dagher
Yasser Iturria-Medina
Jo Anne Stratton
David A Bennett
Historically, Alzheimer’s disease (AD) and Parkinson’s disease (PD) have been investigated as two distinct disorders of the brain. However, a few similarities in neuropathology and clinical symptoms have been documented over the years. Traditional single gene-centric genetic studies, including GWAS and differential gene expression analyses, have struggled to unravel the molecular links between AD and PD. To address this, we tailor a pattern-learning framework to analyze synchronous gene co-expression at sub-cell-type resolution. Utilizing recently published single-nucleus AD (70,634 nuclei) and PD (340,902 nuclei) datasets from postmortem human brains, we systematically extract and juxtapose disease-critical gene modules. Our findings reveal extensive molecular similarities between AD and PD gene cliques. In neurons, disrupted cytoskeletal dynamics and mitochondrial stress highlight convergence in key processes; glial modules share roles in T-cell activation, myelin synthesis, and synapse pruning. This multi-module sub-cell-type approach offers insights into the molecular basis of shared neuropathology in AD and PD.