Danilo Bzdok

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, Department of Biomedical Engineering
Research Topics
Computational Biology
Deep Learning
Large Language Models (LLM)
Natural Language Processing

Biography

Danilo Bzdok is a computer scientist and medical doctor by training with a unique dual background in systems neuroscience and machine learning algorithms. After training at RWTH Aachen University (Germany), Université de Lausanne (Switzerland) and Harvard Medical School, Bzdok completed two doctoral degrees, one in neuroscience at Forschungszentrum Jülich in Germany, and another in computer science (machine learning statistics) at INRIA–Saclay and the Neurospin brain imaging centre in Paris.

Danilo is currently an associate professor at McGill University’s Faculty of Medicine and a Canada CIFAR AI Chair at Mila – Quebec Artificial Intelligence Institute. His interdisciplinary research centres around narrowing knowledge gaps in the brain basis of human-defining types of thinking in order to uncover key computational design principles underlying human intelligence.

Current Students

PhD - McGill University
PhD - McGill University
PhD - McGill University
Collaborating researcher - CentraleSupélec
PhD - McGill University
Collaborating researcher - École Polytechnique Montréal Paris
PhD - McGill University
Postdoctorate - McGill University
Master's Research - McGill University
Independent visiting researcher - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Collaborating researcher - Aix-Marseille Université
PhD - McGill University
PhD - McGill University

Publications

Carriers of LRRK2 pathogenic variants show a milder, anatomically distinct brain signature of Parkinson's disease
Andrew Vo
Qin Tao
Tanya Simuni
Lana M. Chahine
Alain Dagher
Pathogenic LRRK2 gene variants are a major genetic risk factor for both familial and sporadic Parkinson’s disease (PD), opening an unattended window into disease mechanisms and potential therapies. Investigating the influence of pathogenic variants in the LRRK2 gene on brain structure is a crucial step toward enabling early diagnosis and personalized treatment. Yet, despite its significance, the ways in which LRRK2 genotype affects brain structure remain largely unexplored. Work in this domain is plagued by small sample sizes and differences in cohort composition, which can obscure genuine distinctions among clinical subgroups. In this study, we overcome such important limitations by combining explicit modeling of population background variation and pattern matching. Specifically, we leverage a cohort of 603 participants (including 370 with a PD diagnosis) to examine MRI-detectable cortical atrophy patterns associated with the LRRK2 pathogenic variants in people with PD and carriers without Parkinson’s symptoms. LRRK2 PD patients exhibit milder cortical thinning compared to sporadic PD, with notable preservation in temporal and occipital regions, suggesting a distinct pattern of neurodegeneration. Non-manifesting LRRK2 carriers show no significant cortical atrophy, indicating no structural signs of subclinical PD. We further analyze the relationship between aggregated alpha-synuclein in cerebrospinal fluid and atrophy. We find that those with evidence of aggregated alpha-synuclein experienced pronounced neurodegeneration and increased cortical thinning, possibly defining another aggressive PD subtype. Our findings highlight genetic avenues for distinguishing PD subtypes, which could lead to more targeted treatment approaches and a more complete understanding of Parkinson’s disease progression.
Quantifying LLM Attention-Head Stability: Implications for Circuit Universality.
In mechanistic interpretability, recent work scrutinizes transformer "circuits" - sparse, mono- or multi-layer sub-computations that may reflect human-understandable functions. Yet, these network circuits are rarely acid-tested for their stability across different instances of the same deep learning architecture. Without this, it remains unclear whether reported circuits emerge universally across labs or turn out to be idiosyncratic to a particular estimation instance, potentially limiting confidence in safety-critical settings. Here, we systematically study stability across refits in increasingly complex transformer language models of various sizes. We quantify, layer by layer, how similarly attention heads learn representations across independently initialized training runs. Our rigorous experiments show that (1) middle-layer heads are the least stable yet the most representationally distinct; (2) deeper models exhibit stronger mid-depth divergence; (3) unstable heads in deeper layers become more functionally important than their peers from the same layer; (4) applying weight decay optimization substantially improves attention-head stability across random model initializations; and (5) the residual stream is comparatively stable. Our findings establish the cross-instance robustness of circuits as an essential yet underappreciated prerequisite for scalable oversight, drawing contours around possible white-box monitorability of AI systems.
At the Edge of Understanding: Sparse Autoencoders Trace The Limits of Transformer Generalization
Pre-trained transformers have demonstrated remarkable generalization abilities, at times extending beyond the scope of their training data. Yet, real-world deployments often face unexpected or adversarial data that diverges from training data distributions. Without explicit mechanisms for handling such shifts, model reliability and safety degrade, urging more disciplined study of out-of-distribution (OOD) settings for transformers. Through systematic experiments, we present a mechanistic framework for delineating the precise contours of transformer model robustness. We find that OOD inputs, including subtle typos and jailbreak prompts, drive language models to operate on an increased number of fallacious concepts in their internals. We leverage this device to quantify and understand the degree of distributional shift in prompts, enabling a mechanistically grounded fine-tuning strategy to robustify LLMs. Expanding the very notion of OOD from input data to a model’s private computational processes (a new transformer diagnostic at inference time) is a critical step toward making AI systems safe for deployment across science, business, and government.
Causally informed, multifactorial pathways linking cognition and personality to adolescent mental health
Jiadong Yan
Bin Wan
Paule Joanne Toussaint
Judy Chen
Gleb Bezgin
Yasser Iturria-Medina
Alan Evans
Sherif Karama
Adolescence is a sensitive period for the emergence of psychopathology. During this time, physiological changes and environmental exposures jointly shape brain development and influence cognitive and personality maturation, collectively heightening vulnerability to mental disorders. However, the complexity of interactions between these factors has hindered a systems-level understanding of mental health and the causal roles of cognition and personality in psychopathology. In this study, we proposed a multifactorial causal framework integrating brain, pubertal, environmental, and behavioral factors to characterize heterogeneity in adolescent mental health trajectories at the individual level. We then investigated latent causal pathways linking cognition and personality to mental health outcomes and identified potential personalized intervention targets. Leveraging the Adolescent Brain Cognitive Development (ABCD) dataset (N = 4,501), we analyzed 165 behavioral pairs connecting cognition and personality traits to mental health symptoms. Using cross-sectional multivariate mediation and longitudinal interaction-inclusive analyses, we identified 68 behavioral pairs showing significant causal relationships, with brain and environmental exposures contributing to most pathways, while pubertal factors exhibited limited involvement. Individualized interpretive analyses further revealed 23 pairs suggesting potential interventions with response rates exceeding 50%. Among these, behavioral inhibition, negative urgency, and processing speed emerged as the most common intervention targets, whereas psychosis symptoms and attention problems were the most likely issues to improve.
Overall, our study advances a comprehensive framework capturing the multifactorial and heterogeneous nature of adolescent mental health, delineates specific causal pathways from cognitive and personality traits to psychopathology, and provides a principled basis for potential individualized intervention strategies.
Mitochondria‐nucleus crosstalk characterizes Alzheimer's disease across 1.5 million brain cells
Emerging insight from stem cell research reinforces that Alzheimer's disease (AD) affects mitochondrial protein expression. Compelling new evidence points to mitochondrial reactive oxygen species (ROS) as a potential driving player in Aβ toxicity, mediated through glial cells and ultimately impacting neuronal health. A comprehensive understanding of how oxidative phosphorylation variations relate to cell function remains largely unexplored, especially through a cell type lens. Leveraging today's largest single‐nucleus RNA sequencing dataset of AD, we unveil how cell‐type‐specific mitochondrial alterations reverberate in the nuclear transcriptome, in 424 AD patients and healthy controls from ROSMAP. By adopting a supervised latent factor modelling approach, we identified distinct gene modules capturing unique aspects of the mitochondrial crosstalk in 6 major brain cell types across 5,427 nuclear and 13 mitochondrial genes. We found that nuclear‐mitochondrial crosstalk varies distinctly with cell identity, reflecting metabolic demands and functional specialization. In neurons and oligodendrocytes, ATP synthase (complex V) takes a central role, whereas type 1 NADH dehydrogenase (complex I) is more prominent in astrocytes, microglia, and OPCs. Screening across >1 million gene expression profiles from ∼20,000 drug perturbations identified mitochondrial‐nuclear signatures that resemble those activated by parthenolide and niclosamide (two chemical compounds previously associated with oxidative stress and cytotoxicity via ubiquitination) as most predictive of AD. Microglia and OPCs achieved the highest overall classification accuracy, with stronger predictive performance observed in males than in females. Mapping gene module expressions to the Allen Human Brain Atlas revealed shared whole‐brain patterns highlighting the precuneus, which we implicated in ubiquitin‐cascade‐enriched modules.
Clinical phenotyping revealed that males with higher AD risk, as indicated by their mitochondrial‐nuclear scores on glial gene modules, exhibited a greater pathological burden, including higher amyloid load, Parkinson's‐like symptoms, and neuroticism‐related traits. Finally, by comparing our findings with 2.5 million CRISPRi‐based perturbations, we identified neural signatures associated with female‐biased transcription factors and fatty acid biosynthesis, while glial signatures were linked to DNA damage and oxidative stress. By integrating multiple layers of biological data from established reference atlases, our analysis of mitochondria‐nuclear crosstalk revealed distinct transcriptional signatures associated with AD risk in glial and neural cells, with these associations exhibiting sex‐biased patterns.
Latent brain subtypes of chronotype reveal unique behavioral and health profiles across population cohorts
Julie Carrier
Kai-Florian Storch
Robin I. M. Dunbar
Chronotype is shaped by the complex interplay of endogenous and exogenous factors. This time-enduring trait ties into societal behaviors and is linked to psychiatric and metabolic conditions. Despite its multifaceted nature, prior research has treated chronotype as a monolithic trait across the population, risking overlooking substantial heterogeneity in neural and behavioral fingerprints. To uncover hidden subgroups, we develop a supervised pattern-learning framework integrating three complementary brain-imaging modalities with deep behavioral and health profiling from 27,030 UK Biobank participants. We identify five distinct, biologically valid chronotype subtypes. Each demonstrates unique patterns across brain, behavioral and health profiles. External validation in 10,550 US children from the ABCD Study cohort reveals reversed age distributions and replicates sex-associated brain-behavioral patterns, suggesting that potential divergences between chronotype traits observed throughout adulthood may begin to emerge early in life. These findings highlight underappreciated sources of population variation that echo the rhythm of people’s inner clock.
Cell-type specific transcriptional modulation by psilocybin induces sustained plasticity in mouse medial prefrontal cortex
Heike Schuler
Delong Zhou
Vedrana Cvetkovska
Yiu-Chung Tse
Juliet Meccia
Joëlle Lopez
Ashot S. Harutyunyan
Jiannis Ragoussis
Rosemary C. Bagot
Despite enormous interest in psychedelics for psychiatric interventions, potential underlying biological mechanisms remain unclear. Here, we confirm that a single dose of psilocybin increases synaptic transmission in mouse medial prefrontal cortex. Using scRNA-sequencing, we identify cell-type specific mechanisms of sustained neuroplastic effects. We show that, 24h post-psilocybin, expression of plasticity-related genes is increased in excitatory neurons and that transcription in a type of deep layer near-projecting neuron, L5/6 NP, is robustly altered. Analyzing receptor expression patterns reveals that this cell-type specificity does not align with 5-HT2A expression but aligns with 5-HT2C expression patterns. Further, multivariate analyses identify that psilocybin-induced gene expression patterns in L5/6 NP neurons predict 5-HT2C, but not 5-HT2A, transcript levels. Pharmacologic manipulation with a 5-HT2C antagonist attenuates the post-acute sustained effect of psilocybin on synaptic transmission, highlighting 5-HT2C signaling and L5/6 NP neurons as key mediators of psychedelic drug action’s sustained neuroplastic effects in mPFC.
Cognitive cartography of mammalian brains using meta-analysis of AI experts
Andrea I. Luppi
Hana Ali
Zhen-Qi Liu
Filip Milisav
Alessandro Gozzi
Bratislav Misic
The complexity of the brain is increasingly mirrored by the complexity of the neuroscientific literature, yet no individual mind can fully grasp the diversity of scales, methodologies and model organisms. Where human experts flag, the latest AI models excel: large language models can seamlessly integrate knowledge across scientific domains. Here we show how large language models can systematically and quantitatively synthesise literature-wide neuroscientific knowledge about the cognitive operations and dysfunctions associated with each brain region. Meta-analysis of AI experts reveals structure-function mappings to which existing meta-analytic frameworks are blind, demonstrated by lesions and direct intracranial stimulation. It also unlocks the possibility of extending quantitative literature meta-analysis and decoding of brain maps to other model organisms beyond human. As proof of concept, we integrate LLM meta-analysis with species-specific transcriptomics in human, macaque, and mouse, to discover an evolutionarily conserved molecular circuit for cognition. Altogether, meta-analysis of AI experts can fundamentally catalyze neuroscientific discovery by overcoming the barrier of data aggregation from heterogeneous studies, finally bringing together a scattered literature to identify emergent patterns and latent insights across disparate subfields, modalities, and species.
From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
As generative AI systems become competent and democratized in science, business, and government, deeper insight into their failure modes now poses an acute need. The occasional volatility in their behavior, such as the propensity of transformer models to hallucinate, impedes trust and adoption of emerging AI solutions in high-stakes areas. In the present work, we establish how and when hallucinations arise in pre-trained transformer models through concept representations captured by sparse autoencoders, under scenarios with experimentally controlled uncertainty in the input space. Our systematic experiments reveal that the number of semantic concepts used by the transformer model grows as the input information becomes increasingly unstructured. In the face of growing uncertainty in the input space, the transformer model becomes prone to activate coherent yet input-insensitive semantic features, leading to hallucinated output. At its extreme, for pure-noise inputs, we identify a wide variety of robustly triggered and meaningful concepts in the intermediate activations of pre-trained transformer models, whose functional integrity we confirm through targeted steering. We also show that hallucinations in the output of a transformer model can be reliably predicted from the concept patterns embedded in transformer layer activations. This collection of insights on transformer internal processing mechanics has immediate consequences for aligning AI models with human values, AI safety, opening the attack surface for potential adversarial attacks, and providing a basis for automatic quantification of a model's hallucination risk.
Enhancing decision-making in glioblastoma surgery through an explainable human-AI collaboration: an international multicenter model development and external validation study
Julius M. Kernbach
Urte Schroeder
Karlijn Hakvoort
Jonas Ort
Hussam Hamou
Yasin Temel
Pieter Kubben
Charlotte Weyland
Martin Wiesmann
Victor Staartjes
Kevin Akeret
Moira Vieli
Carlo Serra
Luca Regli
Stefan Grau
Lasse Dührsen
Franz Ricklefs
Oliver Schnell
David Ryan Ormond
Alexander Grote
Matthias Simon
Hagen Meredig
Marianne Schell
Martin Bendszus
Georg Neuloh
Hans Clusmann
Dieter-Henrik Heiland
Daniel Delev
Surgical resection improves survival in glioblastoma, yet predicting the extent of resection (EOR) remains highly challenging. We developed and externally validated an explainable AI model to generate personalized EOR estimates in 811 glioblastoma patients undergoing microsurgical resection. EOR was categorized into gross-total (GTR), near-total (NTR), and subtotal resections (STR). An interpretable framework provided model explanations and sensitivity analyses to assess the model’s strengths and limitations. To demonstrate clinical impact, we compared the performance of the human expert (gold standard) with our AI model and a combined human-AI approach. External validation confirmed generalizability (AUC 0.78, CI 0.73-0.82). Class-specific AUCs were 0.75 (0.67-0.82) for GTR, 0.59 (0.50-0.69) for NTR, and 0.69 (0.53-0.85) for STR. Key predictors included KPS and NANO scores, age, tumor volume, and unfavorable anatomical locations. A combined human-AI collaboration outperformed human experts, with higher overall accuracies (0.53 to 0.94), F1 scores (0.30 to 0.92), and Cohen’s κ (0.41 to 0.84). Enhancing predictive performance through the clinician-AI collaboration, our explainable model supports preoperative planning and highlights the value of integrating machine intelligence into surgical decision-making.
Quantitative MRI of the hippocampus reveals microstructural trajectories of aging and Alzheimer’s disease pathology
Alfie Wearn
Christine Tardif
Ilana R. Leppert
Giulia Baracchini
Colleen Hughes
Jennifer Tremblay‐Mercier
John C.S. Breitner
Judes Poirier
Sylvia Villeneuve
Boris C. Bernhardt
Gary R. Turner
R. Nathan Spreng
Sylvia Villeneuve
Judes Poirier
John C.S. Breitner
Sylvain Baillet
Andrée‐Ann Baril
Pierre Bellec
Véronique D. Bohbot
M. Mallar Chakravarty
D. Louis Collins
Mahsa Dadar
Simon Ducharme
Alan C. Evans
Claudine Gauthier
Maiya R. Geddes
Rick Hoge
Yasser Iturria‐Medina
Gerhard Multhaup
Lisa Marie Munter
Alexa Pichet Binette
Natasha Rajah
Pedro Rosa‐Neto
Taylor W. Schmitz
Jean‐Paul Soucy
R. Nathan Spreng
Christine Tardif
Étienne Vachon‐Presseau
Christian Bocti
Maxime Descoteaux
Robert Laforce
Pierre Étienne
Serge Gauthier
Vasavan Nair
Jens C. Pruessner
Daniel Auld
Hippocampal atrophy, typically measured using volumetry, is a hallmark feature of both normal aging and Alzheimer’s disease (AD). However, the earliest stages of atrophy manifest as microstructural changes in tissue composition rather than macroscopic volume loss. We conducted longitudinal in vivo mapping of hippocampal microstructure in healthy aging and incipient AD, highlighting demyelination, iron deposition, and changes in water content as markers of age and AD risk. A combination of macrostructural and microstructural measures provides a more comprehensive picture of brain health and disease, unlocking unique insights into the pathological state of brain tissue and the impact of AD at a point where therapeutic rescue of the tissue is most likely to be efficacious.