Publications

scGALA advances graph link prediction-based cell alignment for comprehensive data integration and harmonization

Guo Jiang

Kailu Song

Gregory J. Fonseca

Darcy E. Wagner

Iain C. Clark

Hui Wang

Single-cell technologies have transformed our understanding of cellular heterogeneity through multimodal data acquisition. However, robust cell alignment remains a major challenge for data integration and harmonization, including batch correction, label transfer, and multi-omics integration. Many existing methods constrain alignment based on rigid feature-wise distance metrics, limiting their ability to capture accurate cell correspondence across diverse cell populations and conditions. We introduce scGALA, a graph-based learning framework that redefines cell alignment by combining graph attention networks with a score-driven, task-independent optimization strategy. scGALA constructs enriched graphs of cell-cell relationships by integrating gene expression profiles with auxiliary information, such as spatial coordinates, and iteratively refines alignment via self-supervised graph link prediction, where a deep neural network is trained to identify and reinforce high-confidence correspondences across datasets. In extensive benchmarks, scGALA identifies over 25 percent more high-confidence alignments without compromising accuracy. By improving the core step of cell alignment, scGALA serves as a versatile enhancer for a wide range of single-cell data integration tasks.

2025-11-25

Nature Communications (published)

doi.org

Neural Deprojection of Galaxy Stellar Mass Profiles

M. J. Yantovski-Barth

Hengyue Zhang

Nolan Smyth

Connor Stone

Martin Bureau

Yashar Hezaveh

Laurence Perreault-Levasseur

We introduce a neural approach to dynamical modeling of galaxies that replaces traditional imaging-based deprojections with a differentiable mapping. Specifically, we train a neural network to translate Nuker profile parameters into analytically deprojectable Multi-Gaussian Expansion components, enabling physically realistic stellar mass models without requiring optical observations. We integrate this model into SuperMAGE, a differentiable dynamical modelling pipeline for Bayesian inference of supermassive black hole masses. Applied to ALMA data, our approach finds results consistent with state-of-the-art models while extending applicability to dust-obscured and active galaxies where optical data analysis is challenging.

2025-11-24

arXiv (preprint)

doi.org

arxiv.org

Mind the Information Gap: Unveiling Detailed Morphologies of z ~ 0.5-1.0 Galaxies with SLACS Strong Lenses and Data-Driven Analysis

Ronan Legin

Connor Stone

Alexandre Adam

Gabriel Missael Barco

Adam Coogan

Nikolay Malkin

Laurence Perreault-Levasseur

Yashar Hezaveh

2025-11-23

arXiv (preprint)

arxiv.org

MiRformer: a dual-transformer-encoder framework for predicting microRNA-mRNA interactions from paired sequences

Jiayao Gu

Can (Sam) Chen

Yue Li

MicroRNAs (miRNAs) are small non-coding RNAs that regulate genes by binding to target messenger RNAs (mRNAs), causing them to degrade or suppressing their translation. Accurate prediction of miRNA–mRNA interactions is crucial for RNA therapeutics. Existing methods rely on handcrafted features, struggle to scale to kilobase-long mRNA sequences, or lack interpretability. We introduce MiRformer, a transformer framework designed to predict not only the binary miRNA–mRNA interaction but also the start and end location of the miRNA binding site in the mRNA sequence. MiRformer employs a dual-transformer encoder architecture to learn interaction patterns directly from raw miRNA–mRNA sequence pairs via cross-attention between the miRNA encoder and the mRNA encoder. To scale to long mRNA sequences, we leverage a sliding-window attention mechanism. MiRformer achieves state-of-the-art performance across diverse miRNA–mRNA tasks, including binding prediction, target-site localization, and cleavage-site identification from Degradome sequencing data. The learned transformer attention is highly interpretable and reveals highly contrasting signals for the miRNA seed regions in 500-nt long mRNA sequences. We used MiRformer to simultaneously predict novel binding sites and cleavage sites in 13k miRNA–mRNA pairs and observed that the two types of sites tend to be close to each other, supporting a miRNA-mediated degradation mechanism. Our code is available at https://github.com/li-lab-mcgill/miRformer.

2025-11-23

bioRxiv (preprint)

doi.org

Pixellated Posterior Sampling of Point Spread Functions in Astronomical Images

Gabriel Missael Barco

Laurence Perreault-Levasseur

Yashar Hezaveh

We introduce a novel framework for upsampled Point Spread Function (PSF) modeling using pixel-level Bayesian inference. Accurate PSF characterization is critical for precision measurements in many fields, including weak lensing, astrometry, and photometry. Our method defines the posterior distribution of the pixelized PSF model through the combination of an analytic Gaussian likelihood and a highly expressive generative diffusion model prior, trained on a library of HST ePSF templates. Compared to traditional methods (parametric Moffat, ePSF template-based, and regularized likelihood), we demonstrate that our PSF models achieve orders of magnitude higher likelihood and residuals consistent with noise, all while remaining visually realistic. Further, the method applies even for faint and heavily masked point sources, merely producing a broader posterior. By recovering a realistic, pixel-level posterior distribution, our technique enables the first meaningful propagation of detailed PSF morphological uncertainty in downstream analysis. An implementation of our posterior sampling procedure is available on GitHub.

2025-11-23

arXiv (preprint)

arxiv.org

Publisher Correction: On the compatibility of generative AI and generative linguistics

Eva Portelance

Masoud Jasbi

2025-11-23

Nature Computational Science (published)

doi.org

Use of an Integrated Knowledge Translation Approach to Develop an Electronic Patient-Reported Outcome System for Cancer Rehabilitation: Tutorial

Christian Lopez

Sarah E Neil-Sztramko

Kristin L Campbell

David M Langelier

Tran Truong

Yuliya Gavrylyuk

Pia Nyakairu

Laura Parente

Audrey Durand

Jackie L Bender

Gillian Strudwick

Rupali Bhati

Jonathan Greenland

Tony Reiman

Jennifer M Jones

Electronic prospective surveillance models (ePSMs) have the potential to improve the management of cancer-related impairments by systematically screening patients using electronic patient-reported outcomes during and after treatment, and linking them to tailored self-management resources and rehabilitation programs. However, their successful implementation into routine care requires careful consideration of patient and provider needs and must align with clinical workflows, which may vary across settings and require adaptation to the local context. The aim of this paper is to describe the development of REACH, a web-based ePSM designed to remotely screen for physical cancer–related impairments and direct patients to rehabilitation resources based on need. The development of REACH followed an integrated knowledge translation (iKT) approach, engaging key knowledge users including patients, clinicians, administrators, and information technology specialists. The development process involved collaboration across 5 working groups. The system content and logic group selected the impairments to be screened, measures used, frequency of screening, and resources recommended based on results of a survey with oncology providers and researchers, patient feedback, a literature review, and an environmental scan. The machine learning group explored predictive modeling approaches to optimize the assessment frequency using retrospective patient data. The implementation group identified features from existing systems that could be built to promote assessment completion and integration into clinical workflows through a scoping review, interviews with clinic staff, and focus groups with patients. The design group conducted co-design workshops and usability testing with patients to iteratively refine the interface and develop a prototype. Finally, the software development group converted the prototype to a web-based application and conducted privacy and security assessments and quality assurance. The integration of key knowledge users through an iKT approach played a critical role in determining the design and functionality of REACH. REACH allows patients to remotely complete assessments tailored to their cancer type and treatment status on any electronic device. The system generates automated advice based on the assessment responses, including links to educational resources for self-management, suggestions for community programs to register for, and recommendations to contact their oncology team for further assessment and possible referral to rehabilitation services. These recommended resources are stored in the patient’s personalized library, organized by type and severity of cancer-related impairments reported, and are updated following each new electronic patient-reported outcomes assessment completed. Additional key system features include a patient-driven and structured process for managing high impairment scores, usability enhancements to improve navigation, and safeguards to ensure data security. The development of REACH demonstrates how an iKT approach can be used to design an ePSM that is user-friendly, clinically relevant, and aligned with implementation considerations. The system has been implemented at 4 Canadian cancer centers, and its implementation is being evaluated to inform future refinements.

2025-11-23

JMIR Cancer (published)

doi.org

Majority of the Bests: Improving Best-of-N via Bootstrapping

Amin Rakhsha

Kanika Madan

Tianyu Zhang

Amir-massoud Farahmand

Amir Khasahmadi

Sampling multiple outputs from a Large Language Model (LLM) and selecting the most frequent (Self-consistency) or highest-scoring (Best-of-N) candidate is a popular approach to achieve higher accuracy in tasks with discrete final answers. Best-of-N (BoN) selects the output with the highest reward, and with perfect rewards, it often achieves near-perfect accuracy. With imperfect rewards from reward models, however, BoN fails to reliably find the correct answer and its performance degrades drastically. We consider the distribution of BoN's outputs and highlight that, although the correct answer does not usually have a probability close to one under imperfect rewards, it is often the most likely outcome. This suggests that the mode of this distribution can be more reliably correct than a sample from it. Based on this idea, we propose Majority-of-the-Bests (MoB), a novel selection mechanism that estimates the output distribution of BoN via bootstrapping and selects its mode. Experimental results across five benchmarks, three different base LLMs, and two reward models demonstrate consistent improvements over BoN in 25 out of 30 setups. We also provide theoretical results for the consistency of the bootstrapping. MoB serves as a simple, yet strong alternative to BoN and self-consistency, and more broadly, motivates further research in more nuanced selection mechanisms.
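
The selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of bootstrap replicates (`num_bootstrap`), and the resample size (`subsample_size`) are assumptions introduced here for illustration, and the rewards are made-up numbers. Each replicate resamples the scored outputs with replacement, applies Best-of-N to the resample, and the final answer is the most frequent winner.

```python
import random
from collections import Counter


def best_of_n(answers, rewards):
    """Plain Best-of-N: return the answer of the highest-reward sample."""
    best = max(range(len(answers)), key=lambda i: rewards[i])
    return answers[best]


def majority_of_the_bests(answers, rewards, subsample_size,
                          num_bootstrap=500, seed=0):
    """Illustrative bootstrap estimate of the mode of BoN's output.

    subsample_size and num_bootstrap are hypothetical parameters,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    n = len(answers)
    winners = Counter()
    for _ in range(num_bootstrap):
        # Resample indices with replacement, then run BoN on the resample.
        idx = [rng.randrange(n) for _ in range(subsample_size)]
        best = max(idx, key=lambda i: rewards[i])
        winners[answers[best]] += 1
    # The mode of the bootstrapped BoN outputs is the selected answer.
    return winners.most_common(1)[0][0]


# Toy data (made up): seven samples agree on "42", while one sample with a
# different answer receives the highest (imperfect) reward.
answers = ["42"] * 7 + ["17"]
rewards = [0.61, 0.58, 0.66, 0.70, 0.63, 0.59, 0.64, 0.95]

bon_choice = best_of_n(answers, rewards)
mob_choice = majority_of_the_bests(answers, rewards, subsample_size=3)
```

In this toy setting, BoN follows the single over-rewarded sample, whereas the bootstrapped mode favors the answer that wins most resamples, because the outlier appears in only a minority of size-3 resamples.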

2025-11-22

arXiv (preprint)

doi.org

openreview.net

Meditation induces shifts in neural oscillations, brain complexity, and critical dynamics: novel insights from MEG

Annalisa Pascarella

Philipp Thölke

David Meunier

Jordan O'Byrne

Tarek Lajnef

Antonino Raffone

Roberto Guidotti

Vittorio Pizzella

Laura Marzetti

Karim Jerbi

While the beneficial impacts of meditation are increasingly acknowledged, its underlying neural mechanisms remain poorly understood. We examined the electrophysiological brain signals of expert Buddhist monks during two established meditation methods known as Samatha and Vipassana, which employ focused attention and open-monitoring techniques. By combining source-space magnetoencephalography with advanced signal processing and machine learning tools, we provide an unprecedented assessment of the role of brain oscillations, complexity, and criticality in meditation. In addition to power spectral density, we computed long-range temporal correlations (LRTC), deviation from criticality coefficient (DCC), Lempel–Ziv complexity, 1/f slope, Higuchi fractal dimension, and spectral entropy. Our findings indicate increased levels of neural signal complexity during both meditation practices compared to the resting state, alongside widespread reductions in gamma-band LRTC and 1/f slope. Importantly, the DCC analysis revealed a separation between Samatha and Vipassana, suggesting that their distinct phenomenological properties are mediated by specific computational characteristics of their dynamic states. Furthermore, in contrast to most previous reports, we observed a decrease in oscillatory gamma power during meditation, a divergence likely due to the correction of the power spectrum by the 1/f slope, which could reduce potential confounds from broadband 1/f activity. We discuss how these results advance our comprehension of the neural processes associated with focused attention and open-monitoring meditation practices.

2025-11-22

Neuroscience of Consciousness (published)

doi.org

Biotuner: A python toolbox integrating music theory and signal processing for harmonic analysis of physiological and natural time series

Antoine Bellemare-Pepin

Karim Jerbi

The Biotuner Toolbox is an open-source Python toolbox for biosignals that integrates concepts from neuroscience, music theory, and signal processing. It introduces a harmonic perspective on physiological oscillations by applying musical constructs such as consonance, rhythm, and scale construction. The core biotuner_object processes neural, cardiac, and auditory time series, providing a unified interface for extracting spectral peaks, computing harmonicity metrics, and supporting downstream analyses. Companion modules extend harmonic analyses across temporal (time-resolved harmonicity), spatial (harmonic connectivity), and spectral (harmonic spectrum) dimensions. Biotuner identifies harmonic structure across different biosignals, revealing significant variations in harmonicity between physiological states. Specifically, the toolbox extracts spectral peaks from complex signals using multiple algorithms, ensuring robust peak detection under varying signal-to-noise ratios. Moreover, we show how harmonicity metrics change across distinct sleep stages and capture variations in the slopes of the aperiodic (1/f) component of the power spectrum. Biotuner provides an extensible framework that unifies music-theoretic constructs with biosignal processing, enabling hypothesis-driven analyses for researchers and, in parallel, creative exploration of complex natural patterns for artists.

2025-11-20

Brain Informatics (published)

doi.org

Leveraging a Fully Differentiable Integrated Assessment Model for RL and Inference

Koen Ponse

Kai-Hendrik Cohrs

Phillip Wozny

Andrew Robert Williams

Tianyu Zhang

Erman Acar

Yoshua Bengio

Aske Plaat

Thomas M. Moerland

Pierre Gentine

Gustau Camps-Valls

2025-11-20

EurIPS.cc/2025/Workshop/DiffSys (published)

openreview.net

Determinants of pleiotropy and monotonic gene dosage responses across human traits

Sayeh Kazem

Kuldeep Kumar

Guillaume Huguet

Josephine Mollon

Thomas Renne

Laura M. Schultz

Emma E.M. Knowles

Worrawat Engchuan

Omar Shanta

Bhooma Thiruvahindrapuram

Jeffrey R. MacDonald

Celia M. T. Greenwood

Stephen W. Scherer

Laura Almasy

Jonathan Sebat

David C. Glahn

Guillaume Dumas

Sébastien Jacquemont

While pleiotropic effects of gene dosage are of particular relevance for comorbidities observed in the developmental pediatric and psychiatric clinic, the biological processes underlying such pleiotropy remain unknown. We developed a new functional burden analysis (FunBurd) to investigate all CNVs, genome-wide, beyond well-studied recurrent CNVs. In ~500,000 UK-Biobank participants, we tested the association between 43 traits and CNVs disrupting 172 tissue or cell-type gene-sets. CNVs affected all traits. Pleiotropy was correlated with genetic constraint and was higher in the brain compared to non-brain functions, even after normalizing for genetic constraint. The levels of pleiotropy, measured by burden correlation, were similar in deletions and loss-of-function SNVs and higher compared to common variants and duplications. Gene sets under high genetic constraint showed less monotonic gene dosage responses across traits. Even in the absence of a monotonic response, we observed a negative correlation between deletion and duplication effect sizes across most traits. Overall, functional gene sets are preferentially associated with a given trait when either deleted or duplicated, but rarely both.

2025-11-18

Research Square (preprint)

doi.org
