Publications
Development of a defacing algorithm to protect the privacy of head and neck cancer patients in publicly-accessible radiotherapy datasets
Single-cell technologies have transformed our understanding of cellular heterogeneity through multimodal data acquisition. However, robust cell alignment remains a major challenge for data integration and harmonization, including batch correction, label transfer, and multi-omics integration. Many existing methods constrain alignment based on rigid feature-wise distance metrics, limiting their ability to capture accurate cell correspondence across diverse cell populations and conditions. We introduce scGALA, a graph-based learning framework that redefines cell alignment by combining graph attention networks with a score-driven, task-independent optimization strategy. scGALA constructs enriched graphs of cell-cell relationships by integrating gene expression profiles with auxiliary information, such as spatial coordinates, and iteratively refines alignment via self-supervised graph link prediction, where a deep neural network is trained to identify and reinforce high-confidence correspondences across datasets. In extensive benchmarks, scGALA identifies over 25 percent more high-confidence alignments without compromising accuracy. By improving the core step of cell alignment, scGALA serves as a versatile enhancer for a wide range of single-cell data integration tasks.
We introduce a neural approach to dynamical modeling of galaxies that replaces traditional imaging-based deprojections with a differentiable mapping. Specifically, we train a neural network to translate Nuker profile parameters into analytically deprojectable Multi Gaussian Expansion components, enabling physically realistic stellar mass models without requiring optical observations. We integrate this model into SuperMAGE, a differentiable dynamical modelling pipeline for Bayesian inference of supermassive black hole masses. Applied to ALMA data, our approach finds results consistent with state-of-the-art models while extending applicability to dust-obscured and active galaxies where optical data analysis is challenging.
MicroRNAs (miRNAs) are small non-coding RNAs that regulate genes by binding to target messenger RNAs (mRNAs), causing them to degrade or suppressing their translation. Accurate prediction of miRNA–mRNA interactions is crucial for RNA therapeutics. Existing methods rely on handcrafted features, struggle to scale to kilobase-long mRNA sequences, or lack interpretability. We introduce MiRformer, a transformer framework designed to predict not only the binary miRNA–mRNA interaction but also the start and end location of the miRNA binding site in the mRNA sequence. MiRformer employs a dual-transformer encoder architecture to learn interaction patterns directly from raw miRNA-mRNA sequence pairs via the cross-attention between the miRNA-encoder and mRNA-encoder. To scale to long mRNA sequences, we leverage a sliding-window attention mechanism. MiRformer achieves state-of-the-art performance across diverse miRNA–mRNA tasks, including binding prediction, target-site localization, and cleavage-site identification from Degradome sequencing data. The learned transformer attention is highly interpretable and reveals highly contrasting signals for the miRNA seed regions in 500-nt long mRNA sequences. We used MiRformer to simultaneously predict novel binding sites and cleavage sites in 13k miRNA-mRNA pairs and observed that the two types of sites tend to be close to each other, supporting a miRNA-mediated degradation mechanism. Our code is available at https://github.com/li-lab-mcgill/miRformer.
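As a generic illustration of sliding-window attention (not MiRformer's actual implementation, which is in the repository linked above), the sketch below builds a banded mask in which each position attends only to neighbours within a fixed radius, then applies a row-wise softmax over the allowed positions. The function names and the `radius` parameter are hypothetical and chosen for this example only.

```python
import math

def sliding_window_mask(seq_len, radius):
    """Boolean mask: position i may attend to j only when |i - j| <= radius."""
    return [[abs(i - j) <= radius for j in range(seq_len)] for i in range(seq_len)]

def masked_softmax(scores, mask):
    """Row-wise softmax over allowed positions; masked positions get weight 0."""
    weights = []
    for row, allowed in zip(scores, mask):
        kept = [s for s, ok in zip(row, allowed) if ok]
        peak = max(kept)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy example: 6 positions, each attending to itself and one neighbour per side.
mask = sliding_window_mask(seq_len=6, radius=1)
scores = [[0.0] * 6 for _ in range(6)]  # uniform toy attention scores
weights = masked_softmax(scores, mask)
allowed_pairs = sum(sum(row) for row in mask)  # grows linearly with seq_len
```

With a fixed radius, the number of allowed query-key pairs grows linearly with sequence length rather than quadratically, which is the usual motivation for windowed attention on long sequences.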
We introduce a novel framework for upsampled Point Spread Function (PSF) modeling using pixel-level Bayesian inference. Accurate PSF characterization is critical for precision measurements in many fields, including weak lensing, astrometry, and photometry. Our method defines the posterior distribution of the pixelized PSF model through the combination of an analytic Gaussian likelihood and a highly expressive generative diffusion model prior, trained on a library of HST ePSF templates. Compared to traditional methods (parametric Moffat, ePSF template-based, and regularized likelihood), we demonstrate that our PSF models achieve orders of magnitude higher likelihood and residuals consistent with noise, all while remaining visually realistic. Further, the method applies even for faint and heavily masked point sources, merely producing a broader posterior. By recovering a realistic, pixel-level posterior distribution, our technique enables the first meaningful propagation of detailed PSF morphological uncertainty in downstream analysis. An implementation of our posterior sampling procedure is available on GitHub.
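The combination of a Gaussian likelihood with a learned prior can be written out as a short sketch. The notation is introduced here for illustration and is not taken from the paper: $y$ is the observed pixel cutout, $x$ the upsampled PSF model, $A$ an assumed linear operator mapping the model onto the detector grid, $\Sigma$ the noise covariance, and $p_\theta$ the diffusion-model prior.

```latex
\begin{align}
p(x \mid y) &\propto \mathcal{N}\!\left(y \mid A x,\ \Sigma\right)\, p_\theta(x), \\
\nabla_x \log p(x \mid y) &= \nabla_x \log \mathcal{N}\!\left(y \mid A x,\ \Sigma\right) + \nabla_x \log p_\theta(x).
\end{align}
```

The second line follows from taking the logarithm and gradient of the first. It shows why the two ingredients combine naturally: the likelihood term is analytic, and a diffusion model supplies an estimate of the prior score. How the paper handles the noise levels of the diffusion process is not described in the abstract and is left out of this sketch.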
Electronic prospective surveillance models (ePSMs) have the potential to improve the management of cancer-related impairments by systematically screening patients using electronic patient-reported outcomes during and after treatment, and linking them to tailored self-management resources and rehabilitation programs. However, their successful implementation into routine care requires careful consideration of patient and provider needs and must align with clinical workflows, which may vary across settings and require adaptation to the local context. The aim of this paper is to describe the development of REACH, a web-based ePSM designed to remotely screen for physical cancer–related impairments and direct patients to rehabilitation resources based on need. The development of REACH followed an integrated knowledge translation (iKT) approach, engaging key knowledge users including patients, clinicians, administrators, and information technology specialists. The development process involved collaboration across 5 working groups. The system content and logic group selected the impairments to be screened, measures used, frequency of screening, and resources recommended based on results of a survey with oncology providers and researchers, patient feedback, a literature review, and an environmental scan. The machine learning group explored predictive modeling approaches to optimize the assessment frequency using retrospective patient data. The implementation group identified features from existing systems that could be built to promote assessment completion and integration into clinical workflows through a scoping review, interviews with clinic staff, and focus groups with patients. The design group conducted co-design workshops and usability testing with patients to iteratively refine the interface and develop a prototype.
Finally, the software development group converted the prototype to a web-based application and conducted privacy and security assessments and quality assurance. The integration of key knowledge users through an iKT approach played a critical role in determining the design and functionality of REACH. REACH allows patients to remotely complete assessments tailored to their cancer type and treatment status on any electronic device. The system generates automated advice based on the assessment responses, including links to educational resources for self-management, suggestions for community programs to register for, and recommendations to contact their oncology team for further assessment and possible referral to rehabilitation services. These recommended resources are stored in the patient’s personalized library, organized by type and severity of cancer-related impairments reported, and are updated following each new electronic patient-reported outcomes assessment completed. Additional key system features include a patient-driven and structured process for managing high impairment scores, usability enhancements to improve navigation, and safeguards to ensure data security. The development of REACH demonstrates how an iKT approach can be used to design an ePSM that is user-friendly, clinically relevant, and aligned with implementation considerations. The system has been implemented at 4 Canadian cancer centers, and its implementation is being evaluated to inform future refinements.
Sampling multiple outputs from a Large Language Model (LLM) and selecting the most frequent (Self-consistency) or highest-scoring (Best-of-N) candidate is a popular approach to achieve higher accuracy in tasks with discrete final answers. Best-of-N (BoN) selects the output with the highest reward, and with perfect rewards, it often achieves near-perfect accuracy. With imperfect rewards from reward models, however, BoN fails to reliably find the correct answer and its performance degrades drastically. We consider the distribution of BoN's outputs and highlight that, although the correct answer does not usually have a probability close to one under imperfect rewards, it is often the most likely outcome. This suggests that the mode of this distribution can be more reliably correct than a sample from it. Based on this idea, we propose Majority-of-the-Bests (MoB), a novel selection mechanism that estimates the output distribution of BoN via bootstrapping and selects its mode. Experimental results across five benchmarks, three different base LLMs, and two reward models demonstrate consistent improvements over BoN in 25 out of 30 setups. We also provide theoretical results for the consistency of the bootstrapping. MoB serves as a simple, yet strong alternative to BoN and self-consistency, and more broadly, motivates further research in more nuanced selection mechanisms.
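The selection rule described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate answers and rewards are made-up toy values, and the resample size `subset_size` and the number of bootstrap rounds are free parameters of this sketch rather than values from the paper.

```python
import random
from collections import Counter

def best_of_n(candidates):
    """Return the answer of the highest-reward (answer, reward) pair."""
    return max(candidates, key=lambda c: c[1])[0]

def majority_of_the_bests(candidates, subset_size, num_bootstrap, seed=0):
    """Bootstrap the Best-of-N winner and return the most frequent one."""
    rng = random.Random(seed)
    winners = Counter()
    for _ in range(num_bootstrap):
        # Resample with replacement, then run Best-of-N on the resample.
        resample = rng.choices(candidates, k=subset_size)
        winners[best_of_n(resample)] += 1
    return winners.most_common(1)[0][0]

# Toy data (hypothetical): the frequent answer "42" gets moderate rewards,
# while a single wrong answer "41" is over-scored by an imperfect reward model.
candidates = [
    ("42", 0.55), ("42", 0.60), ("42", 0.62), ("42", 0.58), ("42", 0.65),
    ("42", 0.57), ("42", 0.61), ("41", 0.95), ("17", 0.20), ("17", 0.25),
]

bon_answer = best_of_n(candidates)
mob_answer = majority_of_the_bests(candidates, subset_size=4, num_bootstrap=1000)
```

In this toy setup plain Best-of-N returns the over-scored answer, whereas the mode of the bootstrapped winners is the frequent answer, because the outlier appears in only about a third of the resamples. The effect depends on the resample size: with larger resamples the outlier is included more often and wins more often.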
While the beneficial impacts of meditation are increasingly acknowledged, its underlying neural mechanisms remain poorly understood. We examined the electrophysiological brain signals of expert Buddhist monks during two established meditation methods known as Samatha and Vipassana, which employ focused-attention and open-monitoring techniques, respectively. By combining source-space magnetoencephalography with advanced signal processing and machine learning tools, we provide an unprecedented assessment of the role of brain oscillations, complexity, and criticality in meditation. In addition to power spectral density, we computed long-range temporal correlations (LRTC), deviation from criticality coefficient (DCC), Lempel–Ziv complexity, 1/f slope, Higuchi fractal dimension, and spectral entropy. Our findings indicate increased levels of neural signal complexity during both meditation practices compared to the resting state, alongside widespread reductions in gamma-band LRTC and 1/f slope. Importantly, the DCC analysis revealed a separation between Samatha and Vipassana, suggesting that their distinct phenomenological properties are mediated by specific computational characteristics of their dynamic states. Furthermore, in contrast to most previous reports, we observed a decrease in oscillatory gamma power during meditation, a divergence likely due to the correction of the power spectrum by the 1/f slope, which could reduce potential confounds from broadband 1/f activity. We discuss how these results advance our comprehension of the neural processes associated with focused attention and open-monitoring meditation practices.
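To make one of the listed complexity measures concrete, the sketch below computes a Lempel–Ziv-style complexity on toy signals. It is an illustration only: the abstract does not state which Lempel–Ziv variant or binarization was used, so the median threshold and the LZ78-style phrase counting here are assumptions, and the signals are synthetic rather than MEG data.

```python
import math
import random
from statistics import median

def binarize(signal):
    """Map each sample to '1' if it exceeds the signal median, else '0'."""
    threshold = median(signal)
    return "".join("1" if x > threshold else "0" for x in signal)

def lz_phrase_count(bits):
    """Count phrases in an LZ78-style incremental parsing of a binary string."""
    phrases = set()
    current = ""
    for bit in bits:
        current += bit
        if current not in phrases:
            # A phrase not seen before ends here; start a new one.
            phrases.add(current)
            current = ""
    # A non-empty remainder counts as one final (repeated) phrase.
    return len(phrases) + (1 if current else 0)

# Synthetic comparison: a slow sine wave versus Gaussian white noise.
n = 512
rng = random.Random(0)
slow_sine = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
white_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

lz_sine = lz_phrase_count(binarize(slow_sine))
lz_noise = lz_phrase_count(binarize(white_noise))
```

The regular signal parses into fewer distinct phrases than the noise, which is the sense in which a higher count indicates a more complex, less predictable signal.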
The Biotuner Toolbox is an open-source Python toolbox for biosignals that integrates concepts from neuroscience, music theory, and signal processing. It introduces a harmonic perspective on physiological oscillations by applying musical constructs such as consonance, rhythm, and scale construction. The core biotuner_object processes neural, cardiac, and auditory time series, providing a unified interface for extracting spectral peaks, computing harmonicity metrics, and supporting downstream analyses. Companion modules extend harmonic analyses across temporal (time-resolved harmonicity), spatial (harmonic connectivity), and spectral (harmonic spectrum) dimensions. Biotuner identifies harmonic structure across different biosignals, revealing significant variations in harmonicity between physiological states. Specifically, the toolbox extracts spectral peaks from complex signals using multiple algorithms, ensuring robust peak detection under varying signal-to-noise ratios. Moreover, we show how harmonicity metrics change across distinct sleep stages and capture variations in the slopes of the aperiodic (1/f) component of the power spectrum. Biotuner provides an extensible framework that unifies music-theoretic constructs with biosignal processing, enabling hypothesis-driven analyses for researchers and, in parallel, creative exploration of complex natural patterns for artists.