Publications

Quantized Disentanglement: A Practical Approach

Vitória Barin-Pacela

Kartik Ahuja

Simon Lacoste-Julien

P Vincent

2025-06-08

ICML.cc/2025/Workshop/SIM (poster)

openreview.net

Revisiting the Goldilocks Zone in Inhomogeneous Networks

Zacharie Garnier Cuchet

A. Chandar

Ekaterina Lobacheva

We investigate how architectural inhomogeneities (such as biases, layer normalization, and residual connections) affect the curvature of the loss landscape at initialization and its link to trainability. We focus on the Goldilocks zone, a region in parameter space with excess positive curvature, previously associated with improved optimization in homogeneous networks. To extend this analysis, we compare two scaling strategies: weight scaling and softmax temperature scaling. Our results show that in networks with biases or residual connections, both strategies identify a Goldilocks zone aligned with better training. In contrast, layer normalization leads to lower or negative curvature yet stable optimization, revealing a disconnect between curvature and trainability. Softmax temperature scaling behaves more consistently across models, making it a more robust probe. Overall, the Goldilocks zone remains relevant in inhomogeneous networks, but its geometry and predictive power depend on architectural choices, particularly normalization.

2025-06-08

ICML.cc/2025/Workshop/HiLD (poster)

openreview.net

Spaced Scheduling for Large Language Model Training

Amine El hattami

Nicolas Chapados

Christopher Pal

2025-06-08

TMLR (accepted)

openreview.net

TGM: A Modular Framework for Machine Learning on Temporal Graphs

Michael M. Bronstein

Matthias Fey

While deep learning on static graphs has been revolutionized by standardized libraries like PyTorch Geometric and DGL, machine learning on Temporal Graphs (TG), networks that evolve over time, lacks comparable software infrastructure. Existing TG libraries are limited in scope, focusing on a single method category or specific algorithms. To address this gap, we introduce Temporal Graph Modelling (TGM), a comprehensive framework for machine learning on temporal graphs. Through a modular architecture, TGM is the first library to support both discrete and continuous-time TG methods, and it implements a wide range of them. The TGM framework combines an intuitive front-end API with an optimized backend storage, enabling reproducible research and efficient experimentation at scale. Key features include graph-level optimizations for offline training and built-in performance profiling capabilities. In extensive benchmarks on five real-world networks, TGM is up to 6 times faster than the widely used DyGLib library on TGN and TGAT models and up to 8 times faster than the UTG framework for converting edges into coarse-grained snapshots.

2025-06-08

ICML.cc/2025/Workshop/CODEML (published)

openreview.net

Towards Fair In-Context Learning with Tabular Foundation Models

Patrik Joslin Kenfack

S Ebrahimi Kahou

Ulrich Matchi Aïvodji

Tabular foundation models have shown promising in-context learning (ICL) capabilities on structured data by using training examples as context without further parameter adjustments. This emerging approach positions itself as a competitive alternative to traditional gradient-boosted tree methods. However, while biases in conventional machine learning models are well documented, it remains unclear how these biases manifest in Tabular ICL. The paper investigates the fairness implications of Tabular ICL and explores three preprocessing strategies (correlation removal, group-balanced demonstration selection, and uncertainty-based demonstration selection) to address bias. Comprehensive experiments indicate that uncertainty-based demonstration selection consistently enhances group fairness in the predictions. The source code for reproducing the results of this work can be found at https://anonymous.4open.science/r/Fair-TabICL-DD84.

2025-06-08

ICML.cc/2025/Workshop/FMSD (published)

doi.org

openreview.net

Two-point deterministic equivalence for SGD in random feature models

Alexander Atanasov

Blake Bordelon

Jacob A Zavatone-Veth

Courtney Paquette

Cengiz Pehlevan

2025-06-08

ICML.cc/2025/Workshop/HiLD (poster)

openreview.net

Ultrasound and MRI-based evaluation of relationships between morphological and mechanical properties of the lower lumbar multifidus muscle in chronic low back pain

Neda Naghdi

Sara Masi

Cleo Bertrand

Brent Rosenstein

Julien Cohen-Adad

Hassan Rivaz

Mathieu Roy

Maryse Fortin

While lumbar multifidus (MF) muscle alterations are linked to low back pain (LBP), the structure-function relationship is not fully understood. This study aims to evaluate the relationship between fatty degeneration of the lumbar MF muscle and its function in individuals with and without LBP. The study included 25 participants with chronic nonspecific LBP and 25 age- and sex-matched healthy controls. Participants underwent MRI assessment for MF fat infiltration, utilizing IDEAL fat-water images. Ultrasound measures evaluated MF function, including shear-wave elastography (SWE) for stiffness/elasticity and thickness ratio from rest to submaximal contraction. All measurements were acquired at L4/L5 and L5/S1 spinal levels, bilaterally. Bivariate and multivariable linear regression models were used to assess the relationship between morphology and function, with age, sex, body mass index (BMI), physical activity levels, and LBP status considered as covariates. Fifty participants (26 females) were included (mean age: 39.22 ± 11.67). Greater % MF fat at L4/L5 was significantly associated with greater MF SWE ratio (p = 0.002). No significant bivariate or multivariable relationships were found between MF fat infiltration and MF thickness ratio. Participants with LBP exhibited lower contraction ratios (p = 0.017) and higher SWE during contraction (p = 0.03) at L4/L5 compared to controls. This study highlights a positive association between MF fat infiltration and SWE-based stiffness measures at L4/L5, suggesting altered muscle composition may impact MF function. However, no relationship was found between MF fat infiltration and contraction. Participants with LBP demonstrated distinct deficits in muscle activation, supporting the need for targeted rehabilitation strategies addressing these functional impairments.

2025-06-08

European Spine Journal (published)

doi.org

Multi-Priority Scheduling for Traffic Management in Future Scalable Payloads

Zineb Garroussi

Olfa Ben Yahia

Brunilde Sansò

Jean-François Frigon

Stéphane Martel

Guillaume Mantelet

Antoine Lesage-Landry

Gunes Karabulut Kurt

Through multibeam, frequency reuse, and advanced antenna technology, regenerative non-geostationary orbit (NGSO) extremely high-throughput satellites (EHTS) are expected to play a key role in future communications, delivering data rates up to terabits per second. This paper investigates a novel architecture for future regenerative and scalable payloads to satisfy users’ demands for varying quality of service (QoS). This architecture is designed based on multiple modem banks and requires a new flow assignment strategy to efficiently route traffic within the satellite. We propose a multi-commodity path flow optimization problem to manage the load with varying QoS requirements across multiple modems within an NGSO high-throughput satellite (HTS) system and beyond. The simulation results demonstrate that the proposed model consistently maintains low delays and packet losses for the highest-priority traffic and outperforms the classical first-in, first-out (FIFO) approach.

2025-06-07

2025 IEEE International Conference on Communications Workshops (ICC Workshops) (published)

doi.org

Silent Sabotage: Injecting Backdoors into AI Agents Through Fine-Tuning

Léo Boisvert

Abhay Puri

Chandra Kiran Reddy Evuru

Joshua Kazdan

Avinandan Bose

Quentin Cappart

Maryam Fazel

Sai Rajeswar

Jason Stanley

Nicolas Chapados

Alexandre Drouin

Krishnamurthy Dj Dvijotham

The rise of AI agents that can use tools, browse the web and interact with computers on behalf of a user has sparked strong interest in improving these capabilities by explicitly fine-tuning the LLMs/VLMs that power these agents. Several researchers have proposed collecting data by letting the agents interact with their environment (e.g., a computer operating system, the web or a collection of APIs exposed as tools), and improving agent performance by fine-tuning on this data. In this work, we show that such data collection can be manipulated by adversaries to insert poisoned traces. By modifying just 5% of collected traces, adversaries can embed stealthy bad behaviors into agents, such as leaking confidential user information whenever the tool or webpage exposes a trigger. Our results raise important security concerns in the development of AI agents, and underscore the importance of careful scrutiny of all data collection processes used to improve agentic AI.

2025-06-07

ICML.cc/2025/Workshop/WCUA (poster)

openreview.net

A Self-Supervised Foundation Model for Robust and Generalizable Representation Learning in STED Microscopy

Anthony Bilodeau

Frédéric Beaupré

Julia Chabbert

Kamylle Thériault

Andréanne Deschênes

Jean-Michel Bellavance

Koraly Lessard

Renaud Bernatchez

Paul De Koninck

Christian Gagné

Flavie Lavoie-Cardinal

Foundation Models (FMs) have dramatically increased the potential and power of deep learning algorithms through general capacities over a variety of tasks. The performance increase they offer is obtained without elaborate, specific training for domains such as natural language processing and computer vision. However, their application in specialized fields like biomedical imaging and fluorescence microscopy remains difficult due to distribution shifts and the scarcity of high-quality annotated datasets. The high cost of data acquisition and the requirement for in-domain expertise further exacerbate this challenge in microscopy. To address this, we introduce STED-FM, a foundation model specifically designed for super-resolution STimulated Emission Depletion (STED) microscopy. STED-FM leverages a Vision Transformer architecture trained at scale with Masked Autoencoding on a new dataset of nearly one million STED images. STED-FM learns expressive latent representations without requiring extensive annotations, yielding robust performance across diverse downstream microscopy image analysis tasks. Unsupervised experiments demonstrate the discriminative structure of its learned latent space. These representations can be leveraged for multiple downstream applications, including fully supervised classification and segmentation with reduced annotation requirements. Moreover, STED-FM representations enhance the performance of deep learning–based image denoising and improve the quality of images generated by diffusion models, enabling latent attribute manipulation for the data-driven discovery of subtle nanostructures and phenotypes, as well as algorithmic super-resolution. In addition, its powerful structure retrieval capabilities are integrated into automated STED microscopy acquisition pipelines, paving the way for smart microscopy. In sum, we demonstrate that STED-FM lays a robust foundation for state-of-the-art algorithms across a wide array of tasks, establishing it as a highly valuable and scalable resource for researchers in super-resolution microscopy.

2025-06-05

bioRxiv (preprint)

doi.org

Use of Artificial Intelligence in Adolescents' Mental Health Care: Systematic Scoping Review of Current Applications and Future Directions

Gauri Sharma

Mark J Yaffe

Pooria Ghadiri

Rushali Gandhi

Laura Pinkham

Genevieve Gore

Samira Abbasgholizadeh-Rahimi

Given the increasing prevalence of mental health problems among adolescents, early intervention and appropriate management are needed to decrease mortality and morbidity. Artificial intelligence’s (AI) potential contributions, although significant in the field of medicine, have not been adequately studied in the context of adolescents’ mental health. This review aimed to identify AI interventions that have been tested, implemented, or both, for use in adolescents’ mental health care. We used the Arksey and O’Malley framework, further refined by Levac et al., along with the Joanna Briggs Institute methodology, to guide this scoping review. We searched 5 electronic databases from the inception date through July 2024 (inclusive). Four independent reviewers screened the titles and abstracts, read the full texts, and extracted data using a validated data extraction form. Disagreements were resolved by consensus, and if this was not possible, the opinion of a fifth reviewer was sought. We evaluated the risk of bias (ROB) for prognosis and diagnosis-related studies using the Prediction Model Risk of Bias Assessment Tool. We followed the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) checklist for reporting. Of the papers screened, 88 papers relevant to our eligibility criteria were identified. Among the included papers, AI was most commonly used for diagnosis (n=78), followed by monitoring and evaluation (n=19), treatment (n=10), and prognosis (n=6). As some studies addressed multiple applications, categories are not mutually exclusive. For diagnosis, studies primarily addressed suicidal behaviors (n=11) and autism spectrum disorder (n=7). Machine learning was the most frequently reported AI method across all application areas. The overall ROB for diagnostic and prognostic models was predominantly unclear (58%), while 20% of studies had a high ROB and 22% were assessed as low risk. In our review, we found that AI is being applied across various areas of adolescent mental health care, spanning diagnosis, treatment planning, symptom monitoring, and prognosis. Interestingly, most studies to date have concentrated heavily on diagnostic tools, leaving other important aspects of care relatively underexplored. This presents a key opportunity for future research to broaden the scope of AI applications beyond diagnosis. Moreover, future studies should emphasize the meaningful and active involvement of end users in the design, development, and validation of AI interventions, alongside improved transparency in reporting AI models, data handling, and analytical processes to build trust and support safe clinical implementation.

2025-06-05

JMIR Mental Health (published)

doi.org
