Publications

Massive Extremely High-Velocity Outflow in the Quasar J164653.72+243942.2

Paola Rodríguez Hidalgo

Hyunseop 현섭 Choi 최

Patrick B. Hall

Karen M. Leighly

Liliana Flores

Mikel M. Charles

Cora DeFrancesco

J. Hlavacek-Larrondo

Laurence Perreault-Levasseur

2025-09-04

The Astrophysical Journal (published)

doi.org

arxiv.org

Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients

Gwen Legate

Irina Rish

Eugene Belilovsky

Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in training. We consider a setting in which a subset of edge devices are below a critical memory or communication threshold required to conduct model updates. Under typical federated optimization algorithms, these devices are excluded from training, which renders their data inaccessible and increases system-induced bias. We are inspired by MeZO, a zeroth-order method used for memory-efficient fine-tuning. The increased variance inherent to zeroth-order gradient approximations has relegated previous zeroth-order optimizers exclusively to the domain of fine-tuning, a limitation we seek to correct. We devise a federated, memory-efficient zeroth-order optimizer, ZOWarmUp, that permits zeroth-order training from a random initialization. ZOWarmUp leverages differing client capabilities and careful variance reduction techniques to facilitate participation of under-represented, low-resource clients in model training. Like other federated zeroth-order methods, ZOWarmUp eliminates the need for edge devices to transmit their full gradients to the server and instead relies on only a small set of random seeds, rendering the up-link communication cost negligible. We present experiments using various datasets and model architectures to show that ZOWarmUp is a robust algorithm that can be applied under a wide variety of circumstances. For systems with a high proportion of edge devices that would otherwise be excluded from training, this algorithm provides access to a greater volume and diversity of data, thus improving training outcomes.

2025-09-03

ArXiv (preprint)

arxiv.org

Behaviour Discovery and Attribution for Explainable Reinforcement Learning

Rishav

Somjit Nath

Vincent Michalski

Samira Ebrahimi Kahou

2025-09-02

TMLR (accepted)

openreview.net

Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks

Howard Dai

Nyambura Njenga

Benjamin Whitsett

Catherine Ma

Darwin Deng

Sara de Ángel

Alexandre Van Tassel

Siddharth Viswanath

Ryan Pellico

Ian Adelstein

Smita Krishnaswamy

2025-09-02

ArXiv (preprint)

arxiv.org

Early Deforestation Detection in the Tropics using L-band SAR and Optical multi-sensor data and Bayesian Statistics

Africa I. Flores-Anderson

Jeff Cardille

Josef Kellndorfer

Franz J. Meyer

Pontus Olofsson

2025-09-01

International Journal of Applied Earth Observation and Geoinformation (published)

doi.org

Metabolic Control and Frequency of Clinical Monitoring Among Canadian Children With Phenylalanine Hydroxylase Deficiency: A Retrospective Cohort Study

Nataliya Yuskiv

Ammar Saad

Beth K. Potter

Sylvia Stockler‐Ipsiroglu

John J. Mitchell

Steven Hawken

Kylie Tingley

Michael Pugliese

Monica Lamoureux

Andrea J. Chow

Jonathan B. Kronick

Kumanan Wilson

Annette Feigenbaum

Sharan Goobie

Michal Inbar-Feigenberg

Julian Little

Saadet Mercimek‐Andrews

Amy Pender

Chitra Prasad

Andreas Schulze

Yannis Trakadis

Gloria Ho

Hilary Vallance

Valerie Austin

Anthony Vandersteen

Andrea C. Yu

Cheryl Rockman‐Greenberg

Aizeddin Mhanni

Pranesh Chakraborty

2025-09-01

JIMD Reports (published)

doi.org

Relative Trajectory Balance is equivalent to Trust-PCL

2025-09-01

ArXiv (preprint)

arxiv.org

Using machine learning to predict the consumption of a Mediterranean diet with untargeted metabolomics data from controlled feeding studies

Mélina Côté

Didier Brassard

Pier-Luc Plante

Francis Brière

Jacques Corbeil

P. Couture

Simone Lemieux

B. Lamarche

2025-09-01

Nutrition, Metabolism and Cardiovascular Diseases (published)

doi.org

A Multimodal and Multi-centric Head and Neck Cancer Dataset for Tumor Segmentation and Outcome Prediction

Numan Saeed

Salma Hassan

Shahad Hardan

Ahmed Aly

Darya Taratynova

Umair Nawaz

Ufaq Khan

Muhammad Ridzuan

Vincent Andrearczyk

Adrien Depeursinge

Mathieu Hatt

Thomas Eugene

Raphael Metz

Mélanie Dore

G. Delpon

V. Papineni

K. Wahid

Cem Dede

A. M. Ali

Carlos Sjogreen

Mohamed A. Naser

Clifton D Fuller

Valentin Oreiller

Mario Jreige

J. Prior

Catherine Cheze Le Rest

Olena Tankyevych

P. Decazes

Su Ruan

Stephanie Tanadini-Lang

Martin Vallières

Hesham M. Elhalawani

R. Abgral

R. Floch

K. Kerleguer

Ulrike Schick

M. Mauguen

Arman Rahmim

Mohammad Yaqub

We describe a publicly available multimodal dataset of annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies for head and neck cancer research. The dataset includes 1123 FDG-PET/CT studies from patients with histologically confirmed head and neck cancer, acquired from 10 international medical centers. All examinations consisted of co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity across institutions. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following standardized guidelines and quality control measures. We provide anonymized NIfTI files of all studies, along with expert-annotated segmentation masks, radiotherapy dose distribution for a subset of patients, and comprehensive clinical metadata. This metadata includes TNM staging, HPV status, demographics (age and gender), long-term follow-up outcomes, survival times, censoring indicators, and treatment information. We demonstrate how this dataset can be used for three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, providing benchmark results using state-of-the-art deep learning models, including UNet, SegResNet, and multimodal prognostic frameworks.

2025-08-30

ArXiv (preprint)

arxiv.org

Scalable Option Learning in High-Throughput Environments

Mikael Henaff

Scott Fujimoto

Michael Rabbat

Hierarchical reinforcement learning (RL) has the potential to enable effective decision-making over long timescales. Existing approaches, while promising, have yet to realize the benefits of large-scale training. In this work, we identify and solve several key challenges in scaling hierarchical RL to high-throughput environments. We propose Scalable Option Learning (SOL), a highly scalable hierarchical RL algorithm which achieves 25x higher throughput compared to existing hierarchical methods. We train our hierarchical agents using 20 billion frames of experience on the complex game of NetHack, significantly surpassing flat agents and demonstrating positive scaling trends. We also validate our algorithm on MiniHack and MuJoCo environments, showcasing its general applicability. Our code is open sourced at github.com/facebookresearch/sol.

2025-08-30

ArXiv (preprint)

arxiv.org

Assessing the exposure of buildings to long-term sea level rise across the Global South

M. Willard-Stepan

N. Gomez

Jeff Cardille

E. D. Galbraith

E. M. Bennett

2025-08-29

npj Urban Sustainability (published)

doi.org
