Publications

Massive Extremely High-Velocity Outflow in the Quasar J164653.72+243942.2
Paola Rodríguez Hidalgo
Hyunseop 현섭 Choi 최
Patrick B. Hall
Karen M. Leighly
Liliana Flores
Mikel M. Charles
Cora DeFrancesco
J. Hlavacek-Larrondo
Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients
Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in training. We consider a setting in which a subset of edge devices are below a critical memory or communication threshold required to conduct model updates. Under typical federated optimization algorithms, these devices are excluded from training, which renders their data inaccessible and increases system-induced bias. We are inspired by MeZO, a zeroth-order method used for memory-efficient fine-tuning. The increased variance inherent to zeroth-order gradient approximations has relegated previous zeroth-order optimizers exclusively to the domain of fine-tuning, a limitation we seek to correct. We devise a federated, memory-efficient zeroth-order optimizer, ZOWarmUp, that permits zeroth-order training from a random initialization. ZOWarmUp leverages differing client capabilities and careful variance reduction techniques to facilitate participation of under-represented, low-resource clients in model training. Like other federated zeroth-order methods, ZOWarmUp eliminates the need for edge devices to transmit their full gradients to the server and instead relies on only a small set of random seeds, rendering the up-link communication cost negligible. We present experiments using various datasets and model architectures to show that ZOWarmUp is a robust algorithm that can be applied under a wide variety of circumstances. For systems with a high proportion of edge devices that would otherwise be excluded from training, this algorithm provides access to a greater volume and diversity of data, thus improving training outcomes.
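The seed-based up-link described in this abstract can be illustrated with a generic two-point zeroth-order estimator in the style of MeZO. The sketch below is not ZOWarmUp itself: it omits the warm-up phase and the variance reduction the abstract mentions, and every name in it (perturbation, client_step, server_update, quadratic_loss, run_demo) is hypothetical. It only shows how a client might send a (seed, scalar) pair instead of a full gradient, with the server regenerating the perturbation from the seed.

```python
import random


def perturbation(seed, dim):
    """Regenerate the Gaussian direction z from a seed (same seed, same z)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def client_step(theta, loss_fn, seed, eps=1e-3):
    """Two forward passes give a scalar projected-gradient estimate along z."""
    z = perturbation(seed, len(theta))
    plus = [t + eps * zi for t, zi in zip(theta, z)]
    minus = [t - eps * zi for t, zi in zip(theta, z)]
    return (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps)


def server_update(theta, messages, lr):
    """Average the updates implied by the received (seed, scalar) pairs."""
    new_theta = list(theta)
    for seed, g in messages:
        z = perturbation(seed, len(theta))  # rebuilt locally, never transmitted
        for i, zi in enumerate(z):
            new_theta[i] -= lr * g * zi / len(messages)
    return new_theta


def quadratic_loss(theta):
    """Toy objective standing in for a client's training loss (assumption)."""
    return sum(t * t for t in theta)


def run_demo(rounds=200, clients=4, dim=5, lr=0.05):
    """Run a few federated rounds in which each client uploads one pair."""
    theta = [1.0] * dim
    for r in range(rounds):
        messages = []
        for c in range(clients):
            seed = r * clients + c  # any seed scheme shared with the server works
            messages.append((seed, client_step(theta, quadratic_loss, seed)))
        theta = server_update(theta, messages, lr)
    return theta
```

In this sketch each client uploads one integer and one float per round regardless of model size, which is the sense in which the up-link cost becomes negligible; the price is a noisier update than a true gradient would give.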
Behaviour Discovery and Attribution for Explainable Reinforcement Learning
Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks
Howard Dai
Nyambura Njenga
Benjamin Whitsett
Catherine Ma
Darwin Deng
Sara de Ángel
Alexandre Van Tassel
Siddharth Viswanath
Ryan Pellico
Ian Adelstein
Early Deforestation Detection in the Tropics using L-band SAR and Optical multi-sensor data and Bayesian Statistics
Africa I. Flores-Anderson
Josef Kellndorfer
Franz J. Meyer
Pontus Olofsson
Metabolic Control and Frequency of Clinical Monitoring Among Canadian Children With Phenylalanine Hydroxylase Deficiency: A Retrospective Cohort Study
Nataliya Yuskiv
Ammar Saad
Beth K. Potter
Sylvia Stockler‐Ipsiroglu
John J. Mitchell
Steven Hawken
Kylie Tingley
Michael Pugliese
Monica Lamoureux
Andrea J. Chow
Jonathan B. Kronick
Kumanan Wilson
Annette Feigenbaum
Sharan Goobie
Michal Inbar-Feigenberg
Julian Little
Saadet Mercimek‐Andrews
Amy Pender
Chitra Prasad
Andreas Schulze
Gloria Ho
Hilary Vallance
Valerie Austin
Anthony Vandersteen
Andrea C. Yu
Cheryl Rockman‐Greenberg
Aizeddin Mhanni
Pranesh Chakraborty
Relative Trajectory Balance is equivalent to Trust-PCL
Using machine learning to predict the consumption of a Mediterranean diet with untargeted metabolomics data from controlled feeding studies.
Mélina Côté
Didier Brassard
Pier-Luc Plante
Francis Brière
P. Couture
Simone Lemieux
B. Lamarche
A Multimodal and Multi-centric Head and Neck Cancer Dataset for Tumor Segmentation and Outcome Prediction
Numan Saeed
Salma Hassan
Shahad Hardan
Ahmed Aly
Darya Taratynova
Umair Nawaz
Ufaq Khan
Muhammad Ridzuan
Vincent Andrearczyk
Adrien Depeursinge
Mathieu Hatt
Thomas Eugene
Raphael Metz
Mélanie Dore
G. Delpon
V. Papineni
K. Wahid
Cem Dede
A. M. Ali
Carlos Sjogreen
Mohamed A. Naser
Clifton D Fuller
Valentin Oreiller
Mario Jreige
J. Prior
Catherine Cheze Le Rest
Olena Tankyevych
P. Decazes
Su Ruan
Stephanie Tanadini-Lang
Hesham M. Elhalawani
R. Abgral
R. Floch
K. Kerleguer
Ulrike Schick
M. Mauguen
Arman Rahmim
Mohammad Yaqub
We describe a publicly available multimodal dataset of annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies for head and neck cancer research. The dataset includes 1123 FDG-PET/CT studies from patients with histologically confirmed head and neck cancer, acquired from 10 international medical centers. All examinations consisted of co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity across institutions. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following standardized guidelines and quality control measures. We provide anonymized NIfTI files of all studies, along with expert-annotated segmentation masks, radiotherapy dose distribution for a subset of patients, and comprehensive clinical metadata. This metadata includes TNM staging, HPV status, demographics (age and gender), long-term follow-up outcomes, survival times, censoring indicators, and treatment information. We demonstrate how this dataset can be used for three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, providing benchmark results using state-of-the-art deep learning models, including UNet, SegResNet, and multimodal prognostic frameworks.
Scalable Option Learning in High-Throughput Environments
Mikael Henaff
Scott Fujimoto
Hierarchical reinforcement learning (RL) has the potential to enable effective decision-making over long timescales. Existing approaches, while promising, have yet to realize the benefits of large-scale training. In this work, we identify and solve several key challenges in scaling hierarchical RL to high-throughput environments. We propose Scalable Option Learning (SOL), a highly scalable hierarchical RL algorithm which achieves a 25x higher throughput compared to existing hierarchical methods. We train our hierarchical agents using 20 billion frames of experience on the complex game of NetHack, significantly surpassing flat agents and demonstrating positive scaling trends. We also validate our algorithm on MiniHack and MuJoCo environments, showcasing its general applicability. Our code is open sourced at github.com/facebookresearch/sol.
Assessing the exposure of buildings to long-term sea level rise across the Global South
M. Willard-Stepan
N. Gomez
E. D. Galbraith
E. M. Bennett