Publications

Task Robustness via Re-Labelling Vision-Action Robot Data

Cyrus Neary

The recent trend in scaling models for robot learning has resulted in impressive policies that can perform various manipulation tasks and generalize to novel scenarios. However, these policies continue to struggle with following instructions, likely due to the limited linguistic and action sequence diversity in existing robotics datasets. This paper introduces

2025-09-06

robot-learning.org/CoRL/2025/Workshop/Robot_Data (published)

openreview.net

Massive Extremely High-Velocity Outflow in the Quasar J164653.72+243942.2

Paola Rodríguez Hidalgo

Hyunseop 현섭 Choi 최

Patrick B. Hall

Karen M. Leighly

Liliana Flores

Mikel M. Charles

Cora DeFrancesco

J. Hlavacek-Larrondo

Laurence Perreault-Levasseur

2025-09-04

The Astrophysical Journal (published)

doi.org

arxiv.org

Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients

Gwen Legate

Irina Rish

Eugene Belilovsky

Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in training. We consider a setting in which a subset of edge devices are below a critical memory or communication threshold required to conduct model updates. Under typical federated optimization algorithms, these devices are excluded from training, which renders their data inaccessible and increases system-induced bias. We are inspired by MeZO, a zeroth-order method used for memory-efficient fine-tuning. The increased variance inherent to zeroth-order gradient approximations has relegated previous zeroth-order optimizers exclusively to the domain of fine-tuning, a limitation we seek to correct. We devise a federated, memory-efficient zeroth-order optimizer, ZOWarmUp, that permits zeroth-order training from a random initialization. ZOWarmUp leverages differing client capabilities and careful variance reduction techniques to facilitate participation of under-represented, low-resource clients in model training. Like other federated zeroth-order methods, ZOWarmUp eliminates the need for edge devices to transmit their full gradients to the server and instead relies on only a small set of random seeds, rendering the up-link communication cost negligible. We present experiments using various datasets and model architectures to show that ZOWarmUp is a robust algorithm that can be applied under a wide variety of circumstances. For systems with a high proportion of edge devices that would otherwise be excluded from training, this algorithm provides access to a greater volume and diversity of data, thus improving training outcomes.

2025-09-03

ArXiv (preprint)

doi.org

arxiv.org

Behaviour Discovery and Attribution for Explainable Reinforcement Learning

Rishav

Somjit Nath

Vincent Michalski

Samira Ebrahimi Kahou

2025-09-02

TMLR (accepted)

openreview.net

A Graph Laplacian Eigenvector-based Pre-training Method for Graph Neural Networks

Howard Dai

Nyambura Njenga

Hiren Madhu

Siddharth Viswanath

Ryan Pellico

Ian Adelstein

Smita Krishnaswamy

The development of self-supervised graph pre-training methods is a crucial ingredient in recent efforts to design robust graph foundation models (GFMs). Structure-based pre-training methods are under-explored yet crucial for downstream applications which rely on underlying graph structure. In addition, pre-training traditional message passing GNNs to capture global and regional structure is often challenging due to the risk of oversmoothing as network depth increases. We address these gaps by proposing the Laplacian Eigenvector Learning Module (LELM), a novel pre-training module for graph neural networks (GNNs) based on predicting the low-frequency eigenvectors of the graph Laplacian. Moreover, LELM introduces a novel architecture that overcomes oversmoothing, allowing the GNN model to learn long-range interdependencies. Empirically, we show that models pre-trained via our framework outperform baseline models on downstream molecular property prediction tasks.

2025-09-02

ArXiv (preprint)

arxiv.org

Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks

Howard Dai

Nyambura Njenga

Benjamin Whitsett

Catherine Ma

Darwin Deng

Sara de 'Angel

Alexandre Van Tassel

Siddharth Viswanath

Ryan Pellico

Ian Adelstein

Smita Krishnaswamy

2025-09-02

ArXiv (preprint)

arxiv.org

Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses

3D structural Magnetic Resonance Imaging (MRI) brain scans are commonly acquired in clinical settings to monitor a wide range of neurological conditions, including neurodegenerative disorders and stroke. While deep learning models have shown promising results analyzing 3D MRI across a number of brain imaging tasks, most are highly tailored for specific tasks with limited labeled data, and are not able to generalize across tasks and/or populations. The development of self-supervised learning (SSL) has enabled the creation of large medical foundation models that leverage diverse, unlabeled datasets ranging from healthy to diseased data, showing significant success in 2D medical imaging applications. However, even the very few foundation models for 3D brain MRI that have been developed remain limited in resolution, scope, or accessibility. In this work, we present a general, high-resolution SimCLR-based SSL foundation model for 3D brain structural MRI, pre-trained on 18,759 patients (44,958 scans) from 11 publicly available datasets spanning diverse neurological diseases. We compare our model to Masked Autoencoders (MAE), as well as two supervised baselines, on four diverse downstream prediction tasks in both in-distribution and out-of-distribution settings. Our fine-tuned SimCLR model outperforms all other models across all tasks. Notably, our model still achieves superior performance when fine-tuned using only 20% of labeled training samples for predicting Alzheimer's disease. We use publicly available code and data, and release our trained model at https://github.com/emilykaczmarek/3D-Neuro-SimCLR, contributing a broadly applicable and accessible foundation model for clinical brain MRI analysis.

2025-09-01

arXiv (published)

doi.org

DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching

Jessica Ojo

Zina Kamel

David Ifeoluwa Adelani

2025-09-01

arXiv (published)

doi.org

arxiv.org

Early Deforestation Detection in the Tropics using L-band SAR and Optical multi-sensor data and Bayesian Statistics

Africa I. Flores-Anderson

Jeff Cardille

Josef Kellndorfer

Franz J. Meyer

Pontus Olofsson

2025-09-01

International Journal of Applied Earth Observation and Geoinformation (published)

doi.org

RL Fine-Tuning Heals OOD Forgetting in SFT

Sicheng Lyu

Mohammad Hamdaqa

The two-stage fine-tuning paradigm of Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has empirically shown better reasoning performance than one-stage SFT for the post-training of Large Language Models (LLMs). However, the evolution and mechanism behind the synergy of SFT and RL are still under-explored and inconclusive. In our study, we find the well-known claim "SFT memorizes, RL generalizes" is over-simplified, and discover that: (1) OOD performance peaks at the early stage of SFT and then declines (OOD forgetting), and the best SFT checkpoint cannot be captured by training/test loss; (2) the subsequent RL stage does not generate fundamentally better OOD capability; instead, it plays an OOD restoration role, recovering the reasoning ability lost during SFT; (3) the recovery ability has boundaries, i.e., if SFT trains for too short or too long, RL cannot recover the lost OOD ability; (4) to uncover the underlying mechanisms behind the forgetting and restoration process, we employ SVD analysis on parameter matrices, manually edit them, and observe their impacts on model performance. Unlike the common belief that the shift of model capacity mainly results from the changes of singular values, we find that they are actually quite stable throughout fine-tuning. Instead, the OOD behavior strongly correlates with the rotation of singular vectors. Our findings re-identify the roles of SFT and RL in the two-stage fine-tuning and discover the rotation of singular vectors as the key mechanism. Code is available at https://github.com/xiaodanguoguo/RL_Heals_SFT

2025-09-01

arXiv (published)

doi.org
