Publications

Learning Penalty for Optimal Partitioning via Automatic Feature Extraction

Tung L. Nguyen

Toby Dylan Hocking

Changepoint detection identifies significant shifts in data sequences, making it important in areas like finance, genetics, and healthcare. … (see more)The Optimal Partitioning algorithms efficiently detect these changes, using a penalty parameter to limit the changepoints number. Determining the appropriate value for this penalty can be challenging. Traditionally, this process involved manually extracting statistical features, such as sequence length or variance to make the prediction. This study proposes a novel approach that uses recurrent neural networks to learn this penalty directly from raw sequences by automatically extracting features. Experiments conducted on 20 benchmark genomic datasets show that this novel method surpasses traditional methods in partitioning accuracy in most cases.

2025-05-01

arXiv (published)

Gintare Karolina Dziugaite

Leveraging Per-Instance Privacy for Machine Unlearning

Nazanin Mohammadi Sepahvand

Anvith Thudi

Berivan Isik

Ashmita Bhattacharyya

Nicolas Papernot

Eleni Triantafillou

Daniel M. Roy

2025-05-01

ICML.cc/2025/Conference (poster)

LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs

Pooneh Mousavi

Shubham Gupta

Cem Subakan

Mirco Ravanelli

Foundation models based on large language models (LLMs) have shown great success in handling various tasks and modalities. However, adapting… (see more) these models for general-purpose audio-language tasks is challenging due to differences in acoustic environments and task variations. In this work, we introduce LiSTEN Learning Soft Token Embeddings for Neural Audio LLMs), a framework for adapting LLMs to speech and audio tasks. LiSTEN uses a dynamic prompt selection strategy with learnable key-value pairs, allowing the model to balance general and task-specific knowledge while avoiding overfitting in a multitask setting. Our approach reduces dependence on large-scale ASR or captioning datasets, achieves competitive performance with fewer trainable parameters, and simplifies training by using a single-stage process. Additionally, LiSTEN enhances interpretability by analyzing the diversity and overlap of selected prompts across different tasks.

2025-05-01

arXiv (published)

Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Paul McVay

Sergio Arnaud

Ada Martin

Arjun Majumdar

Krishna Murthy

Phillip Thomas

Ruslan Partsey

Daniel Dugas

Abha Gejji

Alexander Sax

Vincent-Pierre Berges

Mikael Henaff

Ayush Jain

Ang Cao

Ishita Prasad

Mrinal Kalakrishnan

Michael Rabbat

Nicolas Ballas

Mahmoud Assran

Oleksandr Maksymets … (see 2 more)

Aravind Rajeswaran

Franziska Meier

2025-05-01

ICML.cc/2025/Conference (poster)

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

Jiashun Liu

Zihao Wu

Johan Samir Obando Ceron

Pablo Samuel Castro

Aaron Courville

Ling Pan

2025-05-01

arXiv (published)

Gintare Karolina Dziugaite

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization

Phillip Huang Guo

Aaquib Syed

Abhay Sheshadri

Aidan Ewart

2025-05-01

ICML.cc/2025/Conference (poster)

Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning

Ghada Sokar

Pablo Samuel Castro

2025-05-01

arXiv (published)

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

Hongyao Tang

Johan Samir Obando Ceron

Pablo Samuel Castro

Aaron Courville

Glen Berseth

Plasticity, or the ability of an agent to adapt to new tasks, environments, or distributions, is crucial for continual learning. In this pap… (see more)er, we study the loss of plasticity in deep continual RL from the lens of churn: network output variability for out-of-batch data induced by mini-batch training. We demonstrate that (1) the loss of plasticity is accompanied by the exacerbation of churn due to the gradual rank decrease of the Neural Tangent Kernel (NTK) matrix; (2) reducing churn helps prevent rank collapse and adjusts the step size of regular RL gradients adaptively. Moreover, we introduce Continual Churn Approximated Reduction (C-CHAIN) and demonstrate it improves learning performance and outperforms baselines in a diverse range of continual learning environments on OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar benchmarks.

2025-05-01

ICML.cc/2025/Conference (poster)

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Jean-Philippe Corbeil

Amin Dada

Jean-Michel Attendu

Asma Ben Abacha

Alessandro Sordoni

Lucas Caccia

Franccois Beaulieu

Thomas Lin

Jens Kleesiek

Paul Vozila

High computation costs and latency of large language models such as GPT-4 have limited their deployment in clinical settings. Small language… (see more) models (SLMs) offer a cost-effective alternative, but their limited capacity requires biomedical domain adaptation, which remains challenging. An additional bottleneck is the unavailability and high sensitivity of clinical data. To address these challenges, we propose a novel framework for adapting SLMs into high-performing clinical models. We introduce the MediPhi collection of 3.8B-parameter SLMs developed with our novel framework: pre-instruction tuning of experts on relevant medical and clinical corpora (PMC, Medical Guideline, MedWiki, etc.), model merging, and clinical-tasks alignment. To cover most clinical tasks, we extended the CLUE benchmark to CLUE+, doubling its size. Our expert models deliver relative improvements on this benchmark over the base model without any task-specific fine-tuning: 64.3% on medical entities, 49.5% on radiology reports, and 44% on ICD-10 coding (outperforming GPT-4-0125 by 14%). We unify the expert models into MediPhi via model merging, preserving gains across benchmarks. Furthermore, we built the MediFlow collection, a synthetic dataset of 2.5 million high-quality instructions on 14 medical NLP tasks, 98 fine-grained document types, and JSON format support. Alignment of MediPhi using supervised fine-tuning and direct preference optimization achieves further gains of 18.9% on average.

2025-05-01

arXiv (published)

Monitoring morphometric drift in lifelong learning segmentation of the spinal cord

Enamundram Naga Karthik

Sandrine B'edard

Jan Valošek

Christoph Aigner

Elise Bannier

Josef Bednavr'ik

Virginie Callot

Anna Combes

Armin Curt

Gergely David

Falk Eippert

Lynn Farner

M. G. Fehlings

Patrick Freund

Tobias Granberg

Cristina Granziera

Rhscir Network Imaging Group

Ulrike Horn

Tom'avs Hor'ak

Suzanne Humphreys … (see 36 more)

Markus Hupp

Anne Kerbrat

Nawal Kinany

Shannon Kolind

Petr Kudlivcka

Anna Lebret

Lisa Eunyoung Lee

Caterina Mainero

Allan R. Martin

Megan McGrath

Govind Nair

Kristin P. O’Grady

Jiwon Oh

Russell Ouellette

Nikolai Pfender

Dario Pfyffer

P. Pradat

Alexandre Prat

Emanuele Pravatà

D. S. Reich

Ilaria Ricchi

Naama Rotem-Kohavi

Simon Schading-Sassenhausen

Maryam Seif

Andrew C. Smith

Seth Aaron Smith

Grace Sweeney

Roger Tam

Anthony Traboulsee

Constantina A. Treaba

Charidimos Tsagkas

Zachary Vavasour

Dimitri Van De Ville

Kenneth A. Weber

Sarath Chandar

Julien Cohen-Adad

2025-05-01

arXiv (published)

Monte Carlo Tree Diffusion for System 2 Planning

Jaesik Yoon

Hyeonseo Cho

Doojin Baek

Yoshua Bengio

Sungjin Ahn

Diffusion models have recently emerged as a powerful tool for planning. However, unlike Monte Carlo Tree Search (MCTS)-whose performance nat… (see more)urally improves with additional test-time computation (TTC), standard diffusion-based planners offer only limited avenues for TTC scalability. In this paper, we introduce Monte Carlo Tree Diffusion (MCTD), a novel framework that integrates the generative strength of diffusion models with the adaptive search capabilities of MCTS. Our method reconceptualizes denoising as a tree-structured process, allowing partially denoised plans to be iteratively evaluated, pruned, and refined. By selectively expanding promising trajectories while retaining the flexibility to revisit and improve suboptimal branches, MCTD achieves the benefits of MCTS such as controlling exploration-exploitation trade-offs within the diffusion framework. Empirical results on challenging long-horizon tasks show that MCTD outperforms diffusion baselines, yielding higher-quality solutions as TTC increases.

2025-05-01

ICML.cc/2025/Conference (poster)