Gintare Karolina Dziugaite

Continual Learning in Vision-Language Models via Aligned Model Merging

Ghada Sokar

Anurag Arnab

Ahmet Iscen

Pablo Samuel Castro

Cordelia Schmid

Continual learning is conventionally tackled through sequential fine-tuning, a process that, while enabling adaptation, inherently favors plasticity over the stability needed to retain prior knowledge. While existing approaches attempt to mitigate catastrophic forgetting, a bias towards recent tasks persists as they build upon this sequential nature. In this work we present a new perspective based on model merging to maintain stability while still retaining plasticity. Rather than just sequentially updating the model weights, we propose merging newly trained task parameters with previously learned ones, promoting a better balance. To maximize the effectiveness of the merging process, we propose a simple mechanism that promotes learning aligned weights with previous ones, thereby avoiding interference when merging. We evaluate this approach on large Vision-Language Models (VLMs), and demonstrate its effectiveness in reducing forgetting, increasing robustness to various task orders and similarities, and improving generalization.

2025-05-30

arXiv (preprint)

From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization

Shoaib Ahmed Siddiqui

Adrian Weller

David Krueger

M. C. Mozer

Eleni Triantafillou

Recent unlearning methods for LLMs are vulnerable to relearning attacks: knowledge believed-to-be-unlearned re-emerges by fine-tuning on a small set of (even seemingly-unrelated) examples. We study this phenomenon in a controlled setting for example-level unlearning in vision classifiers. We make the surprising discovery that forget-set accuracy can recover from around 50% post-unlearning to nearly 100% with fine-tuning on just the retain set -- i.e., zero examples of the forget set. We observe this effect across a wide variety of unlearning methods, whereas for a model retrained from scratch excluding the forget set (gold standard), the accuracy remains at 50%. We observe that resistance to relearning attacks can be predicted by weight-space properties, specifically,

2025-05-28

arXiv (preprint)

Nazanin Mohammadi Sepahvand

Leveraging Per-Instance Privacy for Machine Unlearning

Anvith Thudi

Berivan Isik

Ashmita Bhattacharyya

Nicolas Papernot

Eleni Triantafillou

Daniel M. Roy

2025-05-01

ICML 2025 (poster)

openreview.net

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization

Phillip Huang Guo

Aaquib Syed

Abhay Sheshadri

Aidan Ewart

2025-05-01

ICML 2025 (poster)

openreview.net

The Size of Teachers as a Measure of Data Complexity: PAC-Bayes Excess Risk Bounds and Scaling Laws

Daniel M. Roy

We study the generalization properties of randomly initialized neural networks, under the assumption that the network is larger than some unknown "teacher" network that achieves low risk. We extend the analysis of Buzaglo et al. (2024) to allow for student networks of arbitrary width and depth, and to the setting where no (small) teacher network perfectly interpolates the data. We obtain an oracle inequality, relating the risk of Gibbs posterior sampling to that of narrow teacher networks. As a result, the sample complexity is once again bounded in terms of the size of narrow teacher networks that themselves achieve small risk. We then introduce a new notion of data complexity, based on the minimal size of a teacher network required to achieve a certain level of excess risk. By comparing the scaling laws resulting from our bounds to those observed in empirical studies, we are able to estimate the data complexity of standard benchmarks according to our measure.

2025-04-23

Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (published)

proceedings.mlr.press

openreview.net

On the Dichotomy Between Privacy and Traceability in ℓp Stochastic Convex Optimization

Sasha Voitovych

Mahdi Haghifam

Idan Attias