
Gintare Karolina Dziugaite

Associate Industry Member
Adjunct Professor, McGill University, School of Computer Science
Senior Research Scientist, Google DeepMind
Research Topics
Deep Learning
Information Theory
Machine Learning Theory

Biography

Gintare Karolina Dziugaite is a senior research scientist at Google DeepMind in Toronto, and an adjunct professor at the McGill University School of Computer Science. Prior to joining Google, she led the Trustworthy AI program at Element AI (ServiceNow). Her research combines theoretical and empirical approaches to understanding deep learning.

Dziugaite is well known for her work on network and data sparsity, developing sparsity algorithms and uncovering their effects on generalization and other metrics. She pioneered the study of linear mode connectivity, first connecting it to the existence of lottery tickets, and later to loss landscapes and the mechanism of iterative magnitude pruning. Another major focus of her research is understanding generalization in deep learning and, more broadly, developing information-theoretic methods for studying generalization. Her most recent work looks at removing the influence of selected training data from a model (unlearning).
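Linear mode connectivity asks whether two trained networks can be joined by a straight line in weight space without a rise in loss. The sketch below illustrates the basic check; it is not code from her papers, and `model_a`, `model_b`, and the `evaluate` helper are assumed placeholders.

```python
import copy
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha):
    """Return (1 - alpha) * sd_a + alpha * sd_b for floating-point tensors."""
    return {
        k: (1 - alpha) * sd_a[k] + alpha * sd_b[k]
        if torch.is_floating_point(sd_a[k]) else sd_a[k]
        for k in sd_a
    }

def loss_barrier(model_a, model_b, evaluate, num_points=11):
    """Evaluate loss along the straight line between two trained models.

    `evaluate(model)` is assumed to return a scalar loss on held-out data.
    The barrier is the largest gap between the interpolated loss and the
    linear baseline drawn between the two endpoint losses.
    """
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    probe = copy.deepcopy(model_a)
    alphas = [i / (num_points - 1) for i in range(num_points)]
    losses = []
    for alpha in alphas:
        probe.load_state_dict(interpolate_state_dicts(sd_a, sd_b, alpha))
        losses.append(evaluate(probe))
    baseline = [(1 - a) * losses[0] + a * losses[-1] for a in alphas]
    return max(l - b for l, b in zip(losses, baseline))
```

Two networks are said to be linearly mode connected when this barrier is close to zero, the property tied to lottery tickets and iterative magnitude pruning in the work described above.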

Dziugaite obtained her PhD in machine learning from the University of Cambridge under the supervision of Zoubin Ghahramani. Prior to that, she studied mathematics at the University of Warwick and read Part III in Mathematics at the University of Cambridge, receiving a Master of Advanced Study (MASt) in mathematics. She has participated in a number of long-term programs at the Institute for Advanced Study in Princeton, NJ, and at the Simons Institute for the Theory of Computing at the University of California, Berkeley.

Publications

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
Phillip Huang Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse
Ekansh Sharma
Daniel M. Roy
Unlearning in- vs. out-of-distribution data in LLMs under gradient-based methods
Teodora Băluță
Pascal Lamblin
Danny Tarlow
Fabian Pedregosa
Machine unlearning aims to solve the problem of removing the influence of selected training examples from a learned model. Despite the increasing attention to this problem, it remains an open research question how to evaluate unlearning in large language models (LLMs), and which properties of the data to be unlearned are critical to the quality and efficiency of unlearning. This work formalizes a metric to evaluate unlearning quality in generative models, and uses it to assess the trade-offs between unlearning quality and performance. We demonstrate that unlearning out-of-distribution examples requires more unlearning steps but overall presents a better trade-off. For in-distribution examples, however, we observe a rapid decay in performance as unlearning progresses. We further evaluate how an example's memorization and difficulty affect unlearning under a classical gradient ascent-based approach.
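The "classical gradient ascent-based approach" mentioned in the abstract can be sketched as follows. This is a generic illustration rather than the paper's exact procedure; the model interface, data loader, and hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def gradient_ascent_unlearn(model, forget_loader, num_steps=100, lr=1e-5, device="cpu"):
    """Minimal sketch of gradient-ascent unlearning on a forget set.

    Each step *maximizes* the next-token loss on the examples to be
    unlearned, i.e. it takes a gradient step in the direction opposite to
    ordinary training. All hyperparameters are placeholders.
    """
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    step = 0
    while step < num_steps:
        for input_ids, labels in forget_loader:
            input_ids, labels = input_ids.to(device), labels.to(device)
            logits = model(input_ids)                          # (batch, seq, vocab)
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
            optimizer.zero_grad()
            (-loss).backward()                                 # ascend on the forget loss
            optimizer.step()
            step += 1
            if step >= num_steps:
                break
    return model
```

In practice, as the abstract notes, unlearning quality has to be traded off against performance on retained data, so one would monitor a retain or validation loss alongside the forget loss and stop early.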
Evaluating Interventional Reasoning Capabilities of Large Language Models
Numerous decision-making tasks require estimating causal effects under interventions on different parts of a system. As practitioners consider using large language models (LLMs) to automate decisions, studying their causal reasoning capabilities becomes crucial. A recent line of work evaluates LLMs' ability to retrieve commonsense causal facts, but these evaluations do not sufficiently assess how LLMs reason about interventions. Motivated by the role that interventions play in causal inference, in this paper, we conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from interventions from their ability to memorize facts or find other shortcuts. Our analysis on four LLMs highlights that while GPT-4 models show promising accuracy at predicting the intervention effects, they remain sensitive to distracting factors in the prompts.
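To make the kind of query concrete, the template below shows what an intervention-reasoning question to an LLM might look like. It is only an illustration of the task format, not the benchmark construction used in the paper.

```python
def intervention_prompt(graph_desc, intervention, query_var):
    """Hypothetical template for an intervention-reasoning query to an LLM.

    `graph_desc` describes the data-generating process in words,
    `intervention` states the do-operation, and `query_var` is the variable
    whose post-intervention behavior the model must predict.
    """
    return (
        f"Consider the following data-generating process: {graph_desc}\n"
        f"Now suppose we intervene and {intervention}.\n"
        f"After this intervention, how is {query_var} determined? Answer concisely."
    )

# Example: a confounded graph where Z causes both X and Y.
prompt = intervention_prompt(
    graph_desc="Z causes X, and Z also causes Y. X does not cause Y.",
    intervention="set X to 1 regardless of Z",
    query_var="Y",
)
print(prompt)
```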
Linear Weight Interpolation Leads to Transient Performance Gains
Robust Knowledge Unlearning via Mechanistic Localizations
Phillip Huang Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Mixture of Experts in a Mixture of RL settings
Timon Willi
Johan Samir Obando Ceron
Jakob Nicolaus Foerster
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep Reinforcement Learning (DRL) performance by expanding the network's parameter count while reducing dormant neurons, thereby enhancing the model's learning capacity and ability to deal with non-stationarity. In this work, we shed more light on MoEs' ability to deal with non-stationarity and investigate MoEs in DRL settings with "amplified" non-stationarity via multi-task training, providing further evidence that MoEs improve learning capacity. In contrast to previous work, our multi-task results allow us to better understand the underlying causes for the beneficial effect of MoE in DRL training, the impact of the various MoE components, and insights into how best to incorporate them in actor-critic-based DRL networks. Finally, we also confirm results from previous work.
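For context, a mixture-of-experts layer of the kind discussed above routes each input through several expert sub-networks and combines their outputs via a learned gate. The snippet below is a generic, softly gated illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    """Softly gated mixture-of-experts layer (illustrative only)."""

    def __init__(self, dim_in, dim_out, num_experts=4, hidden=128):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim_in, hidden), nn.ReLU(), nn.Linear(hidden, dim_out))
            for _ in range(num_experts)
        ])
        self.gate = nn.Linear(dim_in, num_experts)

    def forward(self, x):                                         # x: (batch, dim_in)
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, num_experts)
        outputs = torch.stack([e(x) for e in self.experts], 1)    # (batch, num_experts, dim_out)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)       # (batch, dim_out)
```

In an actor-critic agent, a layer like this would typically replace one of the dense hidden layers of the policy or value network; sparse top-k routing is a common alternative to the soft gating shown here.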
Robust Unlearning via Mechanistic Localizations
Phillip Huang Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Methods for machine unlearning in large language models seek to remove undesirable knowledge or capabilities without compromising general language modeling performance. This work investigates the use of mechanistic interpretability to improve the precision and effectiveness of unlearning. We demonstrate that localizing unlearning to components with particular mechanisms in factual recall leads to more robust unlearning across different input/output formats, relearning, and latent knowledge, and reduces unintended side effects compared to non-localized unlearning. Additionally, we analyze the strengths and weaknesses of different automated (rather than manual) interpretability methods for guiding unlearning, finding that their corresponding unlearned models require smaller edit sizes to achieve unlearning but are much less robust.
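The core idea of localization, restricting the unlearning update to components implicated in the target behavior while freezing the rest of the network, can be illustrated as follows. The parameter-name prefixes and learning rate are hypothetical and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def localized_unlearning_step(model, input_ids, labels, target_prefixes, lr=1e-5):
    """One gradient-ascent unlearning step restricted to localized components.

    `target_prefixes` is a hypothetical list of parameter-name prefixes
    (e.g. MLP blocks implicated in factual recall); every other parameter
    is frozen, so only the localized components are edited.
    """
    for name, p in model.named_parameters():
        p.requires_grad_(any(name.startswith(t) for t in target_prefixes))
    logits = model(input_ids)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    (-loss).backward()                        # ascend: increase loss on the forget example
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= lr * p.grad              # placeholder learning rate
                p.grad = None
    return loss.item()
```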
Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition
Eleni Triantafillou
Peter Kairouz
Fabian Pedregosa
Jamie Hayes
Meghdad Kurmanji
Kairan Zhao
Vincent Dumoulin
Julio C. S. Jacques Junior
Jun Wan
Lisheng Sun-Hosoya
Sergio Escalera
Peter Triantafillou
Isabelle Guyon
We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In this paper, we analyze top solutions and delve into discussions on benchmarking unlearning, which itself is a research problem. The evaluation methodology we developed for the competition measures forgetting quality according to a formal notion of unlearning, while incorporating model utility for a holistic evaluation. We analyze the effectiveness of different instantiations of this evaluation framework vis-a-vis the associated compute cost, and discuss implications for standardizing evaluation. We find that the ranking of leading methods remains stable under several variations of this framework, pointing to avenues for reducing the cost of evaluation. Overall, our findings indicate progress in unlearning, with top-performing competition entries surpassing existing algorithms under our evaluation framework. We analyze trade-offs made by different algorithms and strengths or weaknesses in terms of generalizability to new datasets, paving the way for advancing both benchmarking and algorithm development in this important area.
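At a high level, the evaluation described above combines a forgetting-quality measure with model utility. The snippet below is a deliberately simplified stand-in for that idea (the competition's actual metric is a formal, attack-based notion of unlearning), comparing an unlearned model against a retrain-from-scratch reference.

```python
def holistic_unlearning_score(unlearned_forget_acc, retrained_forget_acc,
                              unlearned_test_acc, original_test_acc):
    """Toy illustration: forgetting quality times retained utility.

    Forgetting quality is approximated by how closely the unlearned model's
    behavior on the forget set matches a model retrained without that data;
    utility is test accuracy relative to the original model. This is only a
    simplified stand-in for the competition's formal evaluation.
    """
    forgetting_quality = 1.0 - abs(unlearned_forget_acc - retrained_forget_acc)
    utility = unlearned_test_acc / original_test_acc
    return forgetting_quality * utility
```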
Data Selection for Transfer Unlearning
Nazanin Mohammadi Sepahvand
Vincent Dumoulin
Eleni Triantafillou