Vedant Shah

Low Compute Unlearning via Sparse Representations

Frederik Träuble

Ashish Malik

Hugo Larochelle

Michael Curtis Mozer

Sanjeev Arora

Anirudh Goyal

Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove to be costly and infeasible using existing techniques. We propose a low-compute unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the dataset. We evaluate the proposed technique on the problem of class unlearning using four datasets: CIFAR-10, CIFAR-100, LACUNA-100 and ImageNet-1k. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all four datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.

2025-09-15

TMLR (accepted)

General Causal Imputation via Synthetic Interventions

Given two sets of elements (such as cell types and drug compounds), researchers typically only have access to a limited subset of their interactions. The task of causal imputation involves using this subset to predict unobserved interactions. Squires et al. (2022) have proposed two estimators for this task based on the synthetic interventions (SI) estimator: SI-A (for actions) and SI-C (for contexts). We extend their work and introduce a novel causal imputation estimator, generalized synthetic interventions (GSI). We prove the identifiability of this estimator for data generated from a more complex latent factor model. On synthetic and real data we show empirically that it recovers or outperforms their estimators.

2024-10-30

NeurIPS.cc/2024/Workshop/CRL (poster)

General Causal Imputation via Synthetic Interventions

Given two sets of elements (such as cell types and drug compounds), researchers typically only have access to a limited subset of their interactions. The task of causal imputation involves using this subset to predict unobserved interactions. Squires et al. (2022) have proposed two estimators for this task based on the synthetic interventions (SI) estimator: SI-A (for actions) and SI-C (for contexts). We extend their work and introduce a novel causal imputation estimator, generalized synthetic interventions (GSI). We prove the identifiability of this estimator for data generated from a more complex latent factor model. On synthetic and real data we show empirically that it recovers or outperforms their estimators.

2024-10-28

ArXiv (preprint)

AI-Assisted Generation of Difficult Math Questions

Dingli Yu

Kaifeng Lyu

Simon Park

Nan Rosemary Ke

Jiatong Yu

Yinghui He

Michael Curtis Mozer

James Lloyd McClelland

Sanjeev Arora

Anirudh Goyal

Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet demand for diverse and challenging math questions. Relying solely on human experts is both time-consuming and costly, while LLM-generated questions often lack the requisite diversity and difficulty. We present a design framework that combines the strengths of LLMs with a human-in-the-loop approach to generate a diverse array of challenging math questions. We leverage LLM metacognition skills [Didolkar et al., 2024] of a strong LLM to extract core "skills" from existing math datasets. These skills serve as the basis for generating novel and difficult questions by prompting the LLM with random pairs of core skills. The use of two different skills within each question makes finding such questions an "out of distribution" task for both LLMs and humans. Our pipeline employs LLMs to iteratively generate and refine questions and solutions through multiturn prompting. Human annotators then verify and further refine the questions, with their efficiency enhanced via further LLM interactions. Applying this pipeline on skills extracted from the MATH dataset [Hendrycks et al., 2021] resulted in MATH

2024-10-09

NeurIPS.cc/2024/Workshop/MATH-AI (accepted)

Efficient Causal Graph Discovery Using Large Language Models

Yash More

2024-03-05

ICLR.cc/2024/Workshop/AGI (poster)

Towards DNA-Encoded Library Generation with GFlowNets

Michał Koziarski

Mohammed Abukalam

Louis Vaillancourt

Doris Alexandra Schuetz

Moksh J. Jain

Almer M. van der Sloot

Mathieu Bourgey

Anne Marinier

2024-03-04

ICLR.cc/2024/Workshop/GEM (poster)

Efficient Causal Graph Discovery Using Large Language Models

Yash More

We propose a novel framework that leverages LLMs for full causal graph discovery. While previous LLM-based methods have used a pairwise query approach, this requires a quadratic number of queries which quickly becomes impractical for larger causal graphs. In contrast, the proposed framework uses a breadth-first search (BFS) approach which allows it to use only a linear number of queries. We also show that the proposed method can easily incorporate observational data when available, to improve performance. In addition to being more time and data-efficient, the proposed framework achieves state-of-the-art results on real-world causal graphs of varying sizes. The results demonstrate the effectiveness and efficiency of the proposed method in discovering causal relationships, showcasing its potential for broad applicability in causal graph discovery tasks across different domains.

2024-02-02

ArXiv (preprint)

Unlearning via Sparse Representations

Frederik Träuble

Ashish Malik

Hugo Larochelle

Michael Curtis Mozer

Sanjeev Arora

Anirudh Goyal

Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove to be costly and infeasible with existing techniques. We propose a nearly compute-free zero-shot unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the dataset. We evaluate the proposed technique on the problem of class unlearning using three datasets: CIFAR-10, CIFAR-100, and LACUNA-100. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all three datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.

2023-11-26

ArXiv (preprint)

Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning

Dianbo Liu