Dhanya Sridhar

Alexandre Drouin

Peer Nowack

Jakob Runge

David Rolnick

Scientific research often seeks to understand the causal structure underlying high-level variables in a system. For example, climate scienti… (see more)sts study how phenomena, such as El Ni\~no, affect other climate processes at remote locations across the globe. However, scientists typically collect low-level measurements, such as geographically distributed temperature readings. From these, one needs to learn both a mapping to causally-relevant latent variables, such as a high-level representation of the El Ni\~no phenomenon and other processes, as well as the causal model over them. The challenge is that this task, called causal representation learning, is highly underdetermined from observational data alone, requiring other constraints during learning to resolve the indeterminacies. In this work, we consider a temporal model with a sparsity assumption, namely single-parent decoding: each observed low-level variable is only affected by a single latent variable. Such an assumption is reasonable in many scientific applications that require finding groups of low-level variables, such as extracting regions from geographically gridded measurement data in climate research or capturing brain regions from neural activity data. We demonstrate the identifiability of the resulting model and propose a differentiable method, Causal Discovery with Single-parent Decoding (CDSD), that simultaneously learns the underlying latents and a causal graph over them. We assess the validity of our theoretical results using simulated data and showcase the practical validity of our method in an application to real-world data from the climate science field.

2024-10-09

ArXiv (preprint)

In-Context Learning, Can It Break Safety?

Sophie Xhonneux

David Dobre

Michael Noukhovitch

Jian Tang

Gauthier Gidel

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

Demystifying amortized causal discovery with transformers

Francesco Montagna

Max Cairney-Leeming

Francesco Locatello

Supervised learning approaches for causal discovery from observational data often achieve competitive performance despite seemingly avoiding… (see more) explicit assumptions that traditional methods make for identifiability. In this work, we investigate CSIvA \citep{ke2023learning}, a transformer-based model promising to train on synthetic data and transfer to real data. First, we bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations. Consistent with classical approaches, good performance is achieved when we have a good prior on the test data, and the underlying model is identifiable. At the same time, we find new trade-offs. Training on datasets generated from different classes of causal models, unambiguously identifiable in isolation, improves the test generalization. Performance is still guaranteed, as the ambiguous cases resulting from the mixture of identifiable causal models are unlikely to occur (which we formally prove). Overall, our study finds that amortized causal discovery still needs to obey identifiability theory, but it also differs from classical methods in how the assumptions are formulated, trading more reliance on assumptions on the noise type for fewer hypotheses on the mechanisms.

2024-06-17

ICML.cc/2024/Workshop/SPIGM (poster)

Sparsity regularization via tree-structured environments for disentangled representations

Elliot Layne

Jason Hartford

Mathieu Blanchette

2024-05-30

ArXiv (preprint)

Sparsity regularization via tree-structured environments for disentangled representations

Elliot Layne

Jason Hartford

Mathieu Blanchette

Many causal systems such as biological processes in cells can only be observed indirectly via measurements, such as gene expression. Causal … (see more)representation learning -- the task of correctly mapping low-level observations to latent causal variables -- could advance scientific understanding by enabling inference of latent variables such as pathway activation. In this paper, we develop methods for inferring latent variables from multiple related datasets (environments) and tasks. As a running example, we consider the task of predicting a phenotype from gene expression, where we often collect data from multiple cell types or organisms that are related in known ways. The key insight is that the mapping from latent variables driven by gene expression to the phenotype of interest changes sparsely across closely related environments. To model sparse changes, we introduce Tree-Based Regularization (TBR), an objective that minimizes both prediction error and regularizes closely related environments to learn similar predictors. We prove that under assumptions about the degree of sparse changes, TBR identifies the true latent variables up to some simple transformations. We evaluate the theory empirically with both simulations and ground-truth gene expression data. We find that TBR recovers the latent causal variables better than related methods across these settings, even under settings that violate some assumptions of the theory.

2024-05-30

ArXiv (preprint)

Gintare Karolina Dziugaite

Evaluating Interventional Reasoning Capabilities of Large Language Models

Tejas Kasetty

Divyat Mahajan

Alexandre Drouin

Numerous decision-making tasks require estimating causal effects under interventions on different parts of a system. As practitioners consid… (see more)er using large language models (LLMs) to automate decisions, studying their causal reasoning capabilities becomes crucial. A recent line of work evaluates LLMs ability to retrieve commonsense causal facts, but these evaluations do not sufficiently assess how LLMs reason about interventions. Motivated by the role that interventions play in causal inference, in this paper, we conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts. We evaluate six LLMs on the benchmarks, finding that GPT models show promising accuracy at predicting the intervention effects.

2024-04-08

ArXiv (preprint)

Gintare Karolina Dziugaite

Evaluating Interventional Reasoning Capabilities of Large Language Models

Tejas Kasetty

Divyat Mahajan

Alexandre Drouin

Numerous decision-making tasks require estimating causal effects under interventions on different parts of a system. As practitioners consid… (see more)er using large language models (LLMs) to automate decisions, studying their causal reasoning capabilities becomes crucial. A recent line of work evaluates LLMs ability to retrieve commonsense causal facts, but these evaluations do not sufficiently assess how LLMs reason about interventions. Motivated by the role that interventions play in causal inference, in this paper, we conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts. We evaluate six LLMs on the benchmarks, finding that GPT models show promising accuracy at predicting the intervention effects.

2024-04-08

ArXiv (preprint)

Explicit Knowledge Factorization Meets In-Context Learning: What Do We Gain?

Sarthak Mittal

Eric Elmoznino

Leo Gagnon

Sangnie Bhardwaj

Guillaume Lajoie

2024-03-05

ICLR.cc/2024/Workshop/R2-FM (poster)

In-Context Learning Can Re-learn Forbidden Tasks

Sophie Xhonneux

David Dobre

Jian Tang

Gauthier Gidel

Despite significant investment into safety training, large language models (LLMs) deployed in the real world still suffer from numerous vuln… (see more)erabilities. One perspective on LLM safety training is that it algorithmically forbids the model from answering toxic or harmful queries. To assess the effectiveness of safety training, in this work, we study forbidden tasks, i.e., tasks the model is designed to refuse to answer. Specifically, we investigate whether in-context learning (ICL) can be used to re-learn forbidden tasks despite the explicit fine-tuning of the model to refuse them. We first examine a toy example of refusing sentiment classification to demonstrate the problem. Then, we use ICL on a model fine-tuned to refuse to summarise made-up news articles. Finally, we investigate whether ICL can undo safety training, which could represent a major security risk. For the safety task, we look at Vicuna-7B, Starling-7B, and Llama2-7B. We show that the attack works out-of-the-box on Starling-7B and Vicuna-7B but fails on Llama2-7B. Finally, we propose an ICL attack that uses the chat template tokens like a prompt injection attack to achieve a better attack success rate on Vicuna-7B and Starling-7B. Trigger Warning: the appendix contains LLM-generated text with violence, suicide, and misinformation.

2024-02-08

ArXiv (preprint)

Learning Macro Variables with Auto-encoders

Eric Elmoznino

Maitreyi Swaroop

2023-10-27

NeurIPS.cc/2023/Workshop/CRL (poster)

Adjusting Machine Learning Decisions for Equal Opportunity and Counterfactual Fairness

Yixin Wang

David Blei

Machine learning ( ml ) methods have the potential to automate high-stakes decisions, such as bail admissions or credit lending, by analyzin… (see more)g and learning from historical data. But these algorithmic decisions may be unfair: in learning from historical data, they may replicate discriminatory practices from the past. In this paper, we propose two algorithms that adjust ﬁtted ML predictors to produce decisions that are fair. Our methods provide post-hoc adjustments to the predictors, without requiring that they be retrained. We consider a causal model of the ML decisions, deﬁne fairness through counterfactual decisions within the model, and then form algorithmic decisions that capture the historical data as well as possible, but are provably fair. In particular, we consider two deﬁnitions of fairness. The ﬁrst is “equal counterfactual opportunity,” where the counterfactual distribution of the decision is the same regardless of the protected attribute; the second is counterfactual fairness. We evaluate the algorithms, and the trade-o � between accuracy and fairness, on datasets about admissions, income, credit, and recidivism.

2023-01-01

Trans. Mach. Learn. Res. (published)

Identifiable Deep Generative Models via Sparse Decoding

Gemma Elyse Moran

Yixin Wang

David Blei

We develop the sparse VAE for unsupervised representation learning on high-dimensional data. The sparse VAE learns a set of latent factors … (see more)(representations) which summarize the associations in the observed data features. The underlying model is sparse in that each observed feature (i.e. each dimension of the data) depends on a small subset of the latent factors. As examples, in ratings data each movie is only described by a few genres; in text data each word is only applicable to a few topics; in genomics, each gene is active in only a few biological processes. We prove such sparse deep generative models are identifiable: with infinite data, the true model parameters can be learned. (In contrast, most deep generative models are not identifiable.) We empirically study the sparse VAE with both simulated and real data. We find that it recovers meaningful latent factors and has smaller heldout reconstruction error than related methods.

2022-10-20

TMLR (accepted)