Stefan Bauer

Towards Causal Representation Learning

Bernhard Schölkopf

Francesco Locatello

Nan Rosemary Ke

Nal Kalchbrenner

The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. In the present paper, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

2021-02-21

ArXiv (preprint)

CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning

Manuel Wuthrich

Bernhard Schölkopf

Despite recent successes of reinforcement learning (RL), it remains a challenge for agents to transfer learned skills to related environments. To facilitate research addressing this problem, we propose CausalWorld, a benchmark for causal structure and transfer learning in a robotic manipulation environment. The environment is a simulation of an open-source robotic platform, hence offering the possibility of sim-to-real transfer. Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures. The key strength of CausalWorld is that it provides a combinatorial family of such tasks with common causal structure and underlying factors (including, e.g., robot and object masses, colors, sizes). The user (or the agent) may intervene on all causal variables, which allows for fine-grained control over how similar different tasks (or task distributions) are. One can thus easily define training and evaluation distributions of a desired difficulty level, targeting a specific form of generalization (e.g., only changes in appearance or object mass). Further, this common parametrization facilitates defining curricula by interpolating between an initial and a target task. While users may define their own task distributions, we present eight meaningful distributions as concrete benchmarks, ranging from simple to very challenging, all of which require long-horizon planning as well as precise low-level motor control. Finally, we provide baseline results for a subset of these tasks on distinct training curricula and corresponding evaluation protocols, verifying the feasibility of the tasks in this benchmark.

2021-01-11

ICLR.cc/2021/Conference (poster)

Spatially Structured Recurrent Modules

Nasim Rahaman

Muhammad Waleed Gondal

Manuel Wuthrich

Yash Sharma

Bernhard Schölkopf

Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalise well and are robust to changes in the input distribution. While methods that harness spatial and temporal structures find broad application, recent work has demonstrated the potential of models that leverage sparse and modular structure using an ensemble of sparingly interacting modules. In this work, we take a step towards dynamic models that are capable of simultaneously exploiting both modular and spatiotemporal structures. To this end, we model the dynamical system as a collection of autonomous but sparsely interacting sub-systems that interact according to a learned topology which is informed by the spatial structure of the underlying system. This gives rise to a class of models that are well suited for capturing the dynamics of systems that only offer local views into their state, along with corresponding spatial locations of those views. On the tasks of video prediction from cropped frames and multi-agent world modelling from partial observations in the challenging Starcraft2 domain, we find our models to be more robust to the number of available views and better capable of generalisation to novel tasks without additional training than strong baselines that perform equally well or better on the training distribution.

2021-01-11

ICLR.cc/2021/Conference (poster)

Toward Causal Representation Learning

Bernhard Schölkopf

Francesco Locatello

Nan Rosemary Ke

Nal Kalchbrenner

The two fields of machine learning and graphical causality arose and are developed separately. However, there is, now, cross-pollination and increasing interest in both fields to benefit from the advances of the other. In this article, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, that is, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

2020-12-31

Proceedings of the IEEE (published)

S2RMs: Spatially Structured Recurrent Modules

Nasim Rahaman

Muhammad Waleed Gondal

Manuel Wuthrich

Y. Sharma

Bernhard Schölkopf

2020-07-12

ArXiv (preprint)

Learning Neural Causal Models from Unknown Interventions

Nan Rosemary Ke

Christopher Pal

Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained from observational data alone. Interventional data provides much richer information about the underlying data-generating process. However, the extension and application of methods designed for observational data to include interventions is not straightforward and remains an open problem. In this paper we provide a general framework based on continuous optimization and neural networks to create models for the combination of observational and interventional data. The proposed method is even applicable in the challenging and realistic case that the identity of the intervened upon variable is unknown. We examine the proposed method in the setting of graph recovery both de novo and from a partially-known edge set. We establish strong benchmark results on several structure learning tasks, including structure recovery of both synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2019-09-24

ArXiv (preprint)

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Spyridon Bakas

Mauricio Reyes

Andras Jakab

Stefan Bauer

Markus Rempfler

Alessandro Crimi

Russell T. Shinohara

Christoph Berger

Sung-min Ha

Martin Rozycki

Marcel W. Prastawa

Esther Alberts

Jana Lipková

John Freymann

Justin Kirby

Michel Bilello

Hassan M. Fathallah-Shaykh

Roland Wiest

J. Kirschke

Benedikt Wiestler

Rivka R. Colen

Aikaterini Kotrotsou

Pamela LaMontagne

D. Marcus

Mikhail Milchenko

Arash Nazeri

Marc-André Weber

Abhishek Mahajan

Ujjwal Baid

Dongjin Kwon

Manu Agarwal

Mahbubul Alam

Alberto Albiol

A. Albiol

Alex A. Varghese

T. Tuan

Tal Arbel

Aaron J. Avery

Bobade Pranjal

Subhashis Banerjee

Thomas H. Batchelder

Nematollah Batmanghelich

Enzo Battistella

Martin Bendszus

E. Benson

José Bernal

George Biros

Mariano Cabezas

Siddhartha Chandra

Yi-Ju Chang

et al.

Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

2018-11-04

ArXiv (preprint)
