Publications

Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

Harry Zhao

Mingde Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations. It automatically decomposes the given task into smaller, more manageable subtasks, and thus enables sparse decision-making and focused computation on the relevant parts of the environment. The decomposition relies on the extraction of an abstracted proxy problem represented as a directed graph, in which vertices and edges are learned end-to-end from hindsight. Our theoretical analyses provide performance guarantees under appropriate assumptions and establish where our approach is expected to be helpful. Generalization-focused experiments validate Skipper’s significant advantage in zero-shot generalization, compared to some existing state-of-the-art hierarchical planning methods.

2024-01-16

ICLR.cc/2024/Conference (poster)

openreview.net

Course Correcting Koopman Representations

Mahan Fathi

Clement Gehring

Jonathan Pilault

David Kanaa

Pierre-Luc Bacon

Ross Goroshin

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Cycle Consistency Driven Object Discovery

Aniket Rajiv Didolkar

Anirudh Goyal

Yoshua Bengio

Developing deep learning models that effectively learn object-centric representations, akin to human cognition, remains a challenging task. Existing approaches facilitate object discovery by representing objects as fixed-size vectors, called "slots" or "object files". While these approaches have shown promise in certain scenarios, they still exhibit certain limitations. First, they rely on architectural priors which can be unreliable and usually require meticulous engineering to identify the correct objects. Second, there has been a notable gap in investigating the practical utility of these representations in downstream tasks. To address the first limitation, we introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot. We formalize this constraint by introducing consistency objectives which are cyclic in nature. By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance. These enhancements consistently hold true across both synthetic and real-world scenes, underscoring the effectiveness and adaptability of the proposed approach. To tackle the second limitation, we apply the learned object-centric representations from the proposed method to two downstream reinforcement learning tasks, demonstrating considerable performance enhancements compared to conventional slot-based and monolithic representation learning methods. Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

A Data-Driven Measure of Relative Uncertainty for Misclassification Detection

Eduardo Dadalto Câmara Gomes

Marco Romanelli

Georg Pichler

Pablo Piantanida

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Decoupling regularization from the action space

Sobhan Mohammadpour

Emma Frejinger

Pierre-Luc Bacon

Regularized reinforcement learning (RL), particularly the entropy-regularized kind, has gained traction in optimal control and inverse RL. While standard unregularized RL methods remain unaffected by changes in the number of actions, we show that it can severely impact their regularized counterparts. This paper demonstrates the importance of decoupling the regularizer from the action space: that is, to maintain a consistent level of regularization regardless of how many actions are involved to avoid over-regularization. Whereas the problem can be avoided by introducing a task-specific temperature parameter, it is often undesirable and cannot solve the problem when action spaces are state-dependent. In the state-dependent action context, different states with varying action spaces are regularized inconsistently. We introduce two solutions: a static temperature selection approach and a dynamic counterpart, universally applicable where this problem arises. Implementing these changes improves performance on the DeepMind control suite in static and dynamic temperature regimes and a biological design task.

2024-01-16

ICLR.cc/2024/Conference (poster)

openreview.net

Delta-AI: Local objectives for amortized inference in sparse graphical models

Jean-Pierre R. Falet

Hae Beom Lee

Nikolay Malkin

Chen Sun

Dragos Secrieru

Dinghuai Zhang

Guillaume Lajoie

Yoshua Bengio

We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call …

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Dinghuai Zhang

Ricky T. Q. Chen

Cheng-Hao Liu

Aaron Courville

Yoshua Bengio

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

On Diffusion Modeling for Anomaly Detection

Victor Livernoche

Vineet Jain

Yashar Hezaveh

Siamak Ravanbakhsh

Known for their impressive performance in generative modeling, diffusion models are attractive candidates for density-based anomaly detection. This paper investigates different variations of diffusion modeling for unsupervised and semi-supervised anomaly detection. In particular, we find that Denoising Diffusion Probability Models (DDPM) are performant on anomaly detection benchmarks yet computationally expensive. By simplifying DDPM in application to anomaly detection, we are naturally led to an alternative approach called Diffusion Time Estimation (DTE). DTE estimates the distribution over diffusion time for a given input and uses the mode or mean of this distribution as the anomaly score. We derive an analytical form for this density and leverage a deep neural network to improve inference efficiency. Through empirical evaluations on the ADBench benchmark, we demonstrate that all diffusion-based anomaly detection methods perform competitively for both semi-supervised and unsupervised settings. Notably, DTE achieves orders of magnitude faster inference time than DDPM, while outperforming it on this benchmark. These results establish diffusion-based anomaly detection as a scalable alternative to traditional methods and recent deep-learning techniques for standard unsupervised and semi-supervised anomaly detection settings.

2024-01-16

ICLR.cc/2024/Conference (spotlight)

doi.org

openreview.net

Efficient Dynamics Modeling in Interactive Environments with Koopman Theory

Arnab Kumar Mondal

Siba Smarak Panigrahi

Sai Rajeswar

Kaleem Siddiqi

Siamak Ravanbakhsh

The accurate modeling of dynamics in interactive environments is critical for successful long-range prediction. Such a capability could advance Reinforcement Learning (RL) and Planning algorithms, but achieving it is challenging. Inaccuracies in model estimates can compound, resulting in increased errors over long horizons. We approach this problem from the lens of Koopman theory, where the nonlinear dynamics of the environment can be linearized in a high-dimensional latent space. This allows us to efficiently parallelize the sequential problem of long-range prediction using convolution while accounting for the agent’s action at every time step. Our approach also enables stability analysis and better control over gradients through time. Taken together, these advantages result in significant improvement over the existing approaches, both in the efficiency and the accuracy of modeling dynamics over extended horizons. We also show that this model can be easily incorporated into dynamics modeling for model-based planning and model-free RL and report promising experimental results.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation

Divyat Mahajan

Ioannis Mitliagkas

Brady Neal

Vasilis Syrgkanis

2024-01-16

ICLR.cc/2024/Conference (spotlight)

openreview.net