Publications

Delta-AI: Local Objectives for Amortized Inference in Sparse Graphical Models

Hae Beom Lee

Chen Sun

We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call …

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Diffusion Generative Flow Samplers: Improving Learning Signals Through Partial Trajectory Optimization

Dinghuai Zhang

Ricky T. Q. Chen

Cheng-Hao Liu

Aaron Courville

Yoshua Bengio

We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine lear… (voir plus)ning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajectories to compute, resulting in sluggish credit assignment issues due to use of entire trajectories and a learning signal present only at the terminal time. In this work, we present Diffusion Generative Flow Samplers (DGFS), a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments, via parameterizing an additional "flow function". Our method takes inspiration from the theory developed for generative flow networks (GFlowNets), allowing us to make use of intermediate learning signals. Through various challenging experiments, we demonstrate that DGFS achieves more accurate estimates of the normalization constant than closely-related prior methods.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

On Diffusion Modeling for Anomaly Detection

Known for their impressive performance in generative modeling, diffusion models are attractive candidates for density-based anomaly detectio… (voir plus)n. This paper investigates different variations of diffusion modeling for unsupervised and semi-supervised anomaly detection. In particular, we find that Denoising Diffusion Probability Models (DDPM) are performant on anomaly detection benchmarks yet computationally expensive. By simplifying DDPM in application to anomaly detection, we are naturally led to an alternative approach called Diffusion Time Estimation (DTE). DTE estimates the distribution over diffusion time for a given input and uses the mode or mean of this distribution as the anomaly score. We derive an analytical form for this density and leverage a deep neural network to improve inference efficiency. Through empirical evaluations on the ADBench benchmark, we demonstrate that all diffusion-based anomaly detection methods perform competitively for both semi-supervised and unsupervised settings. Notably, DTE achieves orders of magnitude faster inference time than DDPM, while outperforming it on this benchmark. These results establish diffusion-based anomaly detection as a scalable alternative to traditional methods and recent deep-learning techniques for standard unsupervised and semi-supervised anomaly detection settings.

2024-01-15

ICLR.cc/2024/Conference (spotlight)

doi.org

openreview.net

Efficient Dynamics Modeling in Interactive Environments with Koopman Theory

Arnab Kumar Mondal

Siba Smarak Panigrahi

Sai Rajeswar

Kaleem Siddiqi

Siamak Ravanbakhsh

The accurate modeling of dynamics in interactive environments is critical for successful long-range prediction. Such a capability could adva… (voir plus)nce Reinforcement Learning (RL) and Planning algorithms, but achieving it is challenging. Inaccuracies in model estimates can compound, resulting in increased errors over long horizons. We approach this problem from the lens of Koopman theory, where the nonlinear dynamics of the environment can be linearized in a high-dimensional latent space. This allows us to efficiently parallelize the sequential problem of long-range prediction using convolution while accounting for the agent's action at every time step. Our approach also enables stability analysis and better control over gradients through time. Taken together, these advantages result in significant improvement over the existing approaches, both in the efficiency and the accuracy of modeling dynamics over extended horizons. We also show that this model can be easily incorporated into dynamics modeling for model-based planning and model-free RL and report promising experimental results.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation

Divyat Mahajan

Ioannis Mitliagkas

Brady Neal

Vasilis Syrgkanis

2024-01-15

ICLR.cc/2024/Conference (spotlight)

openreview.net

Ensemble Distillation for Unsupervised Constituency Parsing

Behzad Shayegh

Yanshuai Cao

Xiaodan Zhu

Jackie CK Cheung

Lili Mou

2024-01-15

ICLR.cc/2024/Conference (poster)

openreview.net

Evaluating Representation Learning on the Protein Structure Universe

Arian R. Jamasb

Alex Morehead

Chaitanya K. Joshi

Zuobai Zhang

Kieran Didi

Simon Mathis

Charles Harris

Jian Tang

Jianlin Cheng

Pietro Lio

Tom L. Blundell

We introduce ProteinWorkshop, a comprehensive benchmark suite for representation learning on protein structures with Geometric Graph Neural … (voir plus)Networks. We consider large-scale pre-training and downstream tasks on both experimental and predicted structures to enable the systematic evaluation of the quality of the learned structural representation and their usefulness in capturing functional relationships for downstream tasks. We find that: (1) large-scale pretraining on AlphaFold structures and auxiliary tasks consistently improve the performance of both rotation-invariant and equivariant GNNs, and (2) more expressive equivariant GNNs benefit from pretraining to a greater extent compared to invariant models. We aim to establish a common ground for the machine learning and computational biology communities to rigorously compare and advance protein structure representation learning. Our open-source codebase reduces the barrier to entry for working with large protein structure datasets by providing: (1) storage-efficient dataloaders for large-scale structural databases including AlphaFoldDB and ESM Atlas, as well as (2) utilities for constructing new tasks from the entire PDB. ProteinWorkshop is available at: github.com/a-r-j/ProteinWorkshop.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Expected Flow Networks in Stochastic Environments and Two-Player Zero-Sum Games

Bilun Sun

Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution. GFlowNets have been successfully … (voir plus)applied to various structured object generation tasks, sampling a diverse set of high-reward objects quickly. We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments. We show that EFlowNets outperform other GFlowNet formulations in stochastic tasks such as protein design. We then extend the concept of EFlowNets to adversarial environments, proposing adversarial flow networks (AFlowNets) for two-player zero-sum games. We show that AFlowNets learn to find above 80% of optimal moves in Connect-4 via self-play and outperform AlphaZero in tournaments.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Schölkopf

The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, m… (voir plus)eshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight, shapes as well as thin, open, surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parameterize open surfaces by defining a manifold signed distance field on watertight templates. With this parameterization, we further develop a grid-based and differentiable representation that parameterizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modelling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.

2024-01-15

ICLR.cc/2024/Conference (présentation orale)

doi.org

openreview.net

GOAt: Explaining Graph Neural Networks via Graph Output Attribution

Shengyao Lu

Keith G Mills

Jiao He

Bang Liu

Di Niu

Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability. Most existing methods for ex… (voir plus)plaining GNNs typically rely on training auxiliary models, resulting in the explanations remain black-boxed. This paper introduces Graph Output Attribution (GOAt), a novel method to attribute graph outputs to input graph features, creating GNN explanations that are faithful, discriminative, as well as stable across similar samples. By expanding the GNN as a sum of scalar products involving node features, edge features and activation patterns, we propose an efficient analytical method to compute contribution of each node or edge feature to each scalar product and aggregate the contributions from all scalar products in the expansion form to derive the importance of each node and edge. Through extensive experiments on synthetic and real-world data, we show that our method not only outperforms various state-of-the-art GNN explainers in terms of the commonly used fidelity metric, but also exhibits stronger discriminability, and stability by a remarkable margin.

2024-01-15

ICLR.cc/2024/Conference (poster)

openreview.net

Hallucination Detection and Hallucination Mitigation: An Investigation

Junliang Luo

Tianyu Li

Di Wu

M. Jenkin

Steve Liu

Gregory Dudek

2024-01-15

ArXiv (prépublication)

doi.org

arxiv.org

How Connectivity Structure Shapes Rich and Lazy Learning in Neural Circuits

Yuhan Helena Liu

Aristide Baratin

Jonathan Cornford

Stefan Mihalas

Eric Shea-Brown

Guillaume Lajoie

In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learn… (voir plus)ing dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Publications

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Mots-clés populaires:

Publications