Publications

Exploring Exchangeable Dataset Amortization for Bayesian Posterior Inference

Sarthak Mittal

Niels Leif Bracher

Priyank Jaini

Marcus A Brubaker

Bayesian inference provides a natural way of incorporating uncertainties and different underlying theories when making predictions or analyz… (voir plus)ing complex systems. However, it requires computationally expensive routines for approximation, which have to be re-run when new data is observed and are thus infeasible to efficiently scale and reuse. In this work, we look at the problem from the perspective of amortized inference to obtain posterior parameter distributions for known probabilistic models. We propose a neural network-based approach that can handle exchangeable observations and amortize over datasets to convert the problem of Bayesian posterior inference into a single forward pass of a network. Our empirical analyses explore various design choices for amortized inference by comparing: (a) our proposed variational objective with forward KL minimization, (b) permutation-invariant architectures like Transformers and DeepSets, and (c) parameterizations of posterior families like diagonal Gaussian and Normalizing Flows. Through our experiments, we successfully apply amortization techniques to estimate the posterior distributions for different domains solely through inference.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

GFlowNets for Causal Discovery: an Overview

Dragos Cristian Manta

Edward J Hu

Yoshua Bengio

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

Green Federated Learning

Ashkan Yousefpour

Shen Guo

Ashish Shenoy

Sayan Ghosh

Pierre Stock

Kiwan Maeng

Schalk-Willem Kruger

Michael Rabbat

Carole-Jean Wu

Ilya Mironov

2023-06-19

ICML.cc/2023/Workshop/FL (publié)

openreview.net

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection

Vitória Barin-Pacela

Kartik Ahuja

Simon Lacoste-Julien

Pascal Vincent

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection

Vitória Barin-Pacela

Kartik Ahuja

Simon Lacoste-Julien

Pascal Vincent

Disentanglement aims to recover meaningful latent ground-truth factors from only the observed distribution. Identifiability provides the the… (voir plus)oretical grounding for disentanglement to be well-founded. Unfortunately, unsupervised identifiability of independent latent factors is a theoretically proven impossibility in the i.i.d. setting under a general nonlinear smooth map from factors to observations. In this work, we show that, remarkably, it is possible to recover discretized latent coordinates under a highly generic nonlinear smooth mapping (a diffeomorphism) without any additional inductive bias on the mapping. This is, assuming that latent density has axis-aligned discontinuity landmarks, but without making the unrealistic assumption of statistical independence of the factors. We introduce this novel form of identifiability, termed quantized coordinate identifiability , and provide a comprehensive proof of the recovery of discretized coordinates.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

doi.org

openreview.net

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Pablo Samuel Castro

Tyler Kastner

Prakash Panangaden

Mark Rowland

We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a ne… (voir plus)w metric under this lens that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective enables us to provide new theoretical results, including value-function bounds and low-distortion finite-dimensional Euclidean embeddings, which are crucial when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.

2023-06-19

TMLR (accepté)

doi.org

openreview.net

Learning to Optimize with Recurrent Hierarchical Transformers

Abhinav Moudgil

Boris Knyazev

Guillaume Lajoie

Eugene Belilovsky

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (publié)

openreview.net

Learning with Learning Awareness using Meta-Values

Tim Cooijmans

Milad Aghajohari

Aaron Courville

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (publié)

openreview.net

Online Dynamic Submodular Optimization

Antoine Lesage-Landry

Julien Pallage

We propose new algorithms with provable performance for online binary optimization subject to general constraints and in dynamic settings. W… (voir plus)e consider the subset of problems in which the objective function is submodular. We propose the online submodular greedy algorithm (OSGA) which solves to optimality an approximation of the previous round loss function to avoid the NP-hardness of the original problem. We extend OSGA to a generic approximation function. We show that OSGA has a dynamic regret bound similar to the tightest bounds in online convex optimization with respect to the time horizon and the cumulative round optimum variation. For instances where no approximation exists or a computationally simpler implementation is desired, we design the online submodular projected gradient descent (OSPGD) by leveraging the Lova\'sz extension. We obtain a regret bound that is akin to the conventional online gradient descent (OGD). Finally, we numerically test our algorithms in two power system applications: fast-timescale demand response and real-time distribution network reconfiguration.

2023-06-19

ArXiv (prépublication)

doi.org

arxiv.org

Pretrained Language Models to Solve Graph Tasks in Natural Language

Frederik Wenkel

Guy Wolf

Boris Knyazev

Pretrained large language models (LLMs) are powerful learners in a variety of language tasks. We explore if LLMs can learn from graph-struct… (voir plus)ured data when the graphs are described using natural language. We explore data augmentation and pretraining specific to the graph domain and show that LLMs such as GPT-2 and GPT-3 are promising alternatives to graph neural networks.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

RepoFusion: Training Code Models to Understand Your Repository

Disha Shrivastava

Denis Kocetkov

Harm de Vries

Dzmitry Bahdanau

Torsten Scholak

Despite the huge success of Large Language Models (LLMs) in coding assistants like GitHub Copilot, these models struggle to understand the c… (voir plus)ontext present in the repository (e.g., imports, parent classes, files with similar names, etc.), thereby producing inaccurate code completions. This effect is more pronounced when using these assistants for repositories that the model has not seen during training, such as proprietary software or work-in-progress code projects. Recent work has shown the promise of using context from the repository during inference. In this work, we extend this idea and propose RepoFusion, a framework to train models to incorporate relevant repository context. Experiments on single-line code completion show that our models trained with repository context significantly outperform much larger code models as CodeGen-16B-multi (

2023-06-19

ArXiv (prépublication)

doi.org

openreview.net

Scaling Graphically Structured Diffusion Models

Christian Dietrich Weilbach

William Harvey

Hamed Shirzad

Frank Wood

Applications of the recently introduced graphically structured diffusion model (GSDM) family show that sparsifying the transformer attention… (voir plus) mechanism within a diffusion model and meta-training on a variety of conditioning tasks can yield an efficiently learnable diffusion model artifact that is capable of flexible, in the sense of observing different subsets of variables at test-time, amortized conditioning in probabilistic graphical models. While extremely promising in terms of applicability and utility, implementations of GSDMs prior to this work were not scalable beyond toy graphical model sizes. We overcome this limitation by describing and and solving two scaling issues related to GSDMs; one engineering and one methodological. We additionally propose a new benchmark problem of weight inference for a convolutional neural network applied to

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Publications

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Mots-clés populaires:

Publications