Publications

Learning to Optimize with Recurrent Hierarchical Transformers

Abhinav Moudgil

Boris Knyazev

Guillaume Lajoie

Eugene Belilovsky

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (published)

openreview.net

Learning with Learning Awareness using Meta-Values

Tim Cooijmans

Milad Aghajohari

Aaron Courville

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (published)

openreview.net

Online Dynamic Submodular Optimization

Antoine Lesage-Landry

Julien Pallage

We propose new algorithms with provable performance for online binary optimization subject to general constraints and in dynamic settings. We consider the subset of problems in which the objective function is submodular. We propose the online submodular greedy algorithm (OSGA) which solves to optimality an approximation of the previous round loss function to avoid the NP-hardness of the original problem. We extend OSGA to a generic approximation function. We show that OSGA has a dynamic regret bound similar to the tightest bounds in online convex optimization with respect to the time horizon and the cumulative round optimum variation. For instances where no approximation exists or a computationally simpler implementation is desired, we design the online submodular projected gradient descent (OSPGD) by leveraging the Lovász extension. We obtain a regret bound that is akin to the conventional online gradient descent (OGD). Finally, we numerically test our algorithms in two power system applications: fast-timescale demand response and real-time distribution network reconfiguration.

2023-06-19

ArXiv (preprint)

doi.org

arxiv.org

Pretrained Language Models to Solve Graph Tasks in Natural Language

Frederik Wenkel

Guy Wolf

Boris Knyazev

Pretrained large language models (LLMs) are powerful learners in a variety of language tasks. We explore if LLMs can learn from graph-structured data when the graphs are described using natural language. We explore data augmentation and pretraining specific to the graph domain and show that LLMs such as GPT-2 and GPT-3 are promising alternatives to graph neural networks.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

RepoFusion: Training Code Models to Understand Your Repository

Disha Shrivastava

Denis Kocetkov

Harm de Vries

Dzmitry Bahdanau

Torsten Scholak

Despite the huge success of Large Language Models (LLMs) in coding assistants like GitHub Copilot, these models struggle to understand the context present in the repository (e.g., imports, parent classes, files with similar names, etc.), thereby producing inaccurate code completions. This effect is more pronounced when using these assistants for repositories that the model has not seen during training, such as proprietary software or work-in-progress code projects. Recent work has shown the promise of using context from the repository during inference. In this work, we extend this idea and propose RepoFusion, a framework to train models to incorporate relevant repository context. Experiments on single-line code completion show that our models trained with repository context significantly outperform much larger code models as CodeGen-16B-multi (

2023-06-19

ArXiv (preprint)

doi.org

openreview.net

Scaling Graphically Structured Diffusion Models

Christian Dietrich Weilbach

William Harvey

Hamed Shirzad

Frank Wood

Applications of the recently introduced graphically structured diffusion model (GSDM) family show that sparsifying the transformer attention mechanism within a diffusion model and meta-training on a variety of conditioning tasks can yield an efficiently learnable diffusion model artifact that is capable of flexible, in the sense of observing different subsets of variables at test-time, amortized conditioning in probabilistic graphical models. While extremely promising in terms of applicability and utility, implementations of GSDMs prior to this work were not scalable beyond toy graphical model sizes. We overcome this limitation by describing and solving two scaling issues related to GSDMs; one engineering and one methodological. We additionally propose a new benchmark problem of weight inference for a convolutional neural network applied to

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

Score-based Enhanced Sampling for Protein Molecular Dynamics

Jiarui Lu

Bozitao Zhong

Jian Tang

The dynamic nature of proteins is crucial for determining their biological functions and properties, and molecular dynamics (MD) simulations stand as a predominant tool to study such phenomena. By utilizing empirically derived force fields, MD simulations explore the conformational space through numerically evolving the system along MD trajectories. However, the high-energy barrier of the force fields can hamper the exploration of MD, resulting in an inadequately sampled ensemble. In this paper, we propose leveraging score-based generative models (SGMs) trained on large-scale general protein structures to perform protein conformational sampling to complement traditional MD simulations. Experimental results demonstrate the effectiveness of our approach on several benchmark systems by comparing the results with long MD trajectories and state-of-the-art generative structure prediction models.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net

Simulation-Free Schrödinger Bridges via Score and Flow Matching

Alexander Tong

Nikolay Malkin

Kilian Fatras

Lazar Atanackovic

Yanlei Zhang

Guillaume Huguet

Guy Wolf

Yoshua Bengio

We present simulation-free score and flow matching ([SF]…

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (published)

doi.org

openreview.net

Thompson Sampling for Improved Exploration in GFlowNets

Jarrid Rector-Brooks

Kanika Madan

Moksh J. Jain

Maksym Korablyov

Cheng-Hao Liu

Sarath Chandar Anbil Parthipan

Nikolay Malkin

Yoshua Bengio

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over policies and samples trajectories from this posterior for training. We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

doi.org

openreview.net

Visual Chain-of-Thought Diffusion Models

William Harvey

Frank Wood

Recent progress with conditional image diffusion models has been stunning, and this holds true whether we are speaking about models conditioned on a text description, a scene layout, or a sketch. Unconditional image diffusion models are also improving but lag behind, as do diffusion models which are conditioned on lower-dimensional features like class labels. We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure. In the first stage we sample an embedding describing the semantic content of the image. In the second stage we sample the image conditioned on this embedding and then discard the embedding. Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25-50% compared to standard unconditional generation.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

doi.org

openreview.net

Evolving Computation Graphs

Andreea Deac

Jian Tang

Graph neural networks (GNNs) have demonstrated success in modeling relational data, especially for data that exhibits homophily: when a connection between nodes tends to imply that they belong to the same class. However, while this assumption is true in many relevant situations, there are important real-world scenarios that violate this assumption, and this has spurred research into improving GNNs for these cases. In this work, we propose Evolving Computation Graphs (ECGs), a novel method for enhancing GNNs on heterophilic datasets. Our approach builds on prior theoretical insights linking node degree, high homophily, and inter vs intra-class embedding similarity by rewiring the GNNs' computation graph towards adding edges that connect nodes that are likely to be in the same class. We utilise weaker classifiers to identify these edges, ultimately improving GNN performance on non-homophilic data. We evaluate ECGs on a diverse set of recently-proposed heterophilous datasets and demonstrate improvements over the relevant baselines. ECG presents a simple, intuitive and elegant approach for improving GNN performance on heterophilic datasets without requiring prior domain knowledge.

2023-06-18

ICML.cc/2023/Workshop/TAGML (poster)

doi.org

openreview.net

AI Clinics on Mobile (AICOM): Universal AI Doctors for the Underserved and Hard-to-Reach

Tianyi Yang

Tianze Yang

Na An

Ao Kong

Shaoshan Liu

Xue (Steve) Liu

2023-06-17