Publications

Pretrained Language Models to Solve Graph Tasks in Natural Language
Frederik Wenkel
Boris Knyazev
Pretrained large language models (LLMs) are powerful learners in a variety of language tasks. We explore whether LLMs can learn from graph-structured data when the graphs are described using natural language. We explore data augmentation and pretraining specific to the graph domain and show that LLMs such as GPT-2 and GPT-3 are promising alternatives to graph neural networks.
RepoFusion: Training Code Models to Understand Your Repository
Disha Shrivastava
Denis Kocetkov
Harm de Vries
Torsten Scholak
Despite the huge success of Large Language Models (LLMs) in coding assistants like GitHub Copilot, these models struggle to understand the context present in the repository (e.g., imports, parent classes, files with similar names, etc.), thereby producing inaccurate code completions. This effect is more pronounced when using these assistants for repositories that the model has not seen during training, such as proprietary software or work-in-progress code projects. Recent work has shown the promise of using context from the repository during inference. In this work, we extend this idea and propose RepoFusion, a framework to train models to incorporate relevant repository context. Experiments on single-line code completion show that our models trained with repository context significantly outperform much larger code models such as CodeGen-16B-multi (
Scaling Graphically Structured Diffusion Models
Christian Dietrich Weilbach
William Harvey
Hamed Shirzad
Applications of the recently introduced graphically structured diffusion model (GSDM) family show that sparsifying the transformer attention mechanism within a diffusion model and meta-training on a variety of conditioning tasks can yield an efficiently learnable diffusion model artifact that is capable of flexible, in the sense of observing different subsets of variables at test-time, amortized conditioning in probabilistic graphical models. While extremely promising in terms of applicability and utility, implementations of GSDMs prior to this work were not scalable beyond toy graphical model sizes. We overcome this limitation by describing and solving two scaling issues related to GSDMs: one engineering and one methodological. We additionally propose a new benchmark problem of weight inference for a convolutional neural network applied to
Score-based Enhanced Sampling for Protein Molecular Dynamics
Jiarui Lu
Bozitao Zhong
The dynamic nature of proteins is crucial for determining their biological functions and properties, and molecular dynamics (MD) simulations stand as a predominant tool to study such phenomena. By utilizing empirically derived force fields, MD simulations explore the conformational space through numerically evolving the system along MD trajectories. However, the high-energy barrier of the force fields can hamper the exploration of MD, resulting in an inadequately sampled ensemble. In this paper, we propose leveraging score-based generative models (SGMs) trained on large-scale general protein structures to perform protein conformational sampling to complement traditional MD simulations. Experimental results demonstrate the effectiveness of our approach on several benchmark systems by comparing the results with long MD trajectories and state-of-the-art generative structure prediction models.
Simulation-Free Schrödinger Bridges via Score and Flow Matching
Alexander Tong
Nikolay Malkin
Kilian Fatras
Lazar Atanackovic
Yanlei Zhang
Guillaume Huguet
We present simulation-free score and flow matching ([SF]…
Thompson Sampling for Improved Exploration in GFlowNets
Jarrid Rector-Brooks
Kanika Madan
Moksh J. Jain
Maksym Korablyov
Cheng-Hao Liu
Nikolay Malkin
Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over policies and samples trajectories from this posterior for training. We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.
Visual Chain-of-Thought Diffusion Models
William Harvey
Recent progress with conditional image diffusion models has been stunning, and this holds true whether we are speaking about models conditioned on a text description, a scene layout, or a sketch. Unconditional image diffusion models are also improving but lag behind, as do diffusion models which are conditioned on lower-dimensional features like class labels. We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure. In the first stage we sample an embedding describing the semantic content of the image. In the second stage we sample the image conditioned on this embedding and then discard the embedding. Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25-50% compared to standard unconditional generation.
Evolving Computation Graphs
Andreea Deac
Graph neural networks (GNNs) have demonstrated success in modeling relational data, especially for data that exhibits homophily: when a connection between nodes tends to imply that they belong to the same class. However, while this assumption is true in many relevant situations, there are important real-world scenarios that violate this assumption, and this has spurred research into improving GNNs for these cases. In this work, we propose Evolving Computation Graphs (ECGs), a novel method for enhancing GNNs on heterophilic datasets. Our approach builds on prior theoretical insights linking node degree, high homophily, and inter- vs. intra-class embedding similarity by rewiring the GNNs' computation graph towards adding edges that connect nodes that are likely to be in the same class. We utilise weaker classifiers to identify these edges, ultimately improving GNN performance on non-homophilic data. We evaluate ECGs on a diverse set of recently proposed heterophilous datasets and demonstrate improvements over the relevant baselines. ECG presents a simple, intuitive and elegant approach for improving GNN performance on heterophilic datasets without requiring prior domain knowledge.
AI Clinics on Mobile (AICOM): Universal AI Doctors for the Underserved and Hard-to-Reach
Tianyi Yang
Tianze Yang
Na An
Ao Kong
Shaoshan Liu
Doubly Right Object Recognition: A Why Prompt for Visual Rationales
Chengzhi Mao
Revant Teotia
Amrutha Sundar
Sachit Menon
Junfeng Yang
Xin Wang
Carl Vondrick
Many visual recognition models are evaluated only on their classification accuracy, a metric for which they obtain strong performance. In this paper, we investigate whether computer vision models can also provide correct rationales for their predictions. We propose a “doubly right” object recognition benchmark, where the metric requires the model to simultaneously produce both the right labels as well as the right rationales. We find that state-of-the-art visual models, such as CLIP, often provide incorrect rationales for their categorical predictions. However, by transferring the rationales from language models into visual representations through a tailored dataset, we show that we can learn a “why prompt,” which adapts large visual representations to produce correct rationales. Visualizations and empirical experiments show that our prompts significantly improve performance on doubly right object recognition, in addition to zero-shot transfer to unseen tasks and datasets.
Open-Set Likelihood Maximization for Few-Shot Learning
Malik Boudiaf
Etienne Bennequin
Myriam Tami
Antoine Toubhans
Celine Hudelot
Ismail Ben Ayed
We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying instances among a set of classes for which we only have a few labeled samples, while simultaneously detecting instances that do not belong to any known class. We explore the popular transductive setting, which leverages the unlabelled query instances at inference. Motivated by the observation that existing transductive methods perform poorly in open-set scenarios, we propose a generalization of the maximum likelihood principle, in which latent scores down-weighing the influence of potential outliers are introduced alongside the usual parametric model. Our formulation embeds supervision constraints from the support set and additional penalties discouraging overconfident predictions on the query set. We proceed with a block-coordinate descent, with the latent scores and parametric model co-optimized alternately, thereby benefiting from each other. We call our resulting formulation Open-Set Likelihood Optimization (OSLO). OSLO is interpretable and fully modular; it can be applied on top of any pre-trained model seamlessly. Through extensive experiments, we show that our method surpasses existing inductive and transductive methods on both aspects of open-set recognition, namely inlier classification and outlier detection. Code is available at https://github.com/ebennequin/few-shot-open-set.
Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
Anas Mahmoud
Jordan S. K. Hu
Tianshu Kuai
Ali Harakeh
Steven L. Waslander
An effective framework for learning 3D representations for perception tasks is distilling rich self-supervised image features via contrastive learning. However, image-to-point representation learning for autonomous driving datasets faces two main challenges: 1) the abundance of self-similarity, which results in the contrastive losses pushing away semantically similar point and image regions and thus disturbing the local semantic structure of the learned representations, and 2) severe class imbalance as pretraining gets dominated by over-represented classes. We propose to alleviate the self-similarity problem through a novel semantically tolerant image-to-point contrastive loss that takes into consideration the semantic distance between positive and negative image regions to minimize contrasting semantically similar point and image regions. Additionally, we address class imbalance by designing a class-agnostic balanced loss that approximates the degree of class imbalance through an aggregate sample-to-samples semantic similarity measure. We demonstrate that our semantically tolerant contrastive loss with class balancing improves state-of-the-art 2D-to-3D representation learning in all evaluation settings on 3D semantic segmentation. Our method consistently outperforms state-of-the-art 2D-to-3D representation learning frameworks across a wide range of 2D self-supervised pretrained models.