
Frank Wood

Affiliate Member
Associate Professor, University of British Columbia, Department of Computer Science
CEO, Inverted AI
Research Topics
Deep Learning
Generative Models
Probabilistic Models
Reinforcement Learning

Biography

Frank Wood is an associate professor of computer science at the University of British Columbia and an affiliate member of Mila – Quebec Artificial Intelligence Institute. He is also the CEO of Inverted AI.

His research interests include probabilistic programming, machine learning, and probabilistic AI. He is particularly interested in Bayesian methods and unsupervised learning.

Publications

Scaling Graphically Structured Diffusion Models
Christian Dietrich Weilbach
William Harvey
Hamed Shirzad
Applications of the recently introduced graphically structured diffusion model (GSDM) family show that sparsifying the transformer attention mechanism within a diffusion model and meta-training on a variety of conditioning tasks can yield an efficiently learnable diffusion model artifact that is capable of flexible amortized conditioning in probabilistic graphical models, in the sense of observing different subsets of variables at test time. While extremely promising in terms of applicability and utility, implementations of GSDMs prior to this work were not scalable beyond toy graphical model sizes. We overcome this limitation by describing and solving two scaling issues related to GSDMs, one engineering and one methodological. We additionally propose a new benchmark problem of weight inference for a convolutional neural network applied to …
Visual Chain-of-Thought Diffusion Models
William Harvey
Recent progress with conditional image diffusion models has been stunning, and this holds true whether we are speaking about models conditioned on a text description, a scene layout, or a sketch. Unconditional image diffusion models are also improving but lag behind, as do diffusion models which are conditioned on lower-dimensional features like class labels. We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure. In the first stage we sample an embedding describing the semantic content of the image. In the second stage we sample the image conditioned on this embedding and then discard the embedding. Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25 - 50% compared to standard unconditional generation.
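The two-stage procedure described above can be summarized with a minimal sketch; the samplers below are toy stand-ins for the paper's trained diffusion models, and all names and dimensions are illustrative:

```python
# Sketch of the two-stage unconditional sampling procedure described above.
# Both "models" here are stand-ins (simple Gaussian samplers); in the paper
# each stage would be a trained diffusion model.
import numpy as np

rng = np.random.default_rng(0)

def sample_embedding(dim=64):
    """Stage 1 (hypothetical): sample a semantic embedding z ~ p(z)."""
    return rng.standard_normal(dim)

def sample_image_given_embedding(z, shape=(32, 32, 3)):
    """Stage 2 (hypothetical): sample an image x ~ p(x | z)."""
    # A conditional diffusion model would run its reverse process here;
    # we just draw noise whose mean is a fixed random projection of z.
    proj = rng.standard_normal((z.size, int(np.prod(shape))))
    mean = np.tanh(z @ proj).reshape(shape)
    return mean + 0.1 * rng.standard_normal(shape)

def sample_unconditional_image():
    z = sample_embedding()                 # sample semantic content first
    x = sample_image_given_embedding(z)    # then the image given that content
    return x                               # the embedding is discarded

image = sample_unconditional_image()
print(image.shape)
```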
Realistically distributing object placements in synthetic training data improves the performance of vision-based object detection models
Setareh Dabiri
Vasileios Lioutas
Berend Zwartsenberg
Yunpeng Liu
Matthew Niedoba
Xiaoxuan Liang
Dylan Green
Justice Sefas
Jonathan Wilder Lavington
Adam Ścibior
When training object detection models on synthetic data, it is important to make the distribution of synthetic data as close as possible to the distribution of real data. We investigate specifically the impact of object placement distribution, keeping all other aspects of synthetic data fixed. Our experiment, training a 3D vehicle detection model in CARLA and testing on KITTI, demonstrates a substantial improvement resulting from improving the object placement distribution.
Conditional Permutation Invariant Flows
Berend Zwartsenberg
Adam Ścibior
Matthew Niedoba
Vasileios Lioutas
Yunpeng Liu
Justice Sefas
Setareh Dabiri
Jonathan Wilder Lavington
Trevor Campbell
We present a novel, conditional generative probabilistic model of set-valued data with a tractable log density. This model is a continuous normalizing flow governed by permutation equivariant dynamics. These dynamics are driven by a learnable per-set-element term and pairwise interactions, both parametrized by deep neural networks. We illustrate the utility of this model via applications including (1) complex traffic scene generation conditioned on visually specified map information, and (2) object bounding box generation conditioned directly on images. We train our model by maximizing the expected likelihood of labeled conditional data under our flow, with the aid of a penalty that ensures the dynamics are smooth and hence efficiently solvable. Our method significantly outperforms non-permutation invariant baselines in terms of log likelihood and domain-specific metrics (offroad, collision, and combined infractions), yielding realistic samples that are difficult to distinguish from real data.
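A minimal sketch of permutation-equivariant set dynamics of the kind described above, built from a per-element term plus aggregated pairwise interactions and integrated with a simple Euler solver; the parameterization is illustrative, not the paper's:

```python
# Illustrative sketch (not the paper's implementation): permutation-equivariant
# ODE dynamics built from a per-element term plus averaged pairwise interactions.
import numpy as np

rng = np.random.default_rng(0)
D = 2                                            # dimension of each set element
W_self = rng.standard_normal((D, D)) * 0.1       # stand-in for a per-element network
W_pair = rng.standard_normal((2 * D, D)) * 0.1   # stand-in for a pairwise network

def dynamics(x):
    """x: (n, D) set of elements. Returns dx/dt with the same shape.
    Permuting the rows of x permutes the output rows identically."""
    self_term = np.tanh(x @ W_self)
    n = x.shape[0]
    # Pairwise term: aggregate interactions of each element with all others.
    pairs = np.concatenate([np.repeat(x, n, axis=0),
                            np.tile(x, (n, 1))], axis=1)          # (n*n, 2D)
    inter = np.tanh(pairs @ W_pair).reshape(n, n, D).mean(axis=1)
    return self_term + inter

def flow(x0, steps=20, dt=0.05):
    """Euler integration of the equivariant dynamics (a crude CNF forward pass)."""
    x = x0
    for _ in range(steps):
        x = x + dt * dynamics(x)
    return x

x0 = rng.standard_normal((5, D))   # a set of 5 elements
print(flow(x0).shape)
```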
Critic Sequential Monte Carlo
Vasileios Lioutas
Jonathan Wilder Lavington
Justice Sefas
Matthew Niedoba
Yunpeng Liu
Berend Zwartsenberg
Setareh Dabiri
Adam Ścibior
We introduce CriticSMC, a new algorithm for planning as inference built from a composition of sequential Monte Carlo with learned soft-Q function heuristic factors. These heuristic factors, obtained from parametric approximations of the marginal likelihood ahead, more effectively guide SMC towards the desired target distribution, which is particularly helpful for planning in environments with hard constraints placed sparsely in time. Compared with previous work, we modify the placement of such heuristic factors, which allows us to cheaply propose and evaluate large numbers of putative action particles, greatly increasing inference and planning efficiency. CriticSMC is compatible with informative priors, whose density function need not be known, and can be used as a model-free control algorithm. Our experiments on collision avoidance in a high-dimensional simulated driving task show that CriticSMC significantly reduces collision rates at a low computational cost while maintaining realism and diversity of driving behaviors across vehicles and environment scenarios.
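A toy sketch of sequential Monte Carlo with heuristic reweighting in the spirit of the abstract, where a hand-made scoring function stands in for the learned soft-Q critic and a one-dimensional point mass stands in for the driving environment:

```python
# Toy sketch of SMC planning with heuristic factors. The "critic" here is a
# hand-made heuristic, not a learned soft-Q function, and the dynamics are trivial.
import numpy as np

rng = np.random.default_rng(0)
K, T = 256, 20               # number of particles, planning horizon
goal, obstacle = 5.0, 2.5    # reach `goal` while avoiding `obstacle`

def prior_action(k):
    return rng.standard_normal(k) * 0.5          # uninformative action prior

def heuristic_log_weight(x):
    # Stand-in for a soft-Q critic: prefer progress toward the goal and
    # penalize states close to the obstacle (a sparse hard constraint).
    return -np.abs(goal - x) - 10.0 * (np.abs(x - obstacle) < 0.2)

x = np.zeros(K)                                   # all particles start at 0
for t in range(T):
    a = prior_action(K)                           # propose many action particles cheaply
    x_next = x + a                                # simple deterministic dynamics
    logw = heuristic_log_weight(x_next)           # score with the heuristic factor
    w = np.exp(logw - logw.max())
    idx = rng.choice(K, size=K, p=w / w.sum())    # resample promising particles
    x = x_next[idx]

print("mean final state:", x.mean())
```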
Graphically Structured Diffusion Models
Christian Dietrich Weilbach
William Harvey
We introduce a framework for automatically defining and learning deep generative models with problem-specific structure. We tackle problem domains that are more traditionally solved by algorithms such as sorting, constraint satisfaction for Sudoku, and matrix factorization. Concretely, we train diffusion models with an architecture tailored to the problem specification. This problem specification should contain a graphical model describing relationships between variables, and often benefits from explicit representation of subcomputations. Permutation invariances can also be exploited. Across a diverse set of experiments we improve the scaling relationship between problem dimension and our model's performance, in terms of both training time and final accuracy.
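The core structural idea, deriving a sparse attention pattern from a problem's graphical model so that attention only connects interacting variables, can be illustrated with a small sketch; the graph, mask construction, and softmax below are illustrative, not the paper's architecture:

```python
# Sketch of the structural idea: derive a sparse attention mask over variables
# from a problem's graphical-model adjacency, so attention only connects
# variables that interact. Purely illustrative.
import numpy as np

# A toy graph over 4 variables: edges say which variables share a factor.
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

mask = np.eye(n, dtype=bool)          # every variable attends to itself...
for i, j in edges:
    mask[i, j] = mask[j, i] = True    # ...and to its graphical-model neighbours

scores = np.random.default_rng(0).standard_normal((n, n))
scores[~mask] = -np.inf               # block attention outside the graph structure
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)
print(np.round(attn, 2))              # sparse, row-normalized attention weights
```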
Video Killed the HD-Map: Predicting Multi-Agent Behavior Directly From Aerial Images
Yunpeng Liu
Vasileios Lioutas
Jonathan Wilder Lavington
Matthew Niedoba
Justice Sefas
Setareh Dabiri
Dylan Green
Xiaoxuan Liang
Berend Zwartsenberg
Adam Ścibior
The development of algorithms that learn multi-agent behavioral models using human demonstrations has led to increasingly realistic simulations in the field of autonomous driving. In general, such models learn to jointly predict trajectories for all controlled agents by exploiting road context information such as drivable lanes obtained from manually annotated high-definition (HD) maps. Recent studies show that these models can greatly benefit from increasing the amount of human data available for training. However, the manual annotation of HD maps, which is necessary for every new location, puts a bottleneck on efficiently scaling up human traffic datasets. We propose an aerial image-based map (AIM) representation that requires minimal annotation and provides rich road context information for traffic agents like pedestrians and vehicles. We evaluate multi-agent trajectory prediction using the AIM by incorporating it into a differentiable driving simulator as an image-texture-based differentiable rendering module. Our results demonstrate competitive multi-agent trajectory prediction performance, especially for pedestrians in the scene, when using our AIM representation as compared to models trained with rasterized HD maps.
TITRATED: Learned Human Driving Behavior without Infractions via Amortized Inference
Vasileios Lioutas
Adam Ścibior
Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training
William Harvey
Michael Teng
Hard visual attention is a promising approach to reduce the computational burden of modern computer vision methodologies. However, hard attention mechanisms can be difficult and slow to train, which is especially costly for applications like neural architecture search where multiple networks must be trained. We introduce a method to amortise the cost of training by generating an extra supervision signal for a subset of the training data. This supervision is in the form of sequences of ‘good’ locations to attend to for each image. We find that the best method to generate supervision sequences comes from framing hard attention for image classification as a Bayesian optimal experimental design (BOED) problem. From this perspective, the optimal locations to attend to are those which provide the greatest expected reduction in the entropy of the classification distribution. We introduce methodology from the BOED literature to approximate this optimal behaviour and generate ‘near-optimal’ supervision sequences. We then present a hard attention network training objective that makes use of these sequences and show that it allows faster training than prior work. We finally demonstrate the utility of faster hard attention training by incorporating supervision sequences in a neural architecture search, resulting in hard attention architectures which can outperform networks with access to the entire image.
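The BOED criterion described above, choosing the glimpse location with the greatest expected reduction in the entropy of the classification distribution, can be illustrated with a toy discrete example; all probabilities below are made up for illustration:

```python
# Toy sketch of the BOED criterion: pick the glimpse location whose observation
# is expected to reduce the entropy of the class posterior the most.
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

# p(y): current belief over 3 classes; P(o = 1 | y, loc): likelihood of a binary
# observation at each of 2 candidate glimpse locations (hypothetical numbers).
p_y = np.array([0.5, 0.3, 0.2])
p_obs_given_y = {
    0: np.array([0.9, 0.1, 0.1]),      # location 0 separates class 0 well
    1: np.array([0.5, 0.5, 0.5]),      # location 1 is uninformative
}

def expected_posterior_entropy(loc):
    total = 0.0
    for o in (0, 1):
        like = p_obs_given_y[loc] if o == 1 else 1.0 - p_obs_given_y[loc]
        joint = like * p_y
        p_o = joint.sum()
        total += p_o * entropy(joint / p_o)   # weight posterior entropy by P(o)
    return total

best = min(p_obs_given_y, key=expected_posterior_entropy)
print("attend to location", best)   # location 0: largest expected entropy reduction
```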
Probabilistic surrogate networks for simulators with unbounded randomness
Andreas Munk
Berend Zwartsenberg
Adam Ścibior
Atilim Güneş Baydin
Andrew Lawrence Stewart
Goran Fernlund
Anoush Poursartip
We present a framework for automatically structuring and training fast, approximate, deep neural surrogates of stochastic simulators. Unlike traditional approaches to surrogate modeling, our surrogates retain the interpretable structure and control flow of the reference simulator. Our surrogates target stochastic simulators where the number of random variables itself can be stochastic and potentially unbounded. Our framework further enables an automatic replacement of the reference simulator with the surrogate when undertaking amortized inference. The fidelity and speed of our surrogates allow for both faster stochastic simulation and accurate and substantially faster posterior inference. Using an illustrative yet non-trivial example we show our surrogates' ability to accurately model a probabilistic program with an unbounded number of random variables. We then proceed with an example showing that our surrogates can accurately model a complex structure, an unbounded stack, in a program synthesis setting. We further demonstrate how our surrogate modeling technique makes amortized inference in complex black-box simulators an order of magnitude faster. Specifically, we do simulator-based materials quality testing, inferring safety-critical latent internal temperature profiles of composite materials undergoing curing.
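A minimal sketch of the surrogate idea for a simulator whose number of random variables is itself random: the surrogate mirrors the simulator's control flow and substitutes a cheap approximate sampler at each draw; the distributions and parameters below are hypothetical stand-ins for learned ones:

```python
# Minimal sketch of a surrogate that mirrors a simulator's control flow, for a
# simulator whose trace length (number of random variables) is itself random.
# The surrogate's distributions are hard-coded stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(0)

def simulator():
    """Reference simulator: a geometric number of (notionally expensive) draws."""
    xs = []
    while rng.random() < 0.7:                     # unbounded loop: random trace length
        xs.append(rng.normal(loc=1.0, scale=2.0))
    return sum(xs)

def surrogate():
    """Surrogate with the same control flow, but cheap approximate samplers
    at each draw (here: the same family with slightly different parameters)."""
    xs = []
    while rng.random() < 0.7:                     # control flow mirrored exactly
        xs.append(rng.normal(loc=1.05, scale=1.9))  # stand-in for a learned approximation
    return sum(xs)

print(np.mean([simulator() for _ in range(1000)]),
      np.mean([surrogate() for _ in range(1000)]))
```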
Planning as Inference in Epidemiological Models
Andrew Warrington
Saeid Naderiparizi
Christian Dietrich Weilbach
Vaden Masrani
William Harvey
Adam Ścibior
Boyan Beronov
Seyed Ali Nasseri
In this work we demonstrate how existing software tools can be used to automate parts of infectious disease-control policy-making by performing inference in existing epidemiological dynamics models. The inference tasks undertaken include computing, for planning purposes, the posterior distribution over simulation model parameters that are putatively controllable via direct policy-making choices and that give rise to acceptable disease progression outcomes. Neither the full capabilities of such inference automation software tools nor their utility for planning is widely disseminated at the current time. Timely gains in understanding of these tools and how they can be used may lead to more fine-grained and less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.
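The kind of inference task described above can be sketched with a toy example: conditioning a simple SIR simulator on an acceptable outcome to obtain a posterior over a controllable parameter; the model, threshold, and rejection-sampling inference below are illustrative, not the paper's setup:

```python
# Toy sketch of planning as inference: infer which controllable parameter values
# (here, a contact-reduction factor in a simple SIR model) lead to acceptable
# outcomes, by conditioning simulations on that outcome via rejection sampling.
import numpy as np

rng = np.random.default_rng(0)

def sir_peak_infected(contact_reduction, beta=0.3, gamma=0.1, days=200):
    """Discrete-time SIR; returns the peak infected fraction."""
    s, i, r = 0.99, 0.01, 0.0
    peak = i
    for _ in range(days):
        new_inf = beta * (1.0 - contact_reduction) * s * i
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak

# Uniform prior over the controllable parameter, conditioned on an "acceptable"
# outcome (peak infections below 10% of the population).
prior_samples = rng.uniform(0.0, 1.0, size=5000)
accepted = [c for c in prior_samples if sir_peak_infected(c) < 0.10]
print("posterior mean contact reduction:", np.mean(accepted))
```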