Portrait of Ruixiang Zhang is unavailable

Ruixiang Zhang

Alumni

Publications

Robust and Controllable Object-Centric Learning through Energy-based Models
Tong Che
Boris Ivanovic
Marco Pavone
Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability of decomposing low-level observations … (see more)into discrete objects allows us to build a grounded abstract representation and identify the compositional structure of the world. Thus it is a crucial step for machine learning models to be capable of inferring objects and their properties from visual scene without explicit supervision. However, existing works on object-centric representation learning are either relying on tailor-made neural network modules or assuming sophisticated models of underlying generative and inference processes. In this work, we present EGO, a conceptually simple and general approach to learning object-centric representation through energy-based model. By forming a permutation-invariant energy function using vanilla attention blocks that are readily available in Transformers, we can infer object-centric latent variables via gradient-based MCMC methods where permutation equivariance is automatically guaranteed. We show that EGO can be easily integrated into existing architectures, and can effectively extract high-quality object-centric representations, leading to better segmentation accuracy and competitive downstream task performance. We empirically evaluate the robustness of the learned representation from EGO against distribution shift. Finally, we demonstrate the effectiveness of EGO in systematic compositional generalization, by recomposing learned energy functions for novel scene generation and manipulation.
Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models
Tong Che
Xiaofeng Liu
Site Li
Yubin Ge
Caiming Xiong
AI Safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an importan… (see more)t natural problem is how to reliably verify the model's prediction. In this paper, we propose a novel framework --- deep verifier networks (DVN) to detect unreliable inputs or predictions of deep discriminative models, using separately trained deep generative models. Our proposed model is based on conditional variational auto-encoders with disentanglement constraints to separate the label information from the latent representation. We give both intuitive and theoretical justifications for the model. Our verifier network is trained independently with the prediction model, which eliminates the need of retraining the verifier network for a new model. We test the verifier network on both out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation. We achieve state-of-the-art results in all of these problems.
Perceptual Generative Autoencoders
Zijun Zhang
Zongpeng Li
Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of dat… (see more)a can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce the consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.
Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling
Tong Che
Jascha Sohl-Dickstein
Yuan Cao
We show that the sum of the implicit generator log-density …
MetaGAN: An Adversarial Approach to Few-Shot Learning
Tong Che
Zoubin Ghahramani
Yangqiu Song
In this paper, we propose a conceptually simple and general framework called MetaGAN for few-shot learning problems. Most state-of-the-art f… (see more)ew-shot classification models can be integrated with MetaGAN in a principled and straightforward way. By introducing an adversarial generator conditioned on tasks, we augment vanilla few-shot classification models with the ability to discriminate between real and fake data. We argue that this GAN-based approach can help few-shot classifiers to learn sharper decision boundary, which could generalize better. We show that with our MetaGAN framework, we can extend supervised few-shot learning models to naturally cope with unlabeled data. Different from previous work in semi-supervised few-shot learning, our algorithms can deal with semi-supervision at both sample-level and task-level. We give theoretical justifications of the strength of MetaGAN, and validate the effectiveness of MetaGAN on challenging few-shot image classification benchmarks.