Ruixiang Zhang

Robust and Controllable Object-Centric Learning through Energy-based Models

Tong Che

Boris Ivanovic

Marco Pavone

Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability of decomposing low-level observations into discrete objects allows us to build a grounded abstract representation and identify the compositional structure of the world. Thus it is a crucial step for machine learning models to be capable of inferring objects and their properties from visual scenes without explicit supervision. However, existing works on object-centric representation learning either rely on tailor-made neural network modules or assume sophisticated models of the underlying generative and inference processes. In this work, we present EGO, a conceptually simple and general approach to learning object-centric representations through energy-based models. By forming a permutation-invariant energy function using vanilla attention blocks that are readily available in Transformers, we can infer object-centric latent variables via gradient-based MCMC methods where permutation equivariance is automatically guaranteed. We show that EGO can be easily integrated into existing architectures, and can effectively extract high-quality object-centric representations, leading to better segmentation accuracy and competitive downstream task performance. We empirically evaluate the robustness of the learned representation from EGO against distribution shift. Finally, we demonstrate the effectiveness of EGO in systematic compositional generalization, by recomposing learned energy functions for novel scene generation and manipulation.
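
The link between a permutation-invariant energy and permutation-equivariant gradient-based sampling can be illustrated with a small sketch. This is not the EGO model: the attention-based energy from the abstract is replaced by a hand-written toy energy (the negative log-likelihood of an equal-weight Gaussian mixture centred at the slots), and the names `energy`, `energy_grad`, and `langevin` are hypothetical.

```python
import numpy as np

def energy(z, x):
    """Toy permutation-invariant energy (an assumption, not EGO's energy):
    negative log-likelihood of points x (N, D) under an equal-weight
    isotropic Gaussian mixture centred at the slots z (K, D)."""
    a = -0.5 * ((x[:, None, :] - z[None, :, :]) ** 2).sum(-1)  # (N, K)
    m = a.max(axis=1, keepdims=True)
    return -(m[:, 0] + np.log(np.exp(a - m).sum(axis=1))).sum()

def energy_grad(z, x):
    """Analytic gradient of the toy energy with respect to the slots z."""
    a = -0.5 * ((x[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    w = np.exp(a - a.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)  # softmax over slots, shape (N, K)
    # dE/dz_k = sum_n w_nk * (z_k - x_n)
    return w.sum(axis=0)[:, None] * z - w.T @ x

def langevin(z0, x, noise, step=0.05):
    """Unadjusted Langevin dynamics on the slots; noise has shape (T, K, D)."""
    z = z0.copy()
    for eps in noise:
        z = z - 0.5 * step * energy_grad(z, x) + np.sqrt(step) * eps
    return z

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 2))         # toy "observations"
z0 = rng.normal(size=(3, 2))         # three initial slots
noise = rng.normal(size=(20, 3, 2))  # fixed noise so both runs are comparable
perm = np.array([2, 0, 1])

z_final = langevin(z0, x, noise)
z_final_perm = langevin(z0[perm], x, noise[:, perm])
```

Because the energy does not change when the slots are reordered, its gradient reorders along with them, so running the sampler from permuted slots (with the noise permuted the same way) yields the permuted result: `z_final_perm` matches `z_final[perm]` up to floating-point error.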

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

Tong Che

Xiaofeng Liu

Site Li

Yubin Ge

Ruixiang Zhang

Caiming Xiong

Yoshua Bengio

AI Safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an important natural problem is how to reliably verify the model's prediction. In this paper, we propose a novel framework, deep verifier networks (DVN), to detect unreliable inputs or predictions of deep discriminative models, using separately trained deep generative models. Our proposed model is based on conditional variational auto-encoders with disentanglement constraints to separate the label information from the latent representation. We give both intuitive and theoretical justifications for the model. Our verifier network is trained independently of the prediction model, which eliminates the need to retrain the verifier network for a new model. We test the verifier network on both out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation. We achieve state-of-the-art results in all of these problems.

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Perceptual Generative Autoencoders

Zijun Zhang

Ruixiang Zhang

Zongpeng Li

Yoshua Bengio

Liam Paull

Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce the consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and an arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.
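
The difference between consistency in the data space and consistency in the latent space can be sketched with a toy linear autoencoder. This is an illustration under assumptions, not the PGA training objective: the encoder and decoder are fixed orthonormal projections rather than trained networks, both losses are plain squared errors, and the names `encode`, `decode`, `data_recon_loss`, and `latent_recon_loss` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 5, 2  # ambient and latent dimensions (toy values)

# Assumed toy autoencoder: Q has orthonormal columns, so Q.T @ Q = I_K.
Q, _ = np.linalg.qr(rng.normal(size=(D, K)))  # Q has shape (D, K)

def encode(x):
    """Map data (n, D) to latents (n, K)."""
    return x @ Q

def decode(z):
    """Map latents (n, K) back to the data space (n, D)."""
    return z @ Q.T

def data_recon_loss(x):
    """Mean squared error between x and decode(encode(x))."""
    return np.mean(np.sum((decode(encode(x)) - x) ** 2, axis=1))

def latent_recon_loss(z):
    """Mean squared error between z and encode(decode(z))."""
    return np.mean(np.sum((encode(decode(z)) - z) ** 2, axis=1))

x = rng.normal(size=(100, D))  # data that fills the whole ambient space
z = rng.normal(size=(100, K))  # latents drawn from a standard normal prior

loss_data = data_recon_loss(x)
loss_latent = latent_recon_loss(z)
```

In this toy setup the latent round trip is exact, so `loss_latent` is zero up to floating-point error, while `loss_data` stays positive because the data occupy more dimensions than the latent space can represent.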

2020-11-21

Proceedings of the 37th International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling

Tong Che

Ruixiang Zhang

Jascha Sohl-Dickstein

Hugo Larochelle

Liam Paull

Yuan Cao

Yoshua Bengio

We show that the sum of the implicit generator log-density …

arxiv.org

MetaGAN: An Adversarial Approach to Few-Shot Learning

Ruixiang Zhang

Tong Che

Zoubin Ghahramani

Yoshua Bengio

Yangqiu Song

In this paper, we propose a conceptually simple and general framework called MetaGAN for few-shot learning problems. Most state-of-the-art few-shot classification models can be integrated with MetaGAN in a principled and straightforward way. By introducing an adversarial generator conditioned on tasks, we augment vanilla few-shot classification models with the ability to discriminate between real and fake data. We argue that this GAN-based approach can help few-shot classifiers learn a sharper decision boundary, which could generalize better. We show that with our MetaGAN framework, we can extend supervised few-shot learning models to naturally cope with unlabeled data. Different from previous work in semi-supervised few-shot learning, our algorithms can deal with semi-supervision at both the sample level and the task level. We give theoretical justifications of the strength of MetaGAN, and validate the effectiveness of MetaGAN on challenging few-shot image classification benchmarks.

