Vincent Dumoulin

Learning a Universal Template for Few-shot Dataset Generalization

Eleni Triantafillou

Hugo Larochelle

Richard Zemel

Vincent Dumoulin

2021-01-01

ICML (published)

proceedings.mlr.press

arxiv.org

A Unified Few-Shot Classification Benchmark to Compare Transfer and Meta Learning Approaches

Vincent Dumoulin

Neil Houlsby

Utku Evci

Xiaohua Zhai

Ross Goroshin

Sylvain Gelly

Hugo Larochelle

Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we introduce a few-shot classification evaluation protocol named VTAB+MD with the explicit goal of facilitating sharing of insights from each community. We demonstrate its accessibility in practice by performing a cross-family study of the best transfer and meta learners which report on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. We hope that this work contributes to accelerating progress on few-shot learning research.

2021-01-01

NeurIPS Datasets and Benchmarks (published)

openreview.net

An Effective Anti-Aliasing Approach for Residual Networks

Cristina Vasconcelos

Image pre-processing in the frequency domain has traditionally played a vital role in computer vision and was even part of the standard pipeline in the early days of deep learning. However, with the advent of large datasets, many practitioners concluded that this was unnecessary due to the belief that these priors can be learned from the data itself. Frequency aliasing is a phenomenon that may occur when sub-sampling any signal, such as an image or feature map, causing distortion in the sub-sampled output. We show that we can mitigate this effect by placing non-trainable blur filters and using smooth activation functions at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in out-of-distribution generalization on both image classification under natural corruptions on ImageNet-C [10] and few-shot learning on Meta-Dataset [17], without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.

2020-11-20

ArXiv (preprint)

arxiv.org

Feature-wise transformations

2018-07-09

Distill (published)

doi.org

FiLM: Visual Reasoning with a General Conditioning Layer

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.

2018-04-29

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org
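
The feature-wise affine transformation named in the FiLM abstract above can be illustrated with a minimal sketch. The abstract does not say how the scale and shift are produced, so the linear maps below (`w_gamma`, `b_gamma`, `w_beta`, `b_beta`) and all function names are hypothetical choices made for illustration only, not the authors' implementation.

```python
# Minimal sketch of a feature-wise affine modulation, assuming the scale (gamma)
# and shift (beta) come from hypothetical linear maps of a conditioning vector.

def linear(weights, bias, vector):
    """Return weights @ vector + bias for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def film(features, gamma, beta):
    """Apply gamma_i * x_i + beta_i independently to every feature x_i."""
    return [g * x + b for g, x, b in zip(gamma, features, beta)]

def film_layer(features, conditioning, w_gamma, b_gamma, w_beta, b_beta):
    """Compute gamma and beta from the conditioning input, then modulate the features."""
    gamma = linear(w_gamma, b_gamma, conditioning)
    beta = linear(w_beta, b_beta, conditioning)
    return film(features, gamma, beta)

# Toy values: three features modulated by a two-dimensional conditioning vector.
features = [1.0, 2.0, 3.0]
conditioning = [1.0, 0.0]
w_gamma = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_gamma = [0.0, 1.0, 0.0]
w_beta = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
b_beta = [0.5, 0.0, 0.0]

modulated = film_layer(features, conditioning, w_gamma, b_gamma, w_beta, b_beta)
print(modulated)  # [2.5, 3.0, 3.0]
```

In this toy setup the conditioning vector yields gamma = [2.0, 1.0, 1.0] and beta = [0.5, 1.0, 0.0], so each feature is scaled and shifted on its own, with no mixing across features.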

FiLM: Visual Reasoning with a General Conditioning Layer

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.

2017-09-01

ArXiv (preprint)

doi.org

arxiv.org

Learning Visual Reasoning Without Strong Priors

Achieving artificial visual reasoning - the ability to answer image-related questions which require a multi-step, high-level process - is an important step towards artificial general intelligence. This multi-modal task requires learning a question-dependent, structured reasoning process over images from language. Standard deep learning approaches tend to exploit biases in the data rather than learn this underlying structure, while leading methods learn to visually reason successfully but are hand-crafted for reasoning. We show that a general-purpose, Conditional Batch Normalization approach achieves state-of-the-art results on the CLEVR Visual Reasoning benchmark with a 2.4% error rate. We outperform the next best end-to-end method (4.5%) and even methods that use extra supervision (3.1%). We probe our model to shed light on how it reasons, showing it has learned a question-dependent, multi-step process. Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

2017-07-10

ArXiv (preprint)

arxiv.org

Improved Training of Wasserstein GANs

Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.

2017-03-31

ArXiv (preprint)

arxiv.org
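
The idea in the abstract above, penalizing the norm of the critic's gradient with respect to its input, can be sketched on a toy problem. The abstract does not state where the gradient is evaluated, what norm is targeted, or how the penalty is weighted; the sketch below assumes random interpolates between real and generated samples, a target norm of 1, and a coefficient of 10. The toy linear `critic` and the finite-difference gradient are hypothetical stand-ins for a neural network and automatic differentiation.

```python
import math
import random

def critic(x):
    """Hypothetical toy critic on 2-D inputs; its gradient is (3, -4), with norm 5."""
    return 3.0 * x[0] - 4.0 * x[1]

def numerical_gradient(f, x, step=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += step
        down[i] -= step
        grad.append((f(up) - f(down)) / (2.0 * step))
    return grad

def gradient_penalty(f, real_batch, fake_batch, coefficient=10.0, rng=random):
    """Average (||grad f(x_hat)|| - 1)^2 over random interpolates x_hat, times coefficient."""
    total = 0.0
    for real, fake in zip(real_batch, fake_batch):
        t = rng.random()  # assumed: interpolation weight drawn uniformly from [0, 1)
        x_hat = [t * r + (1.0 - t) * g for r, g in zip(real, fake)]
        norm = math.sqrt(sum(g * g for g in numerical_gradient(f, x_hat)))
        total += (norm - 1.0) ** 2
    return coefficient * total / len(real_batch)

real_batch = [[0.0, 1.0], [1.0, 1.0]]
fake_batch = [[2.0, -1.0], [0.5, 0.0]]
penalty = gradient_penalty(critic, real_batch, fake_batch, rng=random.Random(0))
print(round(penalty, 3))  # 160.0
```

Because the toy critic is linear, its gradient norm is 5 everywhere, so the penalty is 10 * (5 - 1)^2 = 160 regardless of which interpolates are drawn. In training, a term like this would be added to the critic's loss in place of weight clipping.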

Adversarially Learned Inference

Vincent Dumoulin

Ishmael Belghazi

Ben Poole

We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process. The generation network maps samples from stochastic latent variables to the data space while the inference network maps training examples in data space to the space of latent variables. An adversarial game is cast between these two networks and a discriminative network is trained to distinguish between joint latent/data-space samples from the generative network and joint samples from the inference network. We illustrate the ability of the model to learn mutually coherent inference and generation networks through the inspections of model samples and reconstructions and confirm the usefulness of the learned representations by obtaining a performance competitive with state-of-the-art on the semi-supervised SVHN and CIFAR10 tasks.

2017-01-01

ICLR.cc/2017/conference (poster)

openreview.net

Improved Training of Wasserstein GANs

Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.

arxiv.org

Theano: A Python framework for fast computation of mathematical expressions

Rami Al-Rfou

Guillaume Alain

Amjad Almahairi

Christof Angermüller

Dzmitry Bahdanau

Nicolas Ballas

Frédéric Bastien

Justin S. Bayer

A. Belikov

A. Belopolsky

J. Bergstra

Josh Bleecher Snyder

Nicolas Boulanger-Lewandowski

Xavier Bouthillier

Alexandre De Brébisson

Olivier Breuleux

Paul F. Christiano

Myriam Côté

Julien Demouth

Sander Dieleman

Mélanie Ducoffe

Samira Ebrahimi Kahou

Dumitru Erhan

Ziye Fan

Orhan Firat

Mathieu Germain

Xavier Glorot

Ian J. Goodfellow

Matthew Graham

Balázs Hidasi

Arjun Jain

Sébastien Jean

Kai Jia

Mikhail V. Korobov

Vivek Kulkarni

Alex Lamb

Pascal Lamblin

Eric P. Larsen

César Laurent

S. Lee

Simon-mark Lefrancois

Simon Lemieux

Nicholas Léonard

Zhouhan Lin

J. Livezey

Cory R. Lorenz

Jeremiah L. Lowin

Qianli M. Ma

Pierre-Antoine Manzagol

R. McGibbon

Mehdi Mirza

Alberto Orlandi

Chris Pal

Razvan Pascanu

Mohammad Pezeshki

Colin Raffel

Daniel Renshaw

Matthew David Rocklin

Adriana Romero Soriano

Markus Dr. Roth

Peter Sadowski

John Salvatier

François Savard

Jan Schlüter

John D. Schulman

Gabriel Schwartz

Iulian V. Serban

Dmitriy Serdyuk

Samira Shabanian

Etienne Simon

Sigurd Spieckermann

S. Subramanyam

Jakub Sygnowski

Jeremie Tanguay

Gijs van Tulder

Joseph P. Turian

Sebastian Urban

Dustin J. Webb

M. Willson

Lijun Xue

Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.

2016-05-09

ArXiv (preprint)

arxiv.org
