Generating Contradictory, Neutral, and Entailing Sentences
Yikang Shen
Shawn Tan
Chin-Wei Huang
Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP). We want to l… (see more)earn a model that approximates the conditional latent space over the representations of a logical antecedent of the given statement. In our paper, we propose an approach to generating sentences, conditioned on an input sentence and a logical inference label. We do this by modeling the different possibilities for the output sentence as a distribution over the latent representation, which we train using an adversarial objective. We evaluate the model using two state-of-the-art models for the Recognizing Textual Entailment (RTE) task, and measure the BLEU scores against the actual sentences as a probe for the diversity of sentences produced by our model. The experiment results show that, given our framework, we have clear ways to improve the quality and diversity of generated sentences.
A polynomial algorithm for a continuous bilevel knapsack problem
Andrea Lodi
Patrice Marcotte
Learning Anonymized Representations with Adversarial Neural Networks
Clément Feutry
P. Duhamel
Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensure privacy of individuals… (see more) as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, and motivated by novel information theoretical bounds. We introduce a novel training objective for simultaneously training a predictor over target variables of interest (the regular labels) while preventing an intermediate representation to be predictive of the private labels. The architecture is based on three sub-networks: one going from input to representation, one from representation to predicted regular labels, and one from representation to predicted private labels. The training procedure aims at learning representations that preserve the relevant part of the information (about regular labels) while dismissing information about the private labels which correspond to the identity of a person. We demonstrate the success of this approach for two distinct classification versus anonymization tasks (handwritten digits and sentiment analysis).
Generalization in Machine Learning via Analytical Learning Theory
Kenji Kawaguchi
This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions. Based on this the… (see more)ory, a new regularization method in deep learning is derived and shown to outperform previous methods in CIFAR-10, CIFAR-100, and SVHN. Moreover, the proposed theory provides a theoretical basis for a family of practically successful regularization methods in deep learning. We discuss several consequences of our results on one-shot learning, representation learning, deep learning, and curriculum learning. Unlike statistical learning theory, the proposed learning theory analyzes each problem instance individually via measure theory, rather than a set of problem instances via statistics. As a result, it provides different types of results and insights when compared to statistical learning theory.
Towards Understanding Generalization via Analytical Learning Theory
Kenji Kawaguchi
This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions. Based on this the… (see more)ory, a new regularization method in deep learning is derived and shown to outperform previous methods in CIFAR-10, CIFAR-100, and SVHN. Moreover, the proposed theory provides a theoretical basis for a family of practically successful regularization methods in deep learning. We discuss several consequences of our results on one-shot learning, representation learning, deep learning, and curriculum learning. Unlike statistical learning theory, the proposed learning theory analyzes each problem instance individually via measure theory, rather than a set of problem instances via statistics. As a result, it provides different types of results and insights when compared to statistical learning theory.
Boundary Seeking GANs
Athul Jacob
Adam Trischler
Gerry Che
Kyunghyun Cho
Boundary Seeking GANs
Athul Jacob
Adam Trischler
Gerry Che
Kyunghyun Cho
Generative adversarial networks are a learning framework that rely on training a discriminator to estimate a measure of difference between a… (see more) target and generated distributions. GANs, as normally formulated, rely on the generated samples being completely differentiable w.r.t. the generative parameters, and thus do not work for discrete data. We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator. The importance weights have a strong connection to the decision boundary of the discriminator, and we call our method boundary-seeking GANs (BGANs). We demonstrate the effectiveness of the proposed algorithm with discrete image and character-based natural language generation. In addition, the boundary-seeking objective extends to continuous data, which can be used to improve stability of training, and we demonstrate this on Celeba, Large-scale Scene Understanding (LSUN) bedrooms, and Imagenet without conditioning.
Combining Model-based and Model-free RL via Multi-step Control Variates
Tong Che
Yuchen Lu
George Tucker
Surya Bhupatiraju
Shane Gu
Sergey Levine
Existence of Nash Equilibria on Integer Programming Games
Andrea Lodi
João Pedro Pedroso
Learning Generative Models with Locally Disentangled Latent Factors
One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler gener… (see more)ation tasks. For example, generating an image at a low resolution and then learning to refine that into a high resolution image often improves results substantially. Here we explore a novel strategy for decomposing generation for complicated objects in which we first generate latent variables which describe a subset of the observed variables, and then map from these latent variables to the observed space. We show that this allows us to achieve decoupled training of complicated generative models and present both theoretical and experimental results supporting the benefit of such an approach.
Online Hyper-Parameter Optimization
Damien Vincent
Sylvain Gelly
Olivier Bousquet
Finding Flatter Minima with SGD
Stanisław Jastrzębski
Zac Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Amos Storkey