Publications

On Adversarial Mixup Resynthesis

R Devon Hjelm

Christopher Pal

In this paper, we explore new approaches to combining information encoded within the learned representations of auto-encoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of semi-supervised learning, where we learn a mixing function whose objective is to produce interpolations of hidden states, or masked combinations of latent representations that are consistent with a conditioned class label. We show quantitative and qualitative evidence that such a formulation is an interesting avenue of research.

2018-12-31

NeurIPS (published)

dblp.uni-trier.de
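
The abstract of "On Adversarial Mixup Resynthesis" mentions two ways of combining latent codes: interpolations of hidden states and masked combinations. The following is a minimal NumPy sketch of those two mixing operations only. The function names, the latent size of 8, and the Bernoulli(0.5) mask are illustrative assumptions, not the authors' implementation; the encoder, decoder, and discriminator are omitted, and random vectors stand in for encoder outputs.

```python
import numpy as np

def mix_interpolate(h1, h2, alpha):
    """Convex combination of two latent codes: alpha*h1 + (1-alpha)*h2."""
    return alpha * h1 + (1.0 - alpha) * h2

def mix_mask(h1, h2, mask):
    """Per-dimension selection: take h1 where mask is 1, h2 where mask is 0."""
    return mask * h1 + (1.0 - mask) * h2

rng = np.random.default_rng(0)
h1 = rng.normal(size=8)  # stand-in for the latent code of a first input (assumption)
h2 = rng.normal(size=8)  # stand-in for the latent code of a second input (assumption)

alpha = rng.uniform()                              # mixing coefficient in [0, 1)
mask = rng.integers(0, 2, size=8).astype(float)    # hypothetical Bernoulli(0.5) mask

h_interp = mix_interpolate(h1, h2, alpha)
h_masked = mix_mask(h1, h2, mask)
```

In a full model, the mixed code would be decoded and the result scored by the discriminator; that part is outside the scope of this sketch.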

Adversarial Mixup Resynthesizers

R Devon Hjelm

Christopher Pal

In this paper, we explore new approaches to combining information encoded within the learned representations of autoencoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of semi-supervised learning, where we learn a mixing function whose objective is to produce interpolations of hidden states, or masked combinations of latent representations that are consistent with a conditioned class label. We show quantitative and qualitative evidence that such a formulation is an interesting avenue of research.

2018-12-31

DGS@ICLR (published)

Artificial Intelligence Cytometer in Blood

Yoshua Bengio

Geoffrey Hinton

2018-12-31

(published)

www.semanticscholar.org

Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning

Thang Doan

Bogdan Mazoure

Audrey Durand

Joelle Pineau

R Devon Hjelm

Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions. One way to avoid local optima is to use a population of agents to ensure coverage of the policy space, yet learning a population with the "best" coverage is still an open problem. In this work, we present a novel approach to population-based RL in continuous control that leverages properties of normalizing flows to perform attractive and repulsive operations between current members of the population and previously observed policies. Empirical results on the MuJoCo suite demonstrate a high performance gain for our algorithm compared to prior work, including Soft Actor-Critic (SAC).

2018-12-31

arXiv (preprint)

Avoidance Learning Using Observational Reinforcement Learning

David Venuto

Léonard Boussioux

Junhao Wang

Rola Dali

Jhelum Chakravorty

Yoshua Bengio

Imitation learning seeks to learn an expert policy from sampled demonstrations. However, in the real world, it is often difficult to find a perfect expert and avoiding dangerous behaviors becomes relevant for safety reasons. We present the idea of learning to avoid, an objective opposite to imitation learning in some sense, where an agent learns to avoid a demonstrator policy given an environment. We define avoidance learning as the process of optimizing the agent's reward while avoiding dangerous behaviors given by a demonstrator. In this work we develop a framework of avoidance learning by defining a suitable objective function for these problems which involves the distance of state occupancy distributions of the expert and demonstrator policies. We use density estimates for state occupancy measures and use the aforementioned distance as the reward bonus for avoiding the demonstrator. We validate our theory with experiments using a wide range of partially observable environments. Experimental results show that we are able to improve sample efficiency during training compared to state-of-the-art policy optimization and safety methods.

2018-12-31

arXiv (preprint)

arxiv.org

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

Maxime Chevalier-Boisvert

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

2018-12-31

ICLR.cc/2019/Conference (poster)

Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks

Sitao Luan

Mingde Zhao

Xiao-Wen Chang

Recently, neural network based approaches have achieved significant improvement for solving large, complex, graph-structured problems. However, their bottlenecks still need to be addressed, and the advantages of multi-scale information and deep architectures have not been sufficiently exploited. In this paper, we theoretically analyze how existing Graph Convolutional Networks (GCNs) have limited expressive power due to the constraint of the activation functions and their architectures. We generalize spectral graph convolution and deep GCN in block Krylov subspace forms and devise two architectures, both with the potential to be scaled deeper but each making use of the multi-scale information in different ways. We further show that the equivalence of these two architectures can be established under certain conditions. On several node classification tasks, with or without the help of validation, the two new architectures achieve better performance compared to many state-of-the-art methods.

2018-12-31

NeurIPS (published)

arxiv.org

Community size effect in artificial learning systems

Olivier Tieleman

Angeliki Lazaridou

Shibl Mourad

Charles Blundell

Motivated by theories of language and communication that explain why communities with large numbers of speakers have, on average, simpler languages with more regularity, we cast the representation learning problem in terms of learning to communicate. Our starting point sees the traditional autoencoder setup as a single encoder with a fixed decoder partner that must learn to communicate. Generalizing from there, we introduce community-based autoencoders in which multiple encoders and decoders collectively learn representations by being randomly paired up on successive training iterations. We find that increasing community sizes reduce idiosyncrasies in the learned codes, resulting in representations that better encode concept categories and correlate with human feature norms.

2018-12-31

ViGIL@NeurIPS (published)

dblp.uni-trier.de

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning (Supplementary Material) A Proofs

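More precisely, the WFA $A = (\alpha, \{A^{\sigma}\}_{\sigma \in \Sigma}, \Omega)$ with $n$ states and the linear 2-RNN $M = (\alpha, \mathcal{A}, \Omega)$ with $n$ hidden units, where $\mathcal{A} \in \mathbb{R}^{n \times \Sigma \times n}$ is defined by $\mathcal{A}_{:,\sigma,:} = A^{\sigma}$ for all $\sigma \in \Sigma$, are such that $f_A(\sigma_1 \sigma_2 \cdots \sigma_k) = f_M(x_1, x_2, \cdots, x_k)$ for all sequences of input symbols $\sigma_1, \cdots, \sigma_k \in \Sigma$, where for each $i \in [k]$ the input vector $x_i \in \mathbb{R}^{\Sigma}$ is the one-hot encoding of the symbol $\sigma_i$. Proof. We first show by induction on $k$ that, for any sequence $\sigma_1 \cdots \sigma_k \in \Sigma^*$, the hidden state $h_k$ computed by $M$ (see Eq. (1)) on the corresponding one-hot encoded sequence $x_1, \cdots, x_k \in \mathbb{R}^{\Sigma}$ satisfies $h_k = (A^{\sigma_1} \cdots A^{\sigma_k})^\top \alpha$. The case $k = 0$ is immediate. Suppose the result true for sequences of length up to $k$. One can easily check that $\mathcal{A} \bullet_2 x_i = A^{\sigma_i}$ for any index $i$. Using the induction hypothesis it then follows that $h_{k+1} = \mathcal{A} \bullet_1 h_k \bullet_2 x_{k+1} = A^{\sigma_{k+1}} \bullet_1 h_k = (A^{\sigma_{k+1}})^\top h_k = (A^{\sigma_{k+1}})^\top (A^{\sigma_1} \cdots A^{\sigma_k})^\top \alpha = (A^{\sigma_1} \cdots A^{\sigma_{k+1}})^\top \alpha$.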

2018-12-31

(published)

www.semanticscholar.org
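
The equivalence stated in the excerpt above can be checked numerically on a small random instance. The sketch below is illustrative only: the sizes (3 states, 2 symbols), the random parameters, and the names `wfa` and `linear_2rnn` are assumptions, and the final output of both models is taken to be the inner product of the last state with $\Omega$ (scalar output).

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_symbols = 3, 2                        # n states, |Sigma| symbols (arbitrary)

alpha = rng.normal(size=n)                   # initial weight vector
A = rng.normal(size=(num_symbols, n, n))     # A[s] plays the role of A^sigma
omega = rng.normal(size=n)                   # final weight vector

# Third-order tensor of the linear 2-RNN, with T[:, s, :] = A^sigma.
T = np.transpose(A, (1, 0, 2))               # shape (n, |Sigma|, n)

def wfa(word):
    """alpha^T A^{s_1} ... A^{s_k} omega for a word given as symbol indices."""
    v = alpha
    for s in word:
        v = v @ A[s]
    return v @ omega

def linear_2rnn(word):
    """Bilinear recurrence h_{t+1} = T x_1 h_t x_2 x_{t+1} on one-hot inputs."""
    h = alpha
    for s in word:
        x = np.eye(num_symbols)[s]           # one-hot encoding of symbol s
        h = np.einsum("i,isj,s->j", h, T, x) # contract mode 1 with h, mode 2 with x
    return h @ omega

for word in [[], [0], [1, 0], [0, 1, 1, 0]]:
    print(word, np.isclose(wfa(word), linear_2rnn(word)))
```

The `einsum` call contracts the first mode of the tensor with the hidden state and the second mode with the one-hot input, which is the step the induction argument relies on.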

Data-driven Chance Constrained Programming based Electric Vehicle Penetration Analysis

Di Wu

Tracy Can Cui

Benoit Boulet

Transportation electrification has been growing rapidly in recent years. The adoption of electric vehicles (EVs) could help reduce dependence on oil and lower greenhouse gas emissions. However, increasing EV adoption will also impose a high demand on the power grid and may jeopardize the grid network infrastructure. In certain high EV penetration areas, the EV charging demand may lead to transformer overloading at peak hours, which makes maximal EV penetration analysis an urgent problem to solve. This paper proposes a data-driven chance constrained programming based framework for maximal EV penetration analysis. Simulation results are presented for a real-world neighborhood level network. The proposed framework could serve as guidance for utility companies to schedule infrastructure upgrades.

2018-12-31

(published)

www.semanticscholar.org

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation

Vincent Michalski

Vikram Voleti

Samira Ebrahimi Kahou

Anthony Ortiz

Pascal Vincent

Chris Pal

Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization in certain tasks. Recently, alternative methods for normalizing feature activations in neural networks have been proposed. Among them, group normalization has been shown to yield similar, in some domains even superior performance to batch normalization. All these methods utilize a learned affine transformation after the normalization operation to increase representational power. Methods used in conditional computation define the parameters of these transformations as learnable functions of conditioning information. In this work, we study whether and where the conditional formulation of group normalization can improve generalization compared to conditional batch normalization. We evaluate performances on the tasks of visual question answering, few-shot learning, and conditional image generation.

2018-12-31

arXiv (preprint)
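
The abstract of "An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation" describes an affine transformation applied after normalization whose parameters are functions of conditioning information. A minimal NumPy sketch of that idea for group normalization might look as follows. The linear maps `W_gamma` and `W_beta`, the `1 + ...` parameterization of the scale, and all tensor sizes are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Normalize each group of channels per sample; x has shape (N, C, H, W)."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)

def conditional_group_norm(x, cond, W_gamma, W_beta, num_groups):
    """Group norm followed by a per-channel affine map predicted from cond."""
    gamma = 1.0 + cond @ W_gamma   # (N, C) scale; hypothetical linear predictor
    beta = cond @ W_beta           # (N, C) shift; hypothetical linear predictor
    x_hat = group_norm(x, num_groups)
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 3, 3))        # batch of 2, 4 channels, 3x3 feature maps
cond = rng.normal(size=(2, 5))           # conditioning vector per sample
W_gamma = 0.1 * rng.normal(size=(5, 4))
W_beta = 0.1 * rng.normal(size=(5, 4))

y = conditional_group_norm(x, cond, W_gamma, W_beta, num_groups=2)
```

Because the statistics are computed per sample and per group, the output for one sample does not depend on the rest of the batch, which is the property that distinguishes this from the batch-normalized variant.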