Publications

On Variational Learning of Controllable Representations for Text without Supervision
Peng Xu
Jackie Chi Kit Cheung
Yanshuai Cao
The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolating or extrapolating in the continuous latent space. However, on discrete data such as text, it is unclear if unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer, and is capable of performing more flexible fine-grained control over text generation than existing methods.
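One way to read "constrain the posterior mean to a learned probability simplex" is as a convex combination of learned basis vectors. The sketch below illustrates that reading only; the names `softmax`, `simplex_mean`, `logits`, and `basis` are hypothetical, and the abstract does not confirm that the paper parameterizes the constraint this way.

```python
import math

def softmax(logits):
    """Map arbitrary real scores to non-negative weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def simplex_mean(logits, basis):
    """Return a point inside the simplex spanned by the rows of `basis`.

    Assumption: `basis` stands in for learned vertex vectors and `logits`
    for encoder outputs; both are illustrative, not taken from the paper.
    """
    weights = softmax(logits)
    dim = len(basis[0])
    return [
        sum(w * basis[k][j] for k, w in enumerate(weights))
        for j in range(dim)
    ]

# Manipulating the code then amounts to changing the weights, which keeps
# the resulting mean inside the simplex by construction.
mean = simplex_mean([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because the weights are non-negative and sum to one, any edit made to them still yields a mean within the convex hull of the basis vectors.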
A Brief Look at Generalization in Visual Meta-Reinforcement Learning
Due to the realization that deep reinforcement learning algorithms trained on high-dimensional tasks can strongly overfit to their training environments, there have been several studies that investigated the generalization performance of these algorithms. However, there has been no similar study that evaluated the generalization performance of algorithms that were specifically designed for generalization, i.e. meta-reinforcement learning algorithms. In this paper, we assess the generalization performance of these algorithms by leveraging high-dimensional, procedurally generated environments. We find that these algorithms can display strong overfitting when they are evaluated on challenging tasks. We also observe that scalability to high-dimensional tasks with sparse rewards remains a significant problem among many of the current meta-reinforcement learning algorithms. With these results, we highlight the need for developing meta-reinforcement learning algorithms that can both generalize and scale.
Chaotic Continual Learning
Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual learning, this process results in catastrophic forgetting, which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup, in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect that training data will always be fed to the model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not update network parameters. In addition, it can work with fewer samples compared to deep learning models in a stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.
Historical Issue Data of Projects on Jira
A. Nicholson
Deeksha M. Arya
Jin L.C. Guo
S2RMs: Spatially Structured Recurrent Modules
Nasim Rahaman
Muhammad Waleed Gondal
Manuel Wuthrich
Y. Sharma
Bernhard Schölkopf
Towards an Unsupervised Method for Model Selection in Few-Shot Learning
Christopher Pal
The study of generalization of neural networks in gradient-based meta-learning has recently attracted great research interest. Previous work on the study of the objective landscapes within the scope of few-shot classification empirically demonstrated that generalization to new tasks might be linked to the average inner product between their respective gradient vectors (Guiroy et al., 2019). Following that work, we study the effect that meta-training has on the learned space of representation of the network. Notably, we demonstrate that the global similarity in the space of representation, measured by the average inner product between the embeddings of meta-test examples, also correlates with generalization. Based on these observations, we propose a novel model-selection criterion for gradient-based meta-learning and experimentally validate its effectiveness.
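The similarity measure mentioned in this abstract, an average inner product between embeddings, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `average_inner_product` is hypothetical, the average is taken over distinct unordered pairs, and no normalization is applied. The abstract does not specify any of these details.

```python
from itertools import combinations

def average_inner_product(embeddings):
    """Mean dot product over all distinct pairs of embedding vectors.

    Assumption: pairs are unordered and exclude a vector paired with
    itself; the paper may define the average differently.
    """
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    total = sum(sum(a * b for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)

# Three toy 2-d embeddings; the pairwise dot products are 0, 1 and 1.
score = average_inner_product([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

A criterion of this kind needs only embeddings of unlabeled examples, which is consistent with the unsupervised framing in the title.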
Vision-Based Goal-Conditioned Policies for Underwater Navigation in the Presence of Obstacles
Travis Manderson
Juan Camilo Gamboa Higuera
Stefan Wapnick
Florian Shkurti
We present Nav2Goal, a data-efficient and end-to-end learning method for goal-conditioned visual navigation. Our technique is used to train a navigation policy that enables a robot to navigate close to sparse geographic waypoints provided by a user without any prior map, all while avoiding obstacles and choosing paths that cover user-informed regions of interest. Our approach is based on recent advances in conditional imitation learning. General-purpose, safe and informative actions are demonstrated by a human expert. The learned policy is subsequently extended to be goal-conditioned by training with hindsight relabelling, guided by the robot's relative localization system, which requires no additional manual annotation. We deployed our method on an underwater vehicle in the open ocean to collect scientifically relevant data of coral reefs, which allowed our robot to operate safely and autonomously, even at very close proximity to the coral. Our field deployments have demonstrated over a kilometer of autonomous visual navigation, where the robot reaches on the order of 40 waypoints, while collecting scientifically relevant data. This is done while travelling within 0.5 m altitude from sensitive corals and exhibiting significant learned agility to overcome turbulent ocean conditions and to actively avoid collisions.
SVRG for Policy Evaluation with Fewer Gradient Evaluations
Stochastic variance-reduced gradient (SVRG) is an optimization method originally designed for tackling machine learning problems with a finite-sum structure. SVRG was later shown to work for policy evaluation, a problem in reinforcement learning in which one aims to estimate the value function of a given policy. SVRG makes use of gradient estimates at two scales. At the slower scale, SVRG computes a full gradient over the whole dataset, which could lead to prohibitive computation costs. In this work, we show that two variants of SVRG for policy evaluation could significantly diminish the number of gradient calculations while preserving a linear convergence speed. More importantly, our theoretical result implies that one does not need to use the entire dataset in every epoch of SVRG when it is applied to policy evaluation with linear function approximation. Our experiments demonstrate large computational savings provided by the proposed methods.
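The two scales described in this abstract can be illustrated with the standard SVRG update on a small least-squares problem. This is a sketch of plain SVRG only, not of the two variants the paper proposes, and it does not address policy evaluation; the function name `svrg_least_squares` and all parameter values are hypothetical choices for illustration.

```python
import random

def svrg_least_squares(X, y, step=0.05, epochs=30, inner_steps=None, seed=0):
    """Plain SVRG on f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    inner_steps = inner_steps or n
    w = [0.0] * d

    def grad_i(weights, i):
        # Gradient of the i-th term: (x_i . w - y_i) * x_i
        residual = sum(xj * wj for xj, wj in zip(X[i], weights)) - y[i]
        return [residual * xj for xj in X[i]]

    for _ in range(epochs):
        # Slow scale: take a snapshot and compute the full gradient there.
        # This pass over all n samples is the costly step.
        snapshot = list(w)
        full = [0.0] * d
        for i in range(n):
            full = [f + g / n for f, g in zip(full, grad_i(snapshot, i))]

        # Fast scale: cheap stochastic steps, corrected by the snapshot.
        for _ in range(inner_steps):
            i = rng.randrange(n)
            g_now = grad_i(w, i)
            g_snap = grad_i(snapshot, i)
            w = [
                wj - step * (a - b + f)
                for wj, a, b, f in zip(w, g_now, g_snap, full)
            ]
    return w

# Toy data generated from the weights (2, -1) with no noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -1.0, 1.0, 3.0]
w = svrg_least_squares(X, y, epochs=500)
```

The full-gradient loop at the start of each epoch is the part whose cost the abstract describes as potentially prohibitive.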
Material for IEEE Software paper "How Do Open Source Software Contributors Perceive and Address Usability?"
Wenting Wang
Jinghui Cheng
Jin L.C. Guo
Attenuated Anticipation of Social and Monetary Rewards in Autism Spectrum Disorders
Sarah Baumeister
Carolin Moessnang
Nico Bast
Sarah Hohmann
Julian Tillmann
David Goyard
Tony Charman
Sara Ambrosino
Simon Baron-Cohen
Christian Beckmann
Sven Bölte
Thomas Bourgeron
Annika Rausch
Daisy Crawley
Flavio Dell’Acqua
Sarah Durston
Christine Ecker
Dorothea L. Floris
Vincent Frouin
Hannah Hayward
Rosemary Holt
Mark Johnson
Emily J. H. Jones
Meng-Chuan Lai
Michael V. Lombardo
Luke Mason
Marianne Oldehinkel
Tony Persico
Antonia San José Cáceres
Thomas Wolfers
Will Spooren
Eva Loth
Declan Murphy
Jan K. Buitelaar
Heike Tost
Andreas Meyer-Lindenberg
Tobias Banaschewski
Daniel Brandeis
Background: Reward processing has been proposed to underpin atypical social behavior, a core feature of autism spectrum disorder (ASD). However, previous neuroimaging studies have yielded inconsistent results regarding the specificity of atypicalities for social rewards in ASD. Utilizing a large sample, we aimed to assess altered reward processing in response to reward type (social, monetary) and reward phase (anticipation, delivery) in ASD. Methods: Functional magnetic resonance imaging during social and monetary reward anticipation and delivery was performed in 212 individuals with ASD (7.6-30.5 years) and 181 typically developing (TD) participants (7.6-30.8 years). Results: Across social and monetary reward anticipation, whole-brain analyses (p < 0.05, family-wise error-corrected) showed hypoactivation of the right ventral striatum (VS) in ASD. Further, region of interest (ROI) analy
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs
Daniel Tarlow
Hehuimin Cheng
Jin L.C. Guo
We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such a representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinct properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, they also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.
Compositional Generalization by Factorizing Alignment and Translation
Jacob Russin
R. O’Reilly