S2RMs: Spatially Structured Recurrent Modules
Nasim Rahaman
Anirudh Goyal
Muhammad Waleed Gondal
Manuel Wüthrich
Stefan Bauer
Y. Sharma
Bernhard Schölkopf
Material for IEEE Software paper "How Do Open Source Software Contributors Perceive and Address Usability?"
Wenting Wang
Jinghui Cheng
Attenuated Anticipation of Social and Monetary Rewards in Autism Spectrum Disorders
Sarah Baumeister
Carolin Moessnang
Nico Bast
Sarah Hohmann
Julian Tillmann
David Goyard
Tony Charman
Sara Ambrosino
Simon Baron-Cohen
Christian Beckmann
Sven Bölte
Thomas Bourgeron
Annika Rausch
Daisy Crawley
Flavio Dell’Acqua
Sarah Durston
Christine Ecker
Dorothea L. Floris
Vincent Frouin
Hannah Hayward
Rosemary Holt
Mark Johnson
Emily J. H. Jones
Meng-Chuan Lai
Michael V. Lombardo
Luke Mason
Marianne Oldehinkel
Tony Persico
Antonia San José Cáceres
Thomas Wolfers
Will Spooren
Eva Loth
Declan Murphy
Jan K. Buitelaar
Heike Tost
Andreas Meyer-Lindenberg
Tobias Banaschewski
Daniel Brandeis
Background: Reward processing has been proposed to underpin atypical social behavior, a core feature of autism spectrum disorder (ASD). However, previous neuroimaging studies have yielded inconsistent results regarding the specificity of atypicalities for social rewards in ASD. Utilizing a large sample, we aimed to assess altered reward processing in response to reward type (social, monetary) and reward phase (anticipation, delivery) in ASD. Methods: Functional magnetic resonance imaging during social and monetary reward anticipation and delivery was performed in 212 individuals with ASD (7.6-30.5 years) and 181 typically developing (TD) participants (7.6-30.8 years). Results: Across social and monetary reward anticipation, whole-brain analyses (p < 0.05, family-wise error-corrected) showed hypoactivation of the right ventral striatum (VS) in ASD. Further, region of interest (ROI) analy
Deep interpretability for GWAS
Deepak Sharma
Marc-André Legault
Louis-Philippe Lemieux Perreault
Audrey Lemaccon
Marie-Pierre Dubé
Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases. In these studies, association testing is done on a variant-by-variant basis, possibly missing out on non-linear interaction effects between variants. Deep networks can be used to model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient-based deep interpretability technique named DeepLIFT to show that known diabetes genetic risk factors can be identified using deep models along with possibly novel associations.
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs
Kian Ahrabian
Daniel Tarlow
Hehuimin Cheng
We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such a representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, they also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.
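As a rough illustration of what "relative temporal information with regard to past events" could look like, the sketch below attaches to each event the time elapsed since the same subject last appeared. The event tuple layout, the function name `relative_time_features`, and the choice of "time since the subject's previous event" are assumptions made for this example; the abstract does not specify how the authors encode relative time.

```python
def relative_time_features(events):
    """Annotate events with the time elapsed since the subject's previous event.

    `events` is an iterable of (subject, relation, obj, timestamp) tuples.
    This layout is hypothetical and chosen only for illustration.
    Returns a list of (subject, relation, obj, timestamp, delta) tuples in
    chronological order, where delta is None for a subject's first event.
    """
    last_seen = {}  # subject -> timestamp of its most recent event
    annotated = []
    # sorted() is stable, so events with equal timestamps keep their input order.
    for subject, relation, obj, t in sorted(events, key=lambda e: e[3]):
        prev = last_seen.get(subject)
        delta = None if prev is None else t - prev
        annotated.append((subject, relation, obj, t, delta))
        last_seen[subject] = t
    return annotated


# Toy GitHub-like events (made-up identifiers, timestamps in days).
toy_events = [
    ("user_a", "opened", "issue_1", 1),
    ("user_b", "commented", "issue_1", 3),
    ("user_a", "closed", "issue_1", 7),
]
features = relative_time_features(toy_events)
```

A model could consume `delta` as an extra input next to the absolute timestamp; whether that matches the paper's extension is not stated in the abstract.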
Compositional Generalization by Factorizing Alignment and Translation
Jacob Russin
Jason Jo
R. O’Reilly
Counterexamples on the Monotonicity of Delay Optimal Strategies for Energy Harvesting Transmitters
Borna Sayedana
We consider cross-layer design of delay optimal transmission strategies for energy harvesting transmitters where the data and energy arrival processes are stochastic. Using Markov decision theory, we show that the value function is weakly increasing in the queue state and weakly decreasing in the battery state. It is natural to expect that the delay optimal policy should be weakly increasing in the queue and battery states. We show via counterexamples that this is not the case. In fact, we show that for some sample scenarios the delay optimal policy may perform 5–13% better than the best monotone policy.
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
Wenyu Du
Zhouhan Lin
Yikang Shen
Yue Sara Zhang
It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition
Assya Trofimov
Joseph Paul Cohen
Claude Perreault
Handling Black Swan Events in Deep Learning with Diversely Extrapolated Neural Networks
Maxime Wabartha
Vincent Francois-Lavet
By virtue of their expressive power, neural networks (NNs) are well suited to fitting large, complex datasets, yet they are also known to produce similar predictions for points outside the training distribution. As such, they are, like humans, under the influence of the Black Swan theory: models tend to be extremely "surprised" by rare events, leading to potentially disastrous consequences, while justifying these same events in hindsight. To avoid this pitfall, we introduce DENN, an ensemble approach building a set of Diversely Extrapolated Neural Networks that fits the training data and is able to generalize more diversely when extrapolating to novel data points. This leads DENN to output highly uncertain predictions for unexpected inputs. We achieve this by adding a diversity term in the loss function used to train the model, computed at specific inputs. We first illustrate the usefulness of the method on a low-dimensional regression problem. Then, we show how the loss can be adapted to tackle anomaly detection during classification, as well as safe imitation learning problems.
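The idea of "adding a diversity term in the loss function, computed at specific inputs" can be sketched as follows. This is a minimal illustration, not the DENN loss itself: the function name `diversity_regularized_loss`, the use of mean squared error for the fit term, the use of ensemble variance as the diversity measure, and the `weight` parameter are all assumptions made for this example.

```python
from statistics import pvariance


def diversity_regularized_loss(train_preds, targets, probe_preds, weight=1.0):
    """Fit term minus a weighted diversity term (illustrative only).

    train_preds[m][i]: prediction of ensemble member m on training point i.
    targets[i]:        regression target for training point i.
    probe_preds[m][j]: prediction of member m on probe input j, i.e. an input
                       chosen away from the training data (assumed given).
    """
    n_models = len(train_preds)
    n_train = len(targets)
    # Mean squared error averaged over all members and training points.
    fit = sum(
        (p - y) ** 2
        for member in train_preds
        for p, y in zip(member, targets)
    ) / (n_models * n_train)
    # Average over probe inputs of the variance across ensemble members.
    n_probe = len(probe_preds[0])
    diversity = sum(
        pvariance([member[j] for member in probe_preds])
        for j in range(n_probe)
    ) / n_probe
    # Subtracting the diversity term rewards disagreement on probe inputs.
    return fit - weight * diversity


# Two members that fit the training data perfectly.
train_preds = [[1.0, 2.0], [1.0, 2.0]]
targets = [1.0, 2.0]
loss_agree = diversity_regularized_loss(train_preds, targets, [[0.0], [0.0]])
loss_disagree = diversity_regularized_loss(train_preds, targets, [[-1.0], [1.0]])
```

With an identical fit on the training data, the ensemble whose members disagree on the probe input receives the lower loss, which is the qualitative behavior the abstract describes.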
On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)
Vincent Francois-Lavet
Damien Ernst
Raphael Fonteneau
When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.
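The two-term decomposition described above can be written as a telescoping identity. The notation is assumed for illustration and is not taken from the paper: \pi^* denotes an optimal policy, \pi_{D_\infty} the policy obtained with unlimited data under the chosen state representation, \pi_D the policy obtained from a finite dataset D, and V^{\pi} the expected return of policy \pi.

```latex
% Requires amsmath for \text.
\[
\underbrace{V^{\pi^*} - V^{\pi_{D}}}_{\text{suboptimality}}
=
\underbrace{\left( V^{\pi^*} - V^{\pi_{D_\infty}} \right)}_{\text{asymptotic bias}}
+
\underbrace{\left( V^{\pi_{D_\infty}} - V^{\pi_{D}} \right)}_{\text{overfitting}}
\]
```

Under this reading, a smaller state representation tends to enlarge the first term and shrink the second, which is the tradeoff the abstract refers to.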
Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions
Arjun Reddy Akula
Spandana Gella
Yaser Al-Onaizan
Song-Chun Zhu
Visual referring expression recognition is a challenging task that requires natural language understanding in the context of an image. We critically examine RefCOCOg, a standard benchmark for this task, using a human study and show that 83.7% of test instances do not require reasoning on linguistic structure, i.e., words are enough to identify the target object, the word order doesn’t matter. To measure the true progress of existing models, we split the test set into two sets, one which requires reasoning on linguistic structure and the other which doesn’t. Additionally, we create an out-of-distribution dataset Ref-Adv by asking crowdworkers to perturb in-domain examples such that the target object changes. Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. We also propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT, the current state-of-the-art model for this task. Our datasets are publicly available at https://github.com/aws/aws-refcocog-adv.