Structured deep Fisher pruning for efficient facial trait classification
Qing Tian
James J. Clark
Approximate Exploration through State Abstraction
Although exploration in reinforcement learning is well understood from a theoretical point of view, provably correct methods remain impractical. In this paper we study the interplay between exploration and approximation, what we call approximate exploration. Our main goal is to further our theoretical understanding of pseudo-count based exploration bonuses (Bellemare et al., 2016), a practical exploration scheme based on density modelling. As a warm-up, we quantify the performance of an exploration algorithm, MBIE-EB (Strehl and Littman, 2008), when explicitly combined with state aggregation. This allows us to confirm that, as might be expected, approximation allows the agent to trade off between learning speed and quality of the learned policy. Next, we show how a given density model can be related to an abstraction and that the corresponding pseudo-count bonus can act as a substitute in MBIE-EB combined with this abstraction, but may lead to either under- or over-exploration. Then, we show that a given density model also defines an implicit abstraction, and find a surprising mismatch between pseudo-counts derived either implicitly or explicitly. Finally we derive a new pseudo-count bonus alleviating this issue.
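As an illustration of the pseudo-count idea mentioned above, here is a minimal sketch, not the paper's method or its new bonus. It uses the pseudo-count definition of Bellemare et al. (2016), N(x) = rho(x) (1 - rho'(x)) / (rho'(x) - rho(x)), where rho is the model's density before observing x and rho' its density after one more observation of x. Everything else is an assumption made for the example: the density model is a plain empirical frequency over aggregated states, `phi` is a hypothetical aggregation function, and `beta` and `eps` are hypothetical constants in an MBIE-EB-style bonus of the form beta / sqrt(N).

```python
import math
from collections import Counter


class EmpiricalDensity:
    """Assumed density model: empirical frequencies over abstract states."""

    def __init__(self, phi):
        self.phi = phi            # hypothetical state-aggregation function
        self.counts = Counter()   # visits per abstract state
        self.total = 0            # total number of observations

    def prob(self, state):
        # Density assigned to `state` before observing it again.
        if self.total == 0:
            return 0.0
        return self.counts[self.phi(state)] / self.total

    def recoding_prob(self, state):
        # Density the model would assign after one more observation of `state`.
        return (self.counts[self.phi(state)] + 1) / (self.total + 1)

    def update(self, state):
        self.counts[self.phi(state)] += 1
        self.total += 1


def pseudo_count(model, state):
    """Pseudo-count in the sense of Bellemare et al. (2016)."""
    rho = model.prob(state)
    rho_next = model.recoding_prob(state)
    if rho_next <= rho:
        # The formula is undefined when an update would not change the density;
        # treating the state as fully explored is a choice made for this sketch.
        return math.inf
    return rho * (1.0 - rho_next) / (rho_next - rho)


def exploration_bonus(model, state, beta=1.0, eps=0.01):
    # MBIE-EB-style bonus; `eps` (an assumption) avoids division by zero.
    return beta / math.sqrt(pseudo_count(model, state) + eps)


if __name__ == "__main__":
    model = EmpiricalDensity(phi=lambda s: s // 10)  # 10 ground states per cell
    for s in [1, 2, 3, 15, 27, 38, 49, 51, 62, 73]:
        model.update(s)
    print(pseudo_count(model, 7), exploration_bonus(model, 7))
    print(pseudo_count(model, 95), exploration_bonus(model, 95))
```

For this empirical model the pseudo-count of a state equals the visit count of its abstract cell, so states that share a cell also share a bonus. A learned density model need not agree with any explicit count in this way, which is the kind of discrepancy the abstract refers to.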
Domain Knowledge Discovery Guided by Software Trace Links
Natawut Monaikul
Jane Cleland-Huang
Software-intensive projects are specified and modeled using domain terminology. Knowledge of the domain terminology is necessary for performing many Software Engineering tasks such as impact analysis, compliance verification, and safety certification. However, discovering domain terms and reasoning about their interrelationships in highly technical software and system engineering domains is a complex task that requires significant domain expertise and human effort. In this paper, we present a novel approach for leveraging trace links in software-intensive systems to guide the process of mining facts that contain domain knowledge. The trace links that drive our mining process define relationships between artifacts such as regulations and requirements, and enable a guided search through high-yield combinations of domain terms. Our proof-of-concept evaluation shows that our approach aids in the discovery of domain facts even in highly complex technical domains. These domain facts can provide support for a variety of Software Engineering activities. As a use case, we demonstrate how the mined facts can facilitate the task of project Q&A.
The Deconfounded Recommender: A Causal Inference Approach to Recommendation
Yixin Wang
Dawen Liang
David Blei
The goal of a recommender system is to show its users items that they will like. In forming its prediction, the recommender system tries to answer: "what would the rating be if we 'forced' the user to watch the movie?" This is a question about an intervention in the world, a causal question, and so traditional recommender systems are doing causal inference from observational data. This paper develops a causal inference approach to recommendation. Traditional recommenders are likely biased by unobserved confounders, variables that affect both the "treatment assignments" (which movies the users watch) and the "outcomes" (how they rate them). We develop the deconfounded recommender, a strategy to leverage classical recommendation models for causal predictions. The deconfounded recommender uses Poisson factorization on which movies users watched to infer latent confounders in the data; it then augments common recommendation models to correct for potential confounding bias. The deconfounded recommender improves recommendation and it enjoys stable performance against interventions on test sets.
Generalization of Equilibrium Propagation to Vector Field Dynamics
Benjamin Scellier
Anirudh Goyal
Jonathan Binas
Thomas Mesnard
The biological plausibility of the backpropagation algorithm has long been doubted by neuroscientists. Two major reasons are that neurons would need to send two different types of signal in the forward and backward phases, and that pairs of neurons would need to communicate through symmetric bidirectional connections. We present a simple two-phase learning procedure for fixed point recurrent networks that addresses both these issues. In our model, neurons perform leaky integration and synaptic weights are updated through a local mechanism. Our learning method generalizes Equilibrium Propagation to vector field dynamics, relaxing the requirement of an energy function. As a consequence of this generalization, the algorithm does not compute the true gradient of the objective function, but rather approximates it at a precision which is proven to be directly related to the degree of symmetry of the feedforward and feedback weights. We show experimentally that our algorithm optimizes the objective function.
Irrelevance by inhibition: Learning, computation, and implications for schizophrenia
Nathan Insel
Jordan Guerguiev
Symptoms of schizophrenia may arise from a failure of cortical circuits to filter out irrelevant inputs. Schizophrenia has also been linked to disruptions in cortical inhibitory interneurons, consistent with the possibility that in the normally functioning brain, these cells are in some part responsible for determining which sensory inputs are relevant versus irrelevant. Here, we develop a neural network model that demonstrates how the cortex may learn to ignore irrelevant inputs through plasticity processes affecting inhibition. The model is based on the proposal that the amount of excitatory output from a cortical circuit encodes the expected magnitude of reward or punishment (“relevance”), which can be trained using a temporal difference learning mechanism acting on feedforward inputs to inhibitory interneurons. In the model, irrelevant and blocked stimuli drive lower levels of excitatory activity compared with novel and relevant stimuli, and this difference in activity levels is lost following disruptions to inhibitory units. When excitatory units are connected to a competitive-learning output layer with a threshold, the relevance code can be shown to “gate” both learning and behavioral responses to irrelevant stimuli. Accordingly, the combined network is capable of recapitulating published experimental data linking inhibition in frontal cortex with fear learning and expression. Finally, the model demonstrates how relevance learning can take place in parallel with other types of learning, through plasticity rules involving inhibitory and excitatory components, respectively. Altogether, this work offers a theory of how the cortex learns to selectively inhibit inputs, providing insight into how relevance-assignment problems may emerge in schizophrenia.
Relevance learning via inhibitory plasticity and its implications for schizophrenia
Nathan Insel
Jordan Guerguiev
Symptoms of schizophrenia may arise from a failure of cortical circuits to filter out irrelevant inputs. Schizophrenia has also been linked to disruptions to cortical inhibitory interneurons, consistent with the possibility that in the normally functioning brain, these cells are in some part responsible for determining which inputs are relevant and which are irrelevant. Here, we develop an abstract but biologically plausible neural network model that demonstrates how the cortex may learn to ignore irrelevant inputs through plasticity processes affecting inhibition. The model is based on the proposal that the amount of excitatory output from a cortical circuit encodes the expected magnitude of reward or punishment (“relevance”), which can be trained using a temporal difference learning mechanism acting on feed-forward inputs to inhibitory interneurons. The model exhibits learned irrelevance and blocking, which become impaired following disruptions to inhibitory units. When excitatory units are connected to a competitive-learning output layer, the relevance code is capable of modulating learning and activity. Accordingly, the combined network is capable of recapitulating published experimental data linking inhibition in frontal cortex with fear learning and expression. Finally, the model demonstrates how relevance learning can take place in parallel with other types of learning, through plasticity rules involving inhibitory and excitatory components, respectively. Altogether, this work offers a theory of how the cortex learns to selectively inhibit inputs, providing insight into how relevance-assignment problems may emerge in schizophrenia.
Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning
Eric Larsen
Sébastien Lachapelle
Andrea Lodi
The paper provides a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict solution summaries (i.e., solution descriptions at a given level of detail) to discrete stochastic optimization problems. We approximate the solutions based on supervised learning and the training dataset consists of a large number of deterministic problems that have been solved independently and offline. Uncertainty regarding a missing subset of the inputs is addressed through sampling and aggregation methods. Our motivating application concerns booking decisions of intermodal containers on double-stack trains. Under perfect information, this is the so-called load planning problem and it can be formulated by means of integer linear programming. However, the formulation cannot be used for the application at hand because of the restricted computational budget and unknown container weights. The results show that standard deep learning algorithms allow one to predict descriptions of solutions with high accuracy in very short time (milliseconds or less).
Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information
Eric P. Larsen
Sébastien Lachapelle
Andrea Lodi
This paper offers a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict expected tactical descriptions of operational solutions (TDOSs). The problem we address occurs in the context of two-stage stochastic programming, where the second stage is computationally demanding. We aim to predict at high speed the expected TDOS associated with the second-stage problem, conditional on the first-stage variables. This may be used in support of the solution to the overall two-stage problem by avoiding the online generation of multiple second-stage scenarios and solutions. We formulate the tactical prediction problem as a stochastic optimal prediction program, whose solution we approximate with supervised machine learning. The training data set consists of a large number of deterministic operational problems generated by controlled probabilistic sampling. The labels are computed based on solutions to these problems (solved independently and offline), employing appropriate aggregation and subselection methods to address uncertainty. Results on our motivating application on load planning for rail transportation show that deep learning models produce accurate predictions in very short computing time (milliseconds or less). The predictive accuracy is close to the lower bounds calculated based on sample average approximation of the stochastic prediction programs.
Attend Before you Act: Leveraging human visual attention for continual learning
When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.
Active Search of Connections for Case Building and Combating Human Trafficking
David Bayani
Artur Dubrawski
How can we help an investigator to efficiently connect the dots and uncover the network of individuals involved in a criminal activity based on the evidence of their connections, such as visiting the same address, or transacting with the same bank account? We formulate this problem as Active Search of Connections, which finds target entities that share evidence of different types with a given lead, where their relevance to the case is queried interactively from the investigator. We present RedThread, an efficient solution for inferring related and relevant nodes while incorporating the user's feedback to guide the inference. Our experiments focus on case building for combating human trafficking, where the investigator follows leads to expose organized activities, i.e. different escort advertisements that are connected and possibly orchestrated. RedThread is a local algorithm and enables online case building when mining millions of ads posted on one of the largest classified advertising websites. The results of RedThread are interpretable, as they explain how the results are connected to the initial lead. We experimentally show that RedThread learns the importance of the different types and different pieces of evidence, while the former could be transferred between cases.
Negative Momentum for Improved Game Dynamics
Reyhane Askari Hemmat
Mohammad Pezeshki
Gabriel Huang
Rémi LE PRIOL
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiable games often proceed by simultaneous or alternating gradient updates. In machine learning, games are gaining new importance through formulations like generative adversarial networks (GANs) and actor-critic systems. However, compared to single-objective optimization, game dynamics are more complex and less understood. In this paper, we analyze gradient-based methods with momentum on simple games. We prove that alternating updates are more stable than simultaneous updates. Next, we show both theoretically and empirically that alternating gradient updates with a negative momentum term achieve convergence not only on a difficult toy adversarial problem, but also on the notoriously difficult-to-train saturating GANs.
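The effect of negative momentum on alternating updates can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the paper's experimental setup: the game is assumed to be the scalar bilinear game f(x, y) = x * y (player 1 minimizes over x, player 2 maximizes over y), the momentum is the heavy-ball form, and the function name, step size 0.5, momentum -0.5, and iteration count are all hypothetical choices.

```python
import math


def alternating_momentum(x, y, step=0.5, beta=-0.5, iters=1000):
    """Alternating gradient steps with heavy-ball momentum on f(x, y) = x * y.

    The game, step size, and momentum value are assumptions for this sketch.
    """
    x_prev, y_prev = x, y  # no momentum contribution on the first step
    for _ in range(iters):
        # Player 1 descends on f; the gradient of f with respect to x is y.
        x_new = x - step * y + beta * (x - x_prev)
        # Player 2 ascends on f and already sees the updated x (alternating).
        y_new = y + step * x_new + beta * (y - y_prev)
        x_prev, y_prev, x, y = x, y, x_new, y_new
    return x, y


if __name__ == "__main__":
    for beta in (0.0, -0.5):
        x, y = alternating_momentum(1.0, 1.0, beta=beta)
        print(f"beta={beta:+.1f}  distance to equilibrium: {math.hypot(x, y):.3e}")
```

With these assumed settings, zero momentum leaves the iterates cycling around the equilibrium (0, 0) without approaching it, because the alternating update preserves the quantity x^2 + y^2 - step * x * y on this game. The negative momentum term makes the iteration contract toward the equilibrium.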