Fast and Flexible Successive-Cancellation List Decoders for Polar Codes
Seyyed Ali Hashemi
Carlo Condo
Polar codes have gained a significant amount of attention during the past few years and have been selected as a coding scheme for the next-generation mobile broadband standard. Among decoding schemes, successive-cancellation list (SCL) decoding provides a reasonable tradeoff between error-correction performance and hardware implementation complexity when used to decode polar codes, at the cost of limited throughput. The simplified SCL (SSCL) and its extension SSCL-SPC increase the speed of decoding by removing redundant calculations when encountering particular information and frozen bit patterns (rate one and single parity check codes), while keeping the error-correction performance unaltered. In this paper, we improve SSCL and SSCL-SPC by proving that the list size imposes a specific number of path splits required to decode rate one and single parity check codes. Thus, the number of splits can be limited while guaranteeing exactly the same error-correction performance as if the paths were forked at each bit estimation. We call the new decoding algorithms Fast-SSCL and Fast-SSCL-SPC. Moreover, we show that the number of path forks in a practical application can be tuned to achieve desirable speed, while keeping the error-correction performance almost unchanged. Hardware architectures implementing both algorithms are then described and implemented: It is shown that our design can achieve
Fraternal Dropout
Konrad Żołna
Devansh Arpit
Dendi Suhubdy
Graph Attention Networks
Petar Veličković
Guillem Cucurull
Arantxa Casanova
Pietro Lio
Automatic Differentiation in Myia
Olivier Breuleux
Bart van Merriënboer
Automatic differentiation is an essential feature of machine learning frameworks. However, its implementation in existing frameworks often has limitations. In dataflow programming frameworks such as Theano or TensorFlow the representation used makes supporting higher-order gradients difficult. On the other hand, operator overloading frameworks such as PyTorch are flexible, but do not lend themselves well to optimization. With Myia, we attempt to have the best of both worlds: Building on the work by Pearlmutter and Siskind we implement a first-order gradient operator for a subset of the Python programming language.
A Multisensor Multi-Bernoulli Filter
Augustin-Alexandru Saucan
In this paper, we derive a multisensor multi-Bernoulli (MS-MeMBer) filter for multitarget tracking. Measurements from multiple sensors are employed by the proposed filter to update a set of tracks modeled as a multi-Bernoulli random finite set. An exact implementation of the MS-MeMBer update procedure is computationally intractable. We propose an efficient approximate implementation by using a greedy measurement partitioning mechanism. The proposed filter allows for Gaussian mixture or particle filter implementations. Numerical simulations conducted for both linear-Gaussian and nonlinear models highlight the improved accuracy of the MS-MeMBer filter and its reduced computational load with respect to the multisensor cardinalized probability hypothesis density filter and the iterated-corrector cardinality-balanced multi-Bernoulli filter, especially for low probabilities of detection.
Bayesian Hypernetworks
Chin-Wei Huang
Riashat Islam
Ryan Turner
Alexandre Lacoste
We propose Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork, h, is a neural network which learns to transform a simple noise distribution, p(e) = N(0,I), to a distribution q(t) := q(h(e)) over the parameters t of another neural network (the "primary network"). We train q with variational inference, using an invertible h to enable efficient estimation of the variational lower bound on the posterior p(t | D) via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of q(t). In practice, Bayesian hypernets provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.
Learning Independent Features with Adversarial Nets for Non-linear ICA
Philemon Brakel
Reliable measures of statistical dependence could potentially be useful tools for learning independent features and performing tasks like source separation using Independent Component Analysis (ICA). Unfortunately, many such measures, like the mutual information, are hard to estimate and optimize directly. We propose to learn independent features with adversarial objectives (Goodfellow et al. 2014, Arjovsky et al. 2017) which optimize such measures implicitly. These objectives compare samples from the joint distribution and the product of the marginals without the need to compute any probability densities. We also propose two methods for obtaining samples from the product of the marginals using either a simple resampling trick or a separate parametric distribution. Our experiments show that this strategy can easily be applied to different types of model architectures and solve both linear and non-linear ICA problems.
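One common way to draw approximate samples from a product of marginals is to shuffle each feature column of a batch independently, which keeps every marginal intact while breaking the dependence between features. The sketch below illustrates that idea only; it is an assumption about what a "simple resampling trick" can look like, not the procedure from the paper, and the function name `resample_product_of_marginals` is hypothetical.

```python
import random


def resample_product_of_marginals(batch, rng=None):
    """Shuffle each feature column independently across the batch.

    `batch` is a list of equal-length tuples (one tuple per sample).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    rng = rng or random.Random()
    n_features = len(batch[0])
    columns = []
    for j in range(n_features):
        column = [row[j] for row in batch]
        rng.shuffle(column)  # in-place permutation of this feature only
        columns.append(column)
    # Reassemble rows from the independently permuted columns.
    return [tuple(column[i] for column in columns) for i in range(len(batch))]


# Toy joint samples whose two features are perfectly dependent.
joint = [(0, 0), (1, 1), (2, 2), (3, 3)]
shuffled = resample_product_of_marginals(joint, random.Random(0))
```

Each column of `shuffled` holds the same values as the matching column of `joint`, so the marginals are unchanged; only the pairing between features is randomized.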
Learnable Explicit Density for Continuous Latent Space and Variational Inference
Chin-Wei Huang
Ahmed Touati
Laurent Dinh
Michal Drozdzal
Mohammad Havaei
In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF) and show that with further improvement, inverse AF could be used as a universal approximation to any complicated posterior. Our analysis results in a unified approach to parameterizing a VAE, without the need to restrict ourselves to factorial Gaussians in the latent real space.
Neural Network Based Nonlinear Weighted Finite Automata
Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent successes of nonlinear models in machine learning, it is natural to wonder whether extending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel neural network based nonlinear WFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFA and relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.
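For readers unfamiliar with the Hankel matrix mentioned above: for a function f over strings, its entry at row u (a prefix) and column v (a suffix) is f(uv). The sketch below builds a finite block of that matrix for a toy function. The helper `hankel_block` and the toy function `count_a` are hypothetical names chosen for illustration, and the nonlinear auto-encoder decomposition described in the abstract is not shown.

```python
def hankel_block(f, prefixes, suffixes):
    """Return the finite Hankel block H[u][v] = f(u + v).

    Rows are indexed by `prefixes`, columns by `suffixes`.
    Hypothetical helper for illustration only.
    """
    return [[f(u + v) for v in suffixes] for u in prefixes]


def count_a(s):
    """Toy string function: the number of occurrences of 'a' in s."""
    return s.count("a")


prefixes = ["", "a", "b"]
suffixes = ["", "a", "b"]
H = hankel_block(count_a, prefixes, suffixes)
# H == [[0, 1, 0], [1, 2, 1], [0, 1, 0]]
```

Spectral learning methods for WFA factorize such a block linearly; the abstract's approach replaces that linear factorization with a nonlinear one.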
Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation
Andrew Doyle
Douglas Arnold