Publications

Task Loss Estimation for Sequence Prediction
Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks
Whereas deep neural networks were first mostly used for classification tasks, they are rapidly expanding in the realm of structured output p… (voir plus)roblems, where the observed target is composed of multiple random variables that have a rich joint distribution, given the input. In this paper we focus on the case where the input also has a rich structure and the input and output structures are somehow related. We describe systems that learn to attend to different places in the input, for each element of the output, for a variety of tasks: machine translation, image caption generation, video clip description, and speech recognition. All these systems are based on a shared set of building blocks: gated recurrent neural networks and convolutional neural networks, along with trained attention mechanisms. We report on experimental results with these systems, showing impressively good performance and the advantage of the attention mechanism.
Poisson Group Testing: A Probabilistic Model for Boolean Compressed Sensing
Olgica Milenkovic
We introduce a novel probabilistic group testing framework, termed Poisson group testing, in which the number of defectives follows a right-… (voir plus)truncated Poisson distribution. The Poisson model has a number of new applications, including dynamic testing with diminishing relative rates of defectives. We consider both nonadaptive and semi-adaptive identification methods. For nonadaptive methods, we derive a lower bound on the number of tests required to identify the defectives with a probability of error that asymptotically converges to zero; in addition, we propose test matrix constructions for which the number of tests closely matches the lower bound. For semiadaptive methods, we describe a lower bound on the expected number of tests required to identify the defectives with zero error probability. In addition, we propose a stage-wise reconstruction algorithm for which the expected number of tests is only a constant factor away from the lower bound. The methods rely only on an estimate of the average number of defectives, rather than on the individual probabilities of subjects being defective.
Clinical Image-Based Procedures. Translational Research in Medical Imaging
Ian J. Gerard
Marta Kersten-Oertel
Simon Drouin
Jeffery Alan Hall
Kevin Petrecca
Dante De Nigris
D. Collins
A Scalable Successive-Cancellation Decoder for Polar Codes
Alexandre J. Raymond
Warren J. Gross
Polar codes are the first error-correcting codes to provably achieve channel capacity, asymptotically in code length, with an explicit const… (voir plus)ruction. However, under successive-cancellation decoding, polar codes require very long code lengths to compete with existing modern codes. Nonetheless, the successive cancellation algorithm enables very-low-complexity implementations in hardware, due to the regular structure exhibited by polar codes. In this paper, we present an improved architecture for successive-cancellation decoding of polar codes, making use of a novel semi-parallel, encoder-based partial-sum computation module. We also provide quantization results for realistic code length N=215, and explore various optimization techniques such as a chained processing element and a variable quantization scheme. This design is shown to scale to code lengths of up to N=221, enabled by its low logic use, low register use and simple datapaths, limited almost exclusively by the amount of available SRAM. It also supports an overlapped loading of frames, allowing full-throughput decoding with a single set of input buffers.
Generative Adversarial Nets
Generative Adversarial Networks (GANs) are very popular frameworks for generating high-quality data, and are immensely used in both the acad… (voir plus)emia and industry in many domains. Arguably, their most substantial impact has been in the area of computer vision, where they achieve state-of-the-art image generation. This chapter gives an introduction to GANs, by discussing their principle mechanism and presenting some of their inherent problems during training and evaluation. We focus on these three issues: (1) mode collapse, (2) vanishing gradients, and (3) generation of low-quality images. We then list some architecture-variant and loss-variant GANs that remedy the above challenges. Lastly, we present two utilization examples of GANs for real-world applications: Data augmentation and face images generation.
High-Throughput Energy-Efficient LDPC Decoders Using Differential Binary Message Passing
Kevin Cushon
Saied Hemati
Camille Leroux
Shie Mannor
Warren J. Gross
In this paper, we present energy-efficient architectures for decoders of low-density parity check (LDPC) codes using the differential decodi… (voir plus)ng with binary message passing (DD-BMP) algorithm and its modified variant (MDD-BMP). We also propose an improved differential binary (IDB) decoding algorithm. These algorithms offer significant intrinsic advantages in the energy domain: simple computations, low interconnect complexity, and very high throughput, while achieving error correction performance up to within 0.25 dB of the offset min-sum algorithm. We report on fully parallel decoder implementations of (273, 191), (1023, 781), and (4095, 3367) finite geometry-based LDPC codes in 65 nm CMOS. Using the MDD-BMP algorithm, these decoders achieve respective areas of 0.28 mm2, 1.38 mm2, and 15.37 mm2, average throughputs of 37 Gbps, 75 Gbps, and 141 Gbps, and energy efficiencies of 4.9 pJ/bit, 13.2 pJ/bit, and 37.9 pJ/bit with a 1.0 V supply voltage in post-layout simulations. At a reduced supply voltage of 0.8 V, these decoders achieve respective throughputs of 26 Gbps, 54 Gbps, and 94 Gbps, and energy efficiencies of 3.1 pJ/bit, 8.2 pJ/bit, and 23.5 pJ/bit. We also report on a fully parallel implementation of IDB for the (2048, 1723) LDPC code specified in the IEEE 802.3an (10GBASE-T) standard. This decoder achieves an area of 1.44 mm2, average throughput of 172 Gbps, and an energy efficiency of 2.8 pJ/bit with a 1.0 V supply voltage; at 0.8 V, it achieves throughput of 116 Gbps and energy efficiency of 1.7 pJ/bit.
Bayesian and grAphical Models for Biomedical Imaging
M. Jorge Cardoso
Ivor J. A. Simpson
Arbel, Tal
Annemie Ribbens
Machine Learning and Interpretation in Neuroimaging
Georg Langs
Guillermo Cecchi
Kai-min Kevin Chang
Brian G Murphy
Experimental Algorithms
Samuel Rosat
Issmail ElHallaoui
François Soumis
Andrea Lodi
Adaptive Multiset Stochastic Decoding of Non-Binary LDPC Codes
Alexandru Ciobanu
Saied Hemati
Warren J. Gross
We propose a non-binary stochastic decoding algorithm for low-density parity-check (LDPC) codes over GF(q) with degree two variable nodes, c… (voir plus)alled Adaptive Multiset Stochastic Algorithm (AMSA). The algorithm uses multisets, an extension of sets that allows multiple occurrences of an element, to represent probability mass functions that simplifies the structure of the variable nodes. The run-time complexity of one decoding cycle using AMSA is O(q) for conventional memory architectures, and O(1) if a custom memory architecture is used. Two fully-parallel AMSA decoders are implemented on FPGA for two (192,96) (2,4)-regular codes over GF(64) and GF(256), both achieving a maximum clock frequency of 108 MHz. The GF(64) decoder has a coded throughput of 65 Mb/s at Eb/N0=2.4 dB when using conventional memory, while a decoder using the custom memory version can achieve 698 Mb/s at the same Eb/N0. At a frame error rate (FER) of 2×10-6 the GF(64) version of the algorithm is only 0.04 dB away from the floating-point SPA performance, and for the GF(256) code the difference is 0.2 dB. To the best of our knowledge, this is the first fully parallel non-binary LDPC decoder over GF(256) reported in the literature.
Multiscale Gossip for Efficient Decentralized Averaging in Wireless Packet Networks
Konstantinos I. Tsianos
Michael G. Rabbat
This paper describes and analyzes a hierarchical algorithm called Multiscale Gossip for solving the distributed average consensus problem in… (voir plus) wireless sensor networks. The algorithm proceeds by recursively partitioning a given network. Initially, nodes at the finest scale gossip to compute local averages. Then, using multi-hop communication and geographic routing to communicate between nodes that are not directly connected, these local averages are progressively fused up the hierarchy until the global average is computed. We show that the proposed hierarchical scheme with k=Θ(loglogn) levels of hierarchy is competitive with state-of-the-art randomized gossip algorithms in terms of message complexity, achieving ε-accuracy with high probability after O(n loglogn log[1/(ε)] ) single-hop messages. Key to our analysis is the way in which the network is recursively partitioned. We find that the above scaling law is achieved when subnetworks at scale j contain O(n(2/3)j) nodes; then the message complexity at any individual scale is O(n log[1/ε]). Another important consequence of the hierarchical construction is that the longest distance over which messages are exchanged is O(n1/3) hops (at the highest scale), and most messages (at lower scales) travel shorter distances. In networks that use link-level acknowledgements, this results in less congestion and resource usage by reducing message retransmissions. Simulations illustrate that the proposed scheme is more efficient than state-of-the-art randomized gossip algorithms based on averaging along paths.