We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Publications
Machine Learning and Interpretation in Neuroimaging
We propose a non-binary stochastic decoding algorithm for low-density parity-check (LDPC) codes over GF(q) with degree two variable nodes, c… (see more)alled Adaptive Multiset Stochastic Algorithm (AMSA). The algorithm uses multisets, an extension of sets that allows multiple occurrences of an element, to represent probability mass functions that simplifies the structure of the variable nodes. The run-time complexity of one decoding cycle using AMSA is O(q) for conventional memory architectures, and O(1) if a custom memory architecture is used. Two fully-parallel AMSA decoders are implemented on FPGA for two (192,96) (2,4)-regular codes over GF(64) and GF(256), both achieving a maximum clock frequency of 108 MHz. The GF(64) decoder has a coded throughput of 65 Mb/s at Eb/N0=2.4 dB when using conventional memory, while a decoder using the custom memory version can achieve 698 Mb/s at the same Eb/N0. At a frame error rate (FER) of 2×10-6 the GF(64) version of the algorithm is only 0.04 dB away from the floating-point SPA performance, and for the GF(256) code the difference is 0.2 dB. To the best of our knowledge, this is the first fully parallel non-binary LDPC decoder over GF(256) reported in the literature.
2013-08-15
IEEE Transactions on Signal Processing (published)
We propose a non-binary stochastic decoding algorithm for low-density parity-check (LDPC) codes over GF(q) with degree two variable nodes, c… (see more)alled Adaptive Multiset Stochastic Algorithm (AMSA). The algorithm uses multisets, an extension of sets that allows multiple occurrences of an element, to represent probability mass functions that simplifies the structure of the variable nodes. The run-time complexity of one decoding cycle using AMSA is O(q) for conventional memory architectures, and O(1) if a custom memory architecture is used. Two fully-parallel AMSA decoders are implemented on FPGA for two (192,96) (2,4)-regular codes over GF(64) and GF(256), both achieving a maximum clock frequency of 108 MHz. The GF(64) decoder has a coded throughput of 65 Mb/s at Eb/N0=2.4 dB when using conventional memory, while a decoder using the custom memory version can achieve 698 Mb/s at the same Eb/N0. At a frame error rate (FER) of 2×10-6 the GF(64) version of the algorithm is only 0.04 dB away from the floating-point SPA performance, and for the GF(256) code the difference is 0.2 dB. To the best of our knowledge, this is the first fully parallel non-binary LDPC decoder over GF(256) reported in the literature.
2013-08-01
IEEE Transactions on Signal Processing (published)
Polar codes are the first error-correcting codes to provably achieve channel capacity, asymptotically in code length, with an explicit const… (see more)ruction. However, under successive-cancellation decoding, polar codes require very long code lengths to compete with existing modern codes. Nonetheless, the successive cancellation algorithm enables very-low-complexity implementations in hardware, due to the regular structure exhibited by polar codes. In this paper, we present an improved architecture for successive-cancellation decoding of polar codes, making use of a novel semi-parallel, encoder-based partial-sum computation module. We also provide quantization results for realistic code length N=215, and explore various optimization techniques such as a chained processing element and a variable quantization scheme. This design is shown to scale to code lengths of up to N=221, enabled by its low logic use, low register use and simple datapaths, limited almost exclusively by the amount of available SRAM. It also supports an overlapped loading of frames, allowing full-throughput decoding with a single set of input buffers.
This paper describes and analyzes a hierarchical algorithm called Multiscale Gossip for solving the distributed average consensus problem in… (see more) wireless sensor networks. The algorithm proceeds by recursively partitioning a given network. Initially, nodes at the finest scale gossip to compute local averages. Then, using multi-hop communication and geographic routing to communicate between nodes that are not directly connected, these local averages are progressively fused up the hierarchy until the global average is computed. We show that the proposed hierarchical scheme with k=Θ(loglogn) levels of hierarchy is competitive with state-of-the-art randomized gossip algorithms in terms of message complexity, achieving ε-accuracy with high probability after O(n loglogn log[1/(ε)] ) single-hop messages. Key to our analysis is the way in which the network is recursively partitioned. We find that the above scaling law is achieved when subnetworks at scale j contain O(n(2/3)j) nodes; then the message complexity at any individual scale is O(n log[1/ε]). Another important consequence of the hierarchical construction is that the longest distance over which messages are exchanged is O(n1/3) hops (at the highest scale), and most messages (at lower scales) travel shorter distances. In networks that use link-level acknowledgements, this results in less congestion and resource usage by reducing message retransmissions. Simulations illustrate that the proposed scheme is more efficient than state-of-the-art randomized gossip algorithms based on averaging along paths.
2013-05-01
IEEE Transactions on Signal Processing (published)
We consider the problem of inferring the topology of an M-by-N network by sending probes between M sources and N receivers. Prior work has s… (see more)hown that this problem can be decomposed into two parts: first, infer smaller subnetwork components (i.e., 1-by-N's or 2-by-2's) and then merge these components to identify the M-by-N topology. In this paper, we focus on the second part. In particular, we assume that a 1by-N topology is given and that all 2-by-2 components can be queried and learned using end-to-end probes. The problem is which 2-by-2's to query and how to merge them with the 1-byN, so as to exactly identify the 2-by-N topology, and optimize a number of performance metrics including measurement traffic, time complexity, and memory usage. We provide a lower bound, ⌈N/2⌉, on the number of 2-by-2's required by any active learning algorithm and we also propose a greedy algorithm that is nearoptimal and efficient in practice. It follows a bottom-up approach: at every step, it selects two receivers, queries the corresponding 2-by-2, and merges it with the given 1-by-N. The algorithm requires exactly N - 1 steps, which is much less than all (N:2) possible 2-by-2's, and it correctly identifies the 2-by-N topology.
2013-03-20
Annual Conference on Information Sciences and Systems (published)
We consider the problem of inferring the topology of an M-by-N network by sending probes between M sources and N receivers. Prior work has s… (see more)hown that this problem can be decomposed into two parts: first, infer smaller subnetwork components (i.e., 1-by-N's or 2-by-2's) and then merge these components to identify the M-by-N topology. In this paper, we focus on the second part. In particular, we assume that a 1by-N topology is given and that all 2-by-2 components can be queried and learned using end-to-end probes. The problem is which 2-by-2's to query and how to merge them with the 1-byN, so as to exactly identify the 2-by-N topology, and optimize a number of performance metrics including measurement traffic, time complexity, and memory usage. We provide a lower bound, ⌈N/2⌉, on the number of 2-by-2's required by any active learning algorithm and we also propose a greedy algorithm that is nearoptimal and efficient in practice. It follows a bottom-up approach: at every step, it selects two receivers, queries the corresponding 2-by-2, and merges it with the given 1-by-N. The algorithm requires exactly N - 1 steps, which is much less than all (N:2) possible 2-by-2's, and it correctly identifies the 2-by-N topology.
2013-03-20
Annual Conference on Information Sciences and Systems (published)
Polar codes are a recently discovered family of capacity-achieving codes that are seen as a major breakthrough in coding theory. Motivated b… (see more)y the recent rapid progress in the theory of polar codes, we propose a semi-parallel architecture for the implementation of successive cancellation decoding. We take advantage of the recursive structure of polar codes to make efficient use of processing resources. The derived architecture has a very low processing complexity while the memory complexity remains similar to that of previous architectures. This drastic reduction in processing complexity allows very large polar code decoders to be implemented in hardware. An N=217 polar code successive cancellation decoder is implemented in an FPGA. We also report synthesis results for ASIC.
2013-01-01
IEEE Transactions on Signal Processing (published)
Polar codes are a recently discovered family of capacity-achieving codes that are seen as a major breakthrough in coding theory. Motivated b… (see more)y the recent rapid progress in the theory of polar codes, we propose a semi-parallel architecture for the implementation of successive cancellation decoding. We take advantage of the recursive structure of polar codes to make efficient use of processing resources. The derived architecture has a very low processing complexity while the memory complexity remains similar to that of previous architectures. This drastic reduction in processing complexity allows very large polar code decoders to be implemented in hardware. An N=217 polar code successive cancellation decoder is implemented in an FPGA. We also report synthesis results for ASIC.
2013-01-01
IEEE Transactions on Signal Processing (published)