Publications

Stochastic Decoding of Linear Block Codes With High-Density Parity-Check Matrices

S. Tehrani

Christophe Jego

Bo Zhu

This correspondence extends the application of the recently proposed stochastic decoding approach to decode linear block codes with high-density parity-check matrices and discusses its hardware complexity. Results demonstrate decoding performance close to floating-point iterative soft-input soft-output (SISO) decoding while offering nodes with considerably lower complexity compared to fixed-point SISO decoding.

2008-11-01

IEEE Transactions on Signal Processing (published)

doi.org

Distributed Average Consensus With Dithered Quantization

T. C. Aysal

Mark Coates

Michael Rabbat

In this paper, we develop algorithms for distributed computation of averages of the node data over networks with bandwidth/power constraints or large volumes of data. Distributed averaging algorithms fail to achieve consensus when deterministic uniform quantization is adopted. We propose a distributed algorithm in which the nodes utilize probabilistically quantized information, i.e., dithered quantization, to communicate with each other. The algorithm we develop is a dynamical system that generates sequences achieving a consensus at one of the quantization values almost surely. In addition, we show that the expected value of the consensus is equal to the average of the original sensor data. We derive an upper bound on the mean-square-error performance of the probabilistically quantized distributed averaging (PQDA). Moreover, we show that the convergence of the PQDA is monotonic by studying the evolution of the minimum-length interval containing the node values. We reveal that the length of this interval is a monotonically nonincreasing function with limit zero. We also demonstrate that all the node values, in the worst case, converge to the final two quantization bins at the same rate as standard unquantized consensus. Finally, we report the results of simulations conducted to evaluate the behavior and the effectiveness of the proposed algorithm in various scenarios.

2008-10-01

IEEE Transactions on Signal Processing (published)

doi.org

Greedy Gossip With Eavesdropping

Deniz Ustebay

Boris Oreshkin

Mark Coates

Michael Rabbat

This paper presents greedy gossip with eavesdropping (GGE), a novel randomized gossip algorithm for distributed computation of the average consensus problem. In gossip algorithms, nodes in the network randomly communicate with their neighbors and exchange information iteratively. The algorithms are simple and decentralized, making them attractive for wireless network applications. In general, gossip algorithms are robust to unreliable wireless conditions and time varying network topologies. In this paper, we introduce GGE and demonstrate that greedy updates lead to rapid convergence. We do not require nodes to have any location information. Instead, greedy updates are made possible by exploiting the broadcast nature of wireless communications. During the operation of GGE, when a node decides to gossip, instead of choosing one of its neighbors at random, it makes a greedy selection, choosing the node which has the value most different from its own. In order to make this selection, nodes need to know their neighbors' values. Therefore, we assume that all transmissions are wireless broadcasts and nodes keep track of their neighbors' values by eavesdropping on their communications. We show that the convergence of GGE is guaranteed for connected network topologies. We also study the rates of convergence and illustrate, through theoretical bounds and numerical simulations, that GGE consistently outperforms randomized gossip and performs comparably to geographic gossip on moderate-sized random geometric graph topologies.

2008-05-07

ArXiv (preprint)

doi.org

arxiv.org

Recent Advances in Reinforcement Learning

Joelle Pineau

2008-01-01

Lecture Notes in Computer Science (published)

doi.org

Advances in Information Retrieval

Diane Kelly

Fernando Diaz

Nicholas J. Belkin

James Allan

2004-04-05

Lecture Notes in Computer Science (published)

doi.org

Advances in Information Retrieval

Diane Kelly

Fernando Diaz

Nicholas J. Belkin

James Allan

2004-01-01

ECIR (published)

doi.org

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

Stefano Massaroli

Michael Poli

Daniel Y. Fu

Hermann Kumbong

Rom N. Parnichkun

Aman Timalsina

David W. Romero

Quinn McIntyre

Beidi Chen

Atri Rudra

Ce Zhang

Christopher Ré

Stefano Ermon

Yoshua Bengio

Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads – naively requiring a full pass (or caching of activations) over the input sequence for each generated token – similarly to attention-based models. In this paper, we seek to enable O(1) compute and memory cost per token in any pre-trained long convolution architecture to reduce memory footprint and increase throughput during generation. Concretely, our methods consist in extracting low-dimensional linear state-space models from each convolution layer, building upon rational interpolation and model-order reduction techniques. We further introduce architectural improvements to convolution-based layers such as Hyena: by weight-tying the filters across channels into heads, we achieve higher pre-training quality and reduce the number of filters to be distilled. The resulting model achieves 10× higher throughput than Transformers and 1.5× higher than Hyena at 1.3B parameters, without any loss in quality after distillation.

2000-01-01

(published)

www.semanticscholar.org

Cognitive Models as Simulators: Using Cognitive Models to Tap into Implicit Human Feedback

Ardavan S Nobandegani

Thomas R. Shultz

Irina Rish

In this work, we substantiate the idea of cognitive models as simulators, which is to have AI systems interact with, and collect feedback from, cognitive models instead of humans, thereby making the training process safer, cheaper, and faster. We leverage this idea in the context of learning a fair behavior toward a counterpart exhibiting various emotional states, as implicit human feedback. As a case study, we adopt the Ultimatum game (UG), a canonical task in behavioral and brain sciences for studying fairness. We show that our reinforcement learning (RL) agents learn to exhibit differential, rationally-justified behaviors under various emotional states of their UG counterpart. We discuss the implications of our work for AI and cognitive science research, and its potential for interactive learning with implicit human feedback.

2000-01-01

(published)

www.semanticscholar.org

Deep PDE Solvers for Subgrid Modelling and Out-of-Distribution Generalization

Patrick Chatain

Adam M. Oberman

Climate and weather modelling (CWM) is an important area where ML models are used for subgrid modelling: making predictions of processes occurring at scales too small to be resolved by standard solution methods (Brasseur & Jacob, 2017). These models are expected to make accurate predictions, even on out-of-distribution (OOD) data, and are additionally expected to respect important physical constraints of the ground truth model (Kashinath et al., 2021). While many specialized ML PDE solvers have been developed, the particular requirements of CWM models have not been addressed so far. The goal of this work is to address them. We propose and develop a novel architecture, which matches or exceeds the performance of standard ML models, and which demonstrably succeeds in OOD generalization. The architecture is based on expert knowledge of the structure of PDE solution operators, which permits the model to also obey important physical constraints.

2000-01-01

(published)

www.semanticscholar.org

Learning Optimizers for Local SGD

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

Eugene Belilovsky

Communication-efficient variants of SGD, specifically local SGD, have received a great deal of interest in recent years. These approaches compute multiple gradient steps locally, that is on each worker, before averaging model parameters, helping relieve the critical communication bottleneck in distributed deep learning training. Although many variants of these approaches have been proposed, they can sometimes lag behind state-of-the-art optimizers for deep learning. In this work, we incorporate local optimizers that compute multiple updates into a learned optimization framework, allowing to meta-learn potentially more efficient local SGD algorithms. Our results demonstrate that local learned optimizers can substantially outperform local SGD and its sophisticated variants while maintaining their communication efficiency. We show that the learned optimizers can generalize to new datasets and architectures, demonstrating the potential of learned optimizers for improving communication-efficient distributed learning.

2000-01-01

(published)

www.semanticscholar.org
