Behnoush Khavari

Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs

Jayesh Khullar

François Rivest

Recent work has shown that LRNN models such as S4D, Mamba, and DeltaNet lack state-tracking capability due to either time-invariant transition matrices or restricted eigenvalue ranges. To address this, input-dependent transition matrices, particularly those that are complex or non-triangular, have been proposed to enhance SSM performance on such tasks. While existing theorems demonstrate that both input-independent and non-negative SSMs are incapable of solving simple state-tracking tasks, such as parity, regardless of depth, they do not explore whether combining these two types in a multilayer SSM could help. We investigate this question for efficient SSMs with diagonal transition matrices and show that such combinations still fail to solve parity. This implies that a recurrence layer must both be input-dependent and include negative eigenvalues. Our experiments support this conclusion by analyzing an SSM model that combines S4D and Mamba layers.

2025-08-10

ArXiv (preprint)

arxiv.org
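
To illustrate the abstract's conclusion, here is a minimal sketch, not the paper's model or experiments. It assumes a scalar (1x1 diagonal) linear recurrence h_t = a(x_t) * h_{t-1} with h_0 = 1. The function names and the specific values a(0) = 1, a(1) = -1, 0.5 are illustrative assumptions. When the transition is both input-dependent and allowed to be negative, the sign of the state tracks parity. When the transition is input-dependent but non-negative, the state can never change sign.

```python
def parity_via_diagonal_recurrence(bits):
    """Track parity with h_t = a(x_t) * h_{t-1}, h_0 = 1 (illustrative sketch)."""
    h = 1.0
    for x in bits:
        # Assumed transition: input-dependent, with a negative eigenvalue for x = 1.
        a = -1.0 if x == 1 else 1.0
        h = a * h
    # h is +1 after an even number of ones and -1 after an odd number.
    return 0 if h > 0 else 1


def nonnegative_recurrence_state(bits, a_zero=1.0, a_one=0.5):
    """Same recurrence, but with non-negative (assumed) transition values."""
    h = 1.0
    for x in bits:
        a = a_one if x == 1 else a_zero
        h = a * h
    # With a_zero, a_one >= 0 the state keeps the sign of h_0 for every input.
    return h


if __name__ == "__main__":
    print(parity_via_diagonal_recurrence([1, 0, 1, 1]))
    print(nonnegative_recurrence_state([1, 0, 1, 1]))
```

The second function only shows that the sign-flip mechanism is unavailable to a single scalar recurrence with non-negative transitions. The paper's impossibility result for multilayer combinations is a separate, stronger statement that this sketch does not prove.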

Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs

Jayesh Khullar

François Rivest

2025-06-10

ICML.cc/2025/Workshop/MOSS (published)

doi.org

openreview.net

Lower and Upper Bounds on the Pseudo-Dimension of Tensor Network Models

Behnoush Khavari

Guillaume Rabusseau

Tensor network (TN) methods have been a key ingredient of advances in condensed matter physics and have recently sparked interest in the machine learning community for their ability to compactly represent very high-dimensional objects. TN methods can for example be used to efficiently learn linear models in exponentially large feature spaces [56]. In this work, we derive upper and lower bounds on the VC-dimension and pseudo-dimension of a large class of TN models for classification, regression and completion. Our upper bounds hold for linear models parameterized by arbitrary TN structures, and we derive lower bounds for common tensor decomposition models (CP, Tensor Train, Tensor Ring and Tucker) showing the tightness of our general upper bound. These results are used to derive a generalization bound which can be applied to classification with low-rank matrices as well as linear classifiers based on any of the commonly used tensor decomposition models. As a corollary of our results, we obtain a bound on the VC-dimension of the matrix product state classifier introduced in [56] as a function of the so-called bond dimension (i.e. tensor train rank), which answers an open problem listed by Cirac, Garre-Rubio and Pérez-García in [13].

openreview.net
