Mehran Shakerinava

PhD - McGill

Principal supervisor

Siamak Ravanbakhsh

Research topics

Reinforcement Learning

Deep Learning

Parallel Computing

Optimization

Symmetry

Machine Learning Theory

Website

Google Scholar

GitHub

Publications

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava

Siamak Ravanbakhsh

Adam M. Oberman

Recent work has formalized the reward hypothesis through the lens of expected utility theory, by interpreting reward as utility. Hausner's foundational work showed that dropping the continuity axiom leads to a generalization of expected utility theory where utilities are lexicographically ordered vectors of arbitrary dimension. In this paper, we extend this result by identifying a simple and practical condition under which preferences cannot be represented by scalar rewards, necessitating a 2-dimensional reward function. We provide a full characterization of such reward functions, as well as the general d-dimensional case, in Markov Decision Processes (MDPs) under a memorylessness assumption on preferences. Furthermore, we show that optimal policies in this setting retain many desirable properties of their scalar-reward counterparts, while in the Constrained MDP (CMDP) setting, another common multiobjective setting, they do not.

2025-09-17

NeurIPS.cc/2025/Conference (spotlight)

doi.org

openreview.net

Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs

Jayesh Khullar

François Rivest

Recent work has shown that LRNN models such as S4D, Mamba, and DeltaNet lack state-tracking capability due to either time-invariant transition matrices or restricted eigenvalue ranges. To address this, input-dependent transition matrices, particularly those that are complex or non-triangular, have been proposed to enhance SSM performance on such tasks. While existing theorems demonstrate that both input-independent and non-negative SSMs are incapable of solving simple state-tracking tasks, such as parity, regardless of depth, they do not explore whether combining these two types in a multilayer SSM could help. We investigate this question for efficient SSMs with diagonal transition matrices and show that such combinations still fail to solve parity. This implies that a recurrence layer must both be input-dependent and include negative eigenvalues. Our experiments support this conclusion by analyzing an SSM model that combines S4D and Mamba layers.

2025-08-10

arXiv (preprint)

arxiv.org