Lag Operator SSMs: A Geometric Framework for Structured State Space Modeling
Sutashu Tomonaga, Kenji Doya, Noboru Murata
TL;DR
This work presents a direct discrete-time framework for Structured State Space Models built around a lag operator that geometrically tracks how a warped projection basis evolves between timesteps. By projecting the history onto time-warped orthonormal bases and updating via a backward lag, the authors derive the entire discrete recurrence from a single inner product, bypassing the traditional continuous-time ODE–discretization pipeline. They demonstrate that an exponential warp recovers the HiPPO-LegS system, providing a principled geometric foundation for HiPPO and enabling modular memory design through the warp function. Numerical experiments show exact HiPPO equivalence in matrix form and faithful replication of memory dynamics, validating the framework as a flexible, interpretable building block for long-range sequence modeling and potential multi-resolution memory schemes.
Abstract
Structured State Space Models (SSMs), which are at the heart of the recently popular Mamba architecture, are powerful tools for sequence modeling. However, their theoretical foundation relies on a complex, multi-stage process of continuous-time modeling and subsequent discretization, which can obscure intuition. We introduce a direct, first-principles framework for constructing discrete-time SSMs that is both flexible and modular. Our approach is based on a novel lag operator, which geometrically derives the discrete-time recurrence by measuring how the system's basis functions "slide" and change from one timestep to the next. The resulting state matrices are computed via a single inner product involving this operator, offering a modular design space for creating novel SSMs by flexibly combining different basis functions and time-warping schemes. To validate our approach, we demonstrate that a specific instance exactly recovers the recurrence of the influential HiPPO model. Numerical simulations confirm our derivation, and the framework provides new theoretical tools for designing flexible and robust sequence models.
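For readers unfamiliar with the recurrence that the framework is said to recover, the following minimal sketch implements the standard HiPPO-LegS update in its forward-Euler form, c_{k+1} = (I - A/k) c_k + (1/k) B f_k, as given in the original HiPPO paper. It is not the lag-operator construction, whose details the abstract does not specify; it only illustrates the target of the equivalence claim, under the assumption that this is the recurrence being matched. The function names `legs_matrices`, `legs_step`, and `run_legs` are hypothetical and chosen for illustration.

```python
import math


def legs_matrices(n):
    """Standard HiPPO-LegS matrices (A, B) of size n, as published for HiPPO.

    A[i][j] = sqrt((2i+1)(2j+1)) if i > j, i + 1 if i == j, 0 otherwise.
    B[i]    = sqrt(2i+1).
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i > j:
                A[i][j] = math.sqrt((2 * i + 1) * (2 * j + 1))
            elif i == j:
                A[i][j] = i + 1.0
    B = [math.sqrt(2 * i + 1) for i in range(n)]
    return A, B


def legs_step(c, f_k, k, A, B):
    """One forward-Euler step: c_{k+1} = (I - A/k) c_k + (1/k) B f_k, for k >= 1."""
    n = len(c)
    return [
        c[i] - sum(A[i][j] * c[j] for j in range(n)) / k + B[i] * f_k / k
        for i in range(n)
    ]


def run_legs(signal, n):
    """Run the recurrence over a whole signal, starting from a zero state."""
    A, B = legs_matrices(n)
    c = [0.0] * n
    for k, f_k in enumerate(signal, start=1):
        c = legs_step(c, f_k, k, A, B)
    return c


# Illustrative input (an assumption of this sketch): a constant signal.
coeffs = run_legs([1.0] * 8, n=4)
print(coeffs)
```

For a constant input, the state should settle (up to floating-point error) on a vector whose first entry is 1 and whose remaining entries are 0, since a constant history has only a degree-0 component in the Legendre basis. A lag-operator derivation that recovers HiPPO-LegS would be expected to reproduce the same A and B, which is the kind of matrix-level check the abstract describes.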
