Naga: Vedic Encoding for Deep State Space Models
Melanie Schaller, Nick Janssen, Bodo Rosenhahn
TL;DR
Naga introduces a Vedic-mathematics–inspired encoding for deep state-space models to capture cross-time interactions in long-horizon forecasting. By jointly encoding forward and time-reversed sequences and combining them with element-wise Hadamard interactions, Naga provides a bilinear feature space with better gradient flow and greater representational capacity than linear encoders. Theoretical results (Lemmas 1–2 and corollaries) formalize the expanded expressivity and improved gradient propagation, while extensive experiments across seven LTSF benchmarks show state-of-the-art MSE/MAE with ~2.1M parameters and favorable efficiency. Ablation studies confirm that the Vedic encoding, bidirectionality, and depth jointly drive the gains, and external ablations show robust performance relative to SOTA Mamba variants. The work highlights a promising direction for structured, interpretable learning in sequential domains by bridging symbolic computation and deep learning.
Abstract
This paper presents Naga, a deep State Space Model (SSM) encoding approach inspired by structural concepts from Vedic mathematics. The proposed method introduces a bidirectional representation for time series by jointly processing forward and time-reversed input sequences. These representations are then combined through an element-wise (Hadamard) interaction, resulting in a Vedic-inspired encoding that enhances the model's ability to capture temporal dependencies across distant time steps. We evaluate Naga on multiple long-term time series forecasting (LTSF) benchmarks, including ETTh1, ETTh2, ETTm1, ETTm2, Weather, Traffic, and ILI. The experimental results show that Naga outperforms 28 current state-of-the-art models and is more efficient than existing deep SSM-based approaches. The findings suggest that incorporating structured, Vedic-inspired decomposition can provide an interpretable and computationally efficient alternative for long-range sequence modeling.
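To make the forward/reversed Hadamard combination concrete, the following is a minimal sketch and not the authors' implementation. It assumes each branch is a simple per-step linear map, so that the output at step t is (W_f x_t) ⊙ (W_b x_{T-1-t}). The function names `linear_encode` and `vedic_style_encode`, the weight matrices, and the use of plain linear maps in place of the deep SSM layers are all assumptions made for illustration.

```python
def linear_encode(x, weights):
    """Apply the same linear map to every time step.

    x:       list of T time steps, each a list of n_features floats
    weights: list of d_model rows, each a list of n_features floats
    returns: list of T encoded steps, each a list of d_model floats
    """
    return [[sum(w * v for w, v in zip(row, step)) for row in weights]
            for step in x]


def vedic_style_encode(x, w_fwd, w_bwd):
    """Hypothetical sketch of the bidirectional Hadamard encoding.

    The forward sequence and its time-reversed copy are encoded
    separately, then multiplied element-wise. Output step t therefore
    pairs x[t] with x[T-1-t], which makes the result bilinear in x.
    """
    fwd = linear_encode(x, w_fwd)        # encodes x_0, x_1, ..., x_{T-1}
    bwd = linear_encode(x[::-1], w_bwd)  # encodes x_{T-1}, ..., x_1, x_0
    return [[a * b for a, b in zip(f, r)] for f, r in zip(fwd, bwd)]


if __name__ == "__main__":
    # Univariate toy series with identity weights (illustrative values only).
    series = [[1.0], [2.0], [3.0], [4.0]]
    identity = [[1.0]]
    print(vedic_style_encode(series, identity, identity))
    # prints [[4.0], [6.0], [6.0], [4.0]]
```

In this toy case each output entry is the product of a value and its mirror-image counterpart in time (1·4, 2·3, 3·2, 4·1), which illustrates how the element-wise product introduces interactions between distant time steps that a single linear encoder cannot express. Doubling the input quadruples the output, as expected for a bilinear map.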
