QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory
Yu-Chao Hsu, Jiun-Cheng Jiang, Chun-Hua Lin, Kuo-Chung Peng, Nan-Yow Chen, Samuel Yen-Chi Chen, En-Jui Kuo, Hsi-Sheng Goan
TL;DR
The paper tackles the parameter redundancy and limited nonlinear expressivity of conventional LSTMs in time-series tasks by introducing QKAN-LSTM, which embeds single-qubit data re-uploading activation (DARUAN) modules into the LSTM gates to achieve spectral enrichment without entanglement. By replacing affine gate mappings with sums of quantum subfunctions, the model gains frequency adaptability while remaining simulatable on classical hardware; the framework is further extended to HQKAN-LSTM via the JHCG Net to enable hierarchical latent processing. Empirical results on the Damped Simple Harmonic Motion (SHM), Bessel Function, and Urban Telecommunication datasets show improved predictive performance over LSTM and QLSTM baselines with substantially fewer trainable parameters (79% fewer than classical LSTMs), with HQKAN-LSTM often delivering the best overall results. The approach provides a scalable, interpretable path toward quantum-inspired sequential modeling suitable for real-world, resource-constrained environments and large-scale architectures like Transformers and Diffusion Models.
Abstract
Long short-term memory (LSTM) models are a particular type of recurrent neural network (RNN) central to sequential modeling tasks in domains such as urban telecommunication forecasting, where temporal correlations and nonlinear dependencies dominate. However, conventional LSTMs suffer from high parameter redundancy and limited nonlinear expressivity. In this work, we propose the Quantum-inspired Kolmogorov-Arnold Long Short-Term Memory (QKAN-LSTM), which integrates Data Re-Uploading Activation (DARUAN) modules into the gating structure of LSTMs. Each DARUAN acts as a quantum variational activation function (QVAF), enhancing frequency adaptability and enabling an exponentially enriched spectral representation without multi-qubit entanglement. The resulting architecture preserves quantum-level expressivity while remaining fully executable on classical hardware. Empirical evaluations on three datasets (Damped Simple Harmonic Motion, Bessel Function, and Urban Telecommunication) demonstrate that QKAN-LSTM achieves superior predictive accuracy and generalization with a 79% reduction in trainable parameters compared to classical LSTMs. We extend the framework to the Jiang-Huang-Chen-Goan Network (JHCG Net), which generalizes Kolmogorov-Arnold Networks (KANs) to encoder-decoder structures, and then use QKAN to realize the latent KAN, thereby creating a Hybrid QKAN (HQKAN) for hierarchical representation learning. The proposed HQKAN-LSTM thus provides a scalable and interpretable pathway toward quantum-inspired sequential modeling in real-world data environments.
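
The core idea in the abstract, a single-qubit data re-uploading circuit used as a learnable activation and summed KAN-style inside an LSTM gate, can be illustrated with a small classical simulation. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the circuit layout (trainable Ry rotations interleaved with Rz data encodings, read out as the Pauli-Z expectation), the sigmoid gate nonlinearity, and the names `reupload_activation` and `kan_gate` are all hypothetical choices made here for clarity.

```python
import numpy as np

def ry(angle):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(angle):
    """Single-qubit rotation about the Z axis."""
    return np.array([[np.exp(-1j * angle / 2), 0],
                     [0, np.exp(1j * angle / 2)]], dtype=complex)

def reupload_activation(x, thetas, scales):
    """Simulate a single-qubit data re-uploading activation (assumed layout).

    thetas: L + 1 trainable Ry angles; scales: L trainable input scalings.
    Each layer applies Ry(theta_l), then re-encodes the input as Rz(scale_l * x).
    Returns the Pauli-Z expectation value, which lies in [-1, 1].
    """
    state = np.array([1.0, 0.0], dtype=complex)  # start in |0>
    for theta, scale in zip(thetas[:-1], scales):
        state = ry(theta) @ state
        state = rz(scale * x) @ state            # data is uploaded again in every layer
    state = ry(thetas[-1]) @ state               # final trainable rotation
    return float(np.abs(state[0]) ** 2 - np.abs(state[1]) ** 2)

def kan_gate(v, thetas, scales):
    """Gate value as a sigmoid over a sum of per-input subfunctions.

    v: 1-D gate input (e.g. previous hidden state concatenated with the
    current input). thetas[i] and scales[i] parameterize the subfunction
    applied to v[i]; this sum replaces the usual affine map W v + b.
    The sigmoid is an assumption carried over from standard LSTM gates.
    """
    pre = sum(reupload_activation(v_i, thetas[i], scales[i])
              for i, v_i in enumerate(v))
    return 1.0 / (1.0 + np.exp(-pre))
```

With one encoding layer and `thetas = [np.pi / 2, -np.pi / 2]`, this circuit reduces analytically to `cos(scale * x)`, so the trainable scale directly sets the frequency of the activation. Stacking more layers adds further frequency components, which is the kind of spectral enrichment the abstract refers to.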
