QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Yu-Chao Hsu, Jiun-Cheng Jiang, Chun-Hua Lin, Kuo-Chung Peng, Nan-Yow Chen, Samuel Yen-Chi Chen, En-Jui Kuo, Hsi-Sheng Goan

TL;DR

The paper tackles the inefficiency and limited expressivity of conventional LSTMs in time-series tasks by introducing QKAN-LSTM, which embeds single-qubit, data-reuploading quantum activations (DARUAN) into LSTM gates to achieve spectral enrichment without entanglement. By replacing affine gate mappings with sums of quantum subfunctions, the model gains enhanced frequency adaptability while remaining simulatable on classical hardware; the framework is further extended to HQKAN-LSTM via the JHCG Net to enable hierarchical latent processing. Empirical results on Damped SHM, Bessel, and Urban Telecommunication datasets show substantial parameter reductions and improved predictive performance compared to LSTM and QLSTM baselines, with HQKAN-LSTM often delivering the best overall results. The approach provides a scalable, interpretable path toward quantum-inspired sequential modeling suitable for real-world, resource-constrained environments and large-scale architectures like Transformers and Diffusion Models.

Abstract

Long short-term memory (LSTM) models are a type of recurrent neural network (RNN) central to sequential modeling tasks in domains such as urban telecommunication forecasting, where temporal correlations and nonlinear dependencies dominate. However, conventional LSTMs suffer from high parameter redundancy and limited nonlinear expressivity. In this work, we propose the Quantum-inspired Kolmogorov-Arnold Long Short-Term Memory (QKAN-LSTM), which integrates Data Re-Uploading Activation (DARUAN) modules into the gating structure of LSTMs. Each DARUAN acts as a quantum variational activation function (QVAF), enhancing frequency adaptability and enabling an exponentially enriched spectral representation without multi-qubit entanglement. The resulting architecture preserves quantum-level expressivity while remaining fully executable on classical hardware. Empirical evaluations on three datasets (Damped Simple Harmonic Motion, Bessel Function, and Urban Telecommunication) demonstrate that QKAN-LSTM achieves superior predictive accuracy and generalization with a 79% reduction in trainable parameters compared to classical LSTMs. We extend the framework to the Jiang-Huang-Chen-Goan Network (JHCG Net), which generalizes KAN to encoder-decoder structures, and then use QKAN to realize the latent KAN, thereby creating a Hybrid QKAN (HQKAN) for hierarchical representation learning. The proposed HQKAN-LSTM thus provides a scalable and interpretable pathway toward quantum-inspired sequential modeling in real-world data environments.
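
To make the idea of a single-qubit data re-uploading activation more concrete, here is a minimal NumPy sketch that simulates such a circuit classically. It is an illustration under stated assumptions, not the paper's implementation: the gate sequence (alternating RY trainable rotations and RZ data-encoding rotations), the Pauli-Z readout, the sigmoid applied to the summed subfunctions, and all names (`reupload_activation`, `gate_from_subfunctions`, `thetas`, `weights`, `biases`) are hypothetical choices made for this example.

```python
import numpy as np

def ry(theta):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(phi):
    """Single-qubit rotation about the Z axis."""
    return np.array([[np.exp(-0.5j * phi), 0], [0, np.exp(0.5j * phi)]], dtype=complex)

def reupload_activation(x, thetas, weights, biases):
    """Pauli-Z expectation of a single-qubit data re-uploading circuit.

    Assumed layout (illustrative only): one trainable RY rotation, then for
    each layer an RZ rotation encoding w * x + b followed by another
    trainable RY. `thetas` has one more entry than `weights` and `biases`.
    """
    state = np.array([1.0, 0.0], dtype=complex)  # start in |0>
    state = ry(thetas[0]) @ state
    for theta, w, b in zip(thetas[1:], weights, biases):
        state = rz(w * x + b) @ state  # re-upload the same input x
        state = ry(theta) @ state      # trainable rotation
    # <Z> = P(0) - P(1), always in [-1, 1]
    return float(np.abs(state[0]) ** 2 - np.abs(state[1]) ** 2)

def gate_from_subfunctions(v, params):
    """Gate value as a sigmoid of a sum of per-coordinate subfunctions.

    `params[j]` is a (thetas, weights, biases) tuple for input coordinate j.
    Using a sigmoid here mirrors a standard LSTM gate and is an assumption.
    """
    total = sum(reupload_activation(vj, *p) for vj, p in zip(v, params))
    return 1.0 / (1.0 + np.exp(-total))
```

With a single re-uploading layer and `thetas = [np.pi / 2, np.pi / 2]`, `weights = [1.0]`, `biases = [0.0]`, the activation reduces to `-cos(x)`; stacking more layers with different trainable weights adds further frequency components to the output, which is the intuition behind the spectral enrichment described above.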

Paper Structure

This paper contains 21 sections, 13 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of the QKAN-LSTM architecture. (a) The architecture of the QKAN-LSTM model with QKAN integration in the input, forget, cell, and output gates. (b) The data is fed into the DARUAN layer, where the quantum features are re-uploaded and processed. (c) A detailed view of how the DARUANs are applied to the gates in QKAN to enhance the LSTM's ability to capture complex and non-linear sequence dependencies with QVAF.
  • Figure 2: Architecture of the Jiang–Huang–Chen–Goan Network (JHCG Net) [jiang2025qkan]. The JHCG Net comprises a fully connected encoder and decoder with a Kolmogorov–Arnold Network (KAN) serving as the latent feature processor, forming an autoencoder-like architecture. When the latent KAN module is implemented using Quantum Kolmogorov–Arnold Networks (QKANs), the framework is referred to as the Hybrid QKAN (HQKAN), integrating quantum-inspired nonlinear transformations within the latent representation space.
  • Figure 3: Results of QKAN-LSTM on damped SHM and Bessel function datasets.