TSKANMixer: Kolmogorov-Arnold Networks with MLP-Mixer Model for Time Series Forecasting

Young-Chae Hong, Bei Xiao, Yangho Chen

TL;DR

To address the limitations of Transformer-heavy approaches, the paper introduces TSKANMixer, a hybrid that embeds Kolmogorov-Arnold Networks (KANs) into the Time-Series Mixer (TSMixer) to better capture non-linear temporal and cross-variate patterns. It presents two variants, both built on a two-depth KAN: v01 replaces the temporal projection with a KAN, and v02 adds a KAN-based time-mixing layer before the projection. Empirical results across ten datasets with input length $L=512$ and forecast horizon $H=96$ show that TSKANMixer often outperforms TSMixer and ranks among the top methods, notably on ETTh2, though training takes substantially longer because of the KAN computations. The findings support KANs as a promising direction for time-series forecasting and motivate future work on efficiency, scalability, and interpretability.
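
To make the v01 idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It takes $L$ as the input length and $H$ as the forecast horizon, reads "two-depth" as two stacked KAN layers, and uses fixed Gaussian bumps in place of the B-spline parameterization of the original KAN formulation. The hidden width, number of basis functions, and channel count are arbitrary choices, and `ToyKANLayer` and `kan_temporal_projection` are hypothetical names. Weights are random and untrained, so only the shapes are meaningful.

```python
import numpy as np


class ToyKANLayer:
    """Simplified KAN layer: y_o = sum_i phi_{o,i}(x_i).

    Each edge function phi_{o,i} is a learnable combination of fixed Gaussian
    bumps (an assumption made for brevity; it stands in for B-splines).
    """

    def __init__(self, d_in, d_out, num_basis=8, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.centers = np.linspace(-2.0, 2.0, num_basis)  # shared grid
        self.width = self.centers[1] - self.centers[0]
        self.coef = rng.normal(scale=0.1, size=(d_out, d_in, num_basis))

    def __call__(self, x):
        # x: (..., d_in) -> basis activations: (..., d_in, num_basis)
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # Sum over inputs i and basis functions k for every output o.
        return np.einsum("...ik,oik->...o", basis, self.coef)


def kan_temporal_projection(x, layers):
    """Map (batch, L, channels) to (batch, H, channels) along the time axis."""
    z = np.swapaxes(x, 1, 2)      # (batch, channels, L)
    for layer in layers:
        z = layer(z)              # acts on the last (time) axis
    return np.swapaxes(z, 1, 2)   # (batch, H, channels)


rng = np.random.default_rng(0)
L, H, hidden, channels = 512, 96, 16, 7  # hidden and channels are assumed
layers = [ToyKANLayer(L, hidden, rng=rng), ToyKANLayer(hidden, H, rng=rng)]
x = rng.normal(size=(2, L, channels))
y = kan_temporal_projection(x, layers)
print(y.shape)  # (2, 96, 7)
```

Under the same assumptions, a v02-style model would keep TSMixer's linear temporal projection and apply a KAN of this kind as an extra time-mixing step before it.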

Abstract

Time series forecasting has long been a focus of research across diverse fields, including economics, energy, healthcare, and traffic management. Recent works have introduced innovative architectures for time series models, such as the Time-Series Mixer (TSMixer), which leverages multi-layer perceptrons (MLPs) to enhance prediction accuracy by effectively capturing both spatial and temporal dependencies within the data. In this paper, we investigate the capabilities of the Kolmogorov-Arnold Networks (KANs) for time-series forecasting by modifying TSMixer with a KAN layer (TSKANMixer). Experimental results demonstrate that TSKANMixer tends to improve prediction accuracy over the original TSMixer across multiple datasets, ranking among the top-performing models compared to other time series approaches. Our results show that the KANs are promising alternatives to improve the performance of time series forecasting by replacing or extending traditional MLPs.

Paper Structure

This paper contains 13 sections, 2 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: TSMixer for multivariate time series forecasting (Chen et al., 2023)
  • Figure 2: TSKANMixer Architectures
  • Figure 3: Training and validation over epochs on ETTh2
  • Figure 4: Visualization of predictions: true (green) and forecasted (red) values for the target (OT) feature