
Nonlinear Factor Decomposition via Kolmogorov-Arnold Networks: A Spectral Approach to Asset Return Analysis

David Breazu

Abstract

KAN-PCA is an autoencoder that uses a Kolmogorov-Arnold Network (KAN) as the encoder and a linear map as the decoder. It generalizes classical PCA by replacing linear projections with learned B-spline functions on each edge. The motivation is to capture more variance than classical PCA, which performs poorly during market crises, when the linearity assumption breaks down and correlations between assets change dramatically. We prove that if the spline activations are constrained to be linear, KAN-PCA yields exactly the same results as classical PCA, establishing PCA as a special case. Experiments on 20 S&P 500 stocks (2015-2024) show that KAN-PCA achieves an in-sample reconstruction $R^2$ of 66.57%, compared to 62.99% for classical PCA with the same 3 factors, while matching PCA out-of-sample after correcting for data leakage in the training procedure.

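The abstract's linear-limit claim (a KAN layer whose edge splines are all linear collapses to a matrix multiplication, and the autoencoder to PCA) can be checked numerically. The sketch below is illustrative, not the paper's code: it assumes degree-1 (hat) B-splines on a uniform knot grid, which reproduce linear functions exactly on the grid interior, and the helper names `hat_basis`, `kan_layer`, and `linear_coeffs` are hypothetical.

```python
import numpy as np

def hat_basis(x, knots):
    """Degree-1 B-spline (hat) basis evaluated at scalar x.

    Hat functions form a partition of unity and reproduce linear
    functions exactly for x inside [knots[0], knots[-1]].
    """
    x = float(np.clip(x, knots[0], knots[-1]))
    B = np.zeros(len(knots))
    for k in range(len(knots)):
        left = knots[k - 1] if k > 0 else knots[0]
        right = knots[k + 1] if k < len(knots) - 1 else knots[-1]
        if x == knots[k]:
            B[k] = 1.0
        elif left < x < knots[k]:
            B[k] = (x - left) / (knots[k] - left)
        elif knots[k] < x < right:
            B[k] = (right - x) / (right - knots[k])
    return B

def kan_layer(x, coeffs, knots):
    """One KAN layer: z[i] = sum_j phi_ij(x[j]),
    where phi_ij(t) = coeffs[i, j] @ hat_basis(t, knots)."""
    d_out, d_in, _ = coeffs.shape
    z = np.zeros(d_out)
    for i in range(d_out):
        for j in range(d_in):
            z[i] += coeffs[i, j] @ hat_basis(x[j], knots)
    return z

def linear_coeffs(W, knots):
    """Spline coefficients forcing every edge function to be the
    linear map phi_ij(t) = W[i, j] * t (linear precision of hats)."""
    return W[:, :, None] * knots[None, None, :]

rng = np.random.default_rng(0)
knots = np.linspace(-3.0, 3.0, 7)       # uniform knot grid
W = rng.standard_normal((3, 20))        # 3 factors x 20 stocks
x = rng.uniform(-2.5, 2.5, 20)          # one day's return vector (toy scale)
z = kan_layer(x, linear_coeffs(W, knots), knots)
print(np.allclose(z, W @ x))            # linear limit equals W @ x
```

In the linear limit the encoder is just the loading matrix $W$, so training the autoencoder can do no better (and no worse) than the optimal linear autoencoder, which is PCA.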

Paper Structure

This paper contains 15 sections, 5 theorems, 18 equations, 2 figures, 1 table.

Key Result

Lemma 1

A spline function $\varphi \in \mathcal{S}_k(G)$ whose second derivative vanishes identically must be of the form $\varphi(x) = wx$ for some $w \in \mathbb{R}$.
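A one-line sketch of the argument (the normalization used here, $\varphi(0)=0$, is an assumption standing in for however Definition 1 excludes a free constant term):

```latex
% phi'' = 0 on each knot interval  =>  phi is affine on each interval,
% and the spline's continuity at the knots glues these into one global affine map:
\varphi''(x) \equiv 0 \;\Longrightarrow\; \varphi(x) = wx + b \quad \text{on all of } [t_0, t_K],
% a no-constant-term normalization then removes the intercept:
\varphi(0) = 0 \;\Longrightarrow\; b = 0 \;\Longrightarrow\; \varphi(x) = wx.
```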

Figures (2)

  • Figure 1: KAN nonlinear factors (2015--2024). The red dashed line marks the COVID-19 crash of March 2020. All three factors exhibit sharp spikes at this date, demonstrating that the model treats the crash as qualitatively distinct from normal market movements.
  • Figure 2: Learned edge functions $\varphi(x)$ for all 20 stocks (contribution to Factor 1). If relationships were linear as PCA assumes, all curves would be straight lines. They are not, providing direct visual evidence of nonlinear factor structure.

Theorems & Definitions (12)

  • Definition 1: B-Spline Space
  • Definition 2: KAN Layer
  • Definition 3: KAN-PCA
  • Lemma 1: Linear Splines are Linear Maps
  • Proof
  • Lemma 2: Linear KAN Layer is Matrix Multiplication
  • Proof
  • Lemma 3: Optimal Linear Autoencoder $=$ PCA (Baldi & Hornik, 1989)
  • Proof
  • Theorem 4: KAN-PCA Reduces to PCA in the Linear Limit
  • ...and 2 more