KAE: Kolmogorov-Arnold Auto-Encoder for Representation Learning

Fangchen Yu; Ruilizhen Hu; Yidong Lin; Yuqi Ma; Zhenghao Huang; Wenye Li

TL;DR

This paper introduces the Kolmogorov-Arnold Auto-Encoder (KAE), which integrates Kolmogorov-Arnold Networks (KANs) with autoencoders (AEs) to enhance representation learning for retrieval, classification, and denoising tasks. The results suggest KAE's potential as a useful tool for representation learning.

Abstract

The Kolmogorov-Arnold Network (KAN) has recently gained attention as an alternative to traditional multi-layer perceptrons (MLPs), offering improved accuracy and interpretability by employing learnable activation functions on edges. In this paper, we introduce the Kolmogorov-Arnold Auto-Encoder (KAE), which integrates KAN with autoencoders (AEs) to enhance representation learning for retrieval, classification, and denoising tasks. Leveraging the flexible polynomial functions in KAN layers, KAE captures complex data patterns and non-linear relationships. Experiments on benchmark datasets demonstrate that KAE improves latent representation quality, reduces reconstruction errors, and achieves superior performance in downstream tasks such as retrieval, classification, and denoising, compared to standard autoencoders and other KAN variants. These results suggest KAE's potential as a useful tool for representation learning. Our code is available at https://github.com/SciYu/KAE/.
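As a rough illustration of the idea in the abstract, the sketch below runs a forward pass through an autoencoder whose layers apply a learnable polynomial to each input-output edge. It is a minimal sketch under stated assumptions, not the authors' implementation (see the repository linked above): the names `init_layer`, `poly_layer`, and `autoencode`, the sigmoid output non-linearity, the coefficient initialization, and the sizes `d_in`, `d_latent`, and `order` are all hypothetical choices made for this example.

```python
import math
import random


def init_layer(d_in, d_out, order, rng):
    """Random coefficients c[i][j][p] for output i, input j, power p.

    The small uniform scale is an assumption that keeps the pre-activation
    bounded for inputs in [0, 1].
    """
    scale = 1.0 / (d_in * (order + 1))
    return [[[rng.uniform(-scale, scale) for _ in range(order + 1)]
             for _ in range(d_in)]
            for _ in range(d_out)]


def poly_layer(coeffs, x):
    """Compute y_i = sigmoid(sum_j sum_p c[i][j][p] * x_j**p).

    Each edge (j -> i) carries its own polynomial of degree `order`;
    the sigmoid on the sum is an assumed choice of non-linearity.
    """
    out = []
    for row in coeffs:
        s = 0.0
        for edge_coeffs, x_j in zip(row, x):
            s += sum(c * x_j ** p for p, c in enumerate(edge_coeffs))
        out.append(1.0 / (1.0 + math.exp(-s)))
    return out


def autoencode(enc, dec, x):
    """Encode x to a latent vector z, then decode z back to x_hat."""
    z = poly_layer(enc, x)
    x_hat = poly_layer(dec, z)
    return z, x_hat


def mse(a, b):
    """Mean squared reconstruction error between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


rng = random.Random(0)
d_in, d_latent, order = 8, 3, 2  # hypothetical sizes for the example
enc = init_layer(d_in, d_latent, order, rng)
dec = init_layer(d_latent, d_in, order, rng)

x = [rng.random() for _ in range(d_in)]
z, x_hat = autoencode(enc, dec, x)
print(len(z), len(x_hat), mse(x, x_hat))
```

The weights here are untrained, so the printed reconstruction error only confirms that the shapes line up. In practice the coefficients would be fitted by minimizing the reconstruction error, and the latent vector `z` would then be used for the downstream tasks.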

Paper Structure (19 sections, 1 theorem, 6 equations, 4 figures, 5 tables)

Introduction
Related Work
Autoencoders (AEs)
Kolmogorov-Arnold Networks (KANs)
Kolmogorov-Arnold Representation Theorem
KAN Network and Its Applications
Kolmogorov–Arnold Auto-Encoder
KANs in Autoencoders
Overall Architecture
Polynomial Activation Function
Evaluation
Experimental Setup
Reconstruction Quality
Applications
Similarity Search
...and 4 more sections

Key Result

Theorem 1

For any smooth function $f: [0,1]^d \rightarrow \mathbb{R}$, there exist continuous functions $\phi_{k,j}: [0,1] \rightarrow \mathbb{R}$ and $\Phi_k: \mathbb{R} \rightarrow \mathbb{R}$ such that:
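
In the classical statement of the Kolmogorov-Arnold representation theorem, the decomposition takes the form below. The index ranges ($2d+1$ outer terms, $d$ inner terms) are taken from that classical statement and are an assumption here, not something given in this text:

$$
f(x_1, \ldots, x_d) = \sum_{k=1}^{2d+1} \Phi_k\!\left( \sum_{j=1}^{d} \phi_{k,j}(x_j) \right)
$$

Each inner function $\phi_{k,j}$ acts on a single coordinate $x_j$, and each outer function $\Phi_k$ acts on a scalar sum, so a multivariate function is expressed using only univariate functions and addition.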

Figures (4)

Figure 1: Model Comparison of AE, KAN, and KAE.
Figure 2: Recall@$\bm{N}$ of Similarity Search Across Datasets for Different Latent Dimensions.
Figure 3: Convergence Analysis of Test Loss Across Datasets for Latent Dimension $\bm{d_{\text{latent}}=16}$.
Figure 4: Model Capacity Analysis on the MNIST Dataset with Latent Dimension $\bm{d_{\text{latent}}=16}$. Bubble size represents the number of learnable parameters. For AE models, T = Tiny, S = Small, and B = Base.

Theorems & Definitions (1)

Theorem 1: Kolmogorov-Arnold Representation Theorem
