CF-KAN: Kolmogorov-Arnold Network-based Collaborative Filtering to Mitigate Catastrophic Forgetting in Recommender Systems

Jin-Duk Park; Kyung-Min Kim; Won-Yong Shin

TL;DR

The paper tackles catastrophic forgetting in recommender systems by replacing fixed MLP activations with Kolmogorov-Arnold networks (KANs) that learn edge-level nonlinearities. CF-KAN builds a KAN-based autoencoder to model sparse user-item interactions, enabling robust continual learning and interpretability via pruning. The approach achieves state-of-the-art recall and NDCG on ML-1M, Yelp, and Anime, with gains of up to 8.2% over strong baselines, while training faster than two-tower models. By grounding activations in the Kolmogorov-Arnold representation $f(\mathbf{x}) = \sum_{q=1}^{2n+1} \Phi_q\big( \sum_{p=1}^{n} \phi_{q,p}(x_p) \big)$, CF-KAN demonstrates that edge-level learning can balance plasticity and stability in dynamic recommendation scenarios and offers interpretable, sparse explanations of recommendations.

Abstract

Collaborative filtering (CF) remains essential in recommender systems, leveraging user-item interactions to provide personalized recommendations. Meanwhile, a number of CF techniques have evolved into sophisticated model architectures based on multi-layer perceptrons (MLPs). However, MLPs often suffer from catastrophic forgetting: they lose previously acquired knowledge when new information is learned, particularly in dynamic environments requiring continual learning. To tackle this problem, we propose CF-KAN, a new CF method utilizing Kolmogorov-Arnold networks (KANs). By learning nonlinear functions at the edge level, KANs are more robust to catastrophic forgetting than MLPs. Built upon a KAN-based autoencoder, CF-KAN is designed to effectively capture the intricacies of sparse user-item interactions and to retain information from previous data instances. Despite its simplicity, our extensive experiments demonstrate 1) CF-KAN's superiority over state-of-the-art methods in recommendation accuracy, 2) CF-KAN's resilience to catastrophic forgetting, underscoring its effectiveness in both static and dynamic recommendation scenarios, and 3) CF-KAN's edge-level interpretation, which facilitates the explainability of recommendations.
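
To make the idea of edge-level nonlinearities concrete, the following is a minimal forward-pass sketch of a KAN-style autoencoder over a binary interaction vector. It is not the authors' implementation: the piecewise-linear basis on a fixed grid (`GRID`, `hat_basis`), the coefficient initialization, the class names `KANLayer` and `KANAutoencoder`, and the toy sizes are all assumptions made for illustration, and training is omitted entirely. The last lines perturb a single coefficient of one edge function to illustrate the kind of locality that is commonly cited as the reason KANs may forget less: a user whose input does not fall in the affected region of that edge sees no change.

```python
import random

GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed uniform grid shared by all edges


def hat_basis(x, grid=GRID):
    """Piecewise-linear basis values of x, one per grid point (assumed basis)."""
    step = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / step) for g in grid]


class KANLayer:
    """Each edge (q, p) owns its own learnable univariate function phi_{q,p}."""

    def __init__(self, n_in, n_out, rng):
        # coef[q][p][k] is the k-th basis coefficient of phi_{q,p}
        self.coef = [[[rng.uniform(-0.1, 0.1) for _ in GRID]
                      for _ in range(n_in)] for _ in range(n_out)]

    def forward(self, x):
        basis = [hat_basis(x_p) for x_p in x]
        out = []
        for edges_q in self.coef:
            # output node q sums phi_{q,p}(x_p) over its incoming edges
            out.append(sum(sum(c * b for c, b in zip(coef_qp, basis_p))
                           for coef_qp, basis_p in zip(edges_q, basis)))
        return out


class KANAutoencoder:
    """Hypothetical two-layer autoencoder: items -> hidden -> item scores."""

    def __init__(self, n_items, hidden, seed=0):
        rng = random.Random(seed)
        self.encoder = KANLayer(n_items, hidden, rng)
        self.decoder = KANLayer(hidden, n_items, rng)

    def encode(self, interactions):
        return self.encoder.forward(interactions)

    def forward(self, interactions):
        return self.decoder.forward(self.encode(interactions))


model = KANAutoencoder(n_items=6, hidden=3)
user_a = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]  # interacted with item 0
user_b = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]  # did not interact with item 0
hidden_a, scores_b = model.encode(user_a), model.forward(user_b)

# Perturb one coefficient of encoder edge (q=0, p=0): the one tied to grid point 1.0.
delta = 0.05
model.encoder.coef[0][0][-1] += delta

new_hidden_a, new_scores_b = model.encode(user_a), model.forward(user_b)
# user_a's first hidden unit shifts by delta; user_b's scores are untouched.
```

In an MLP, by contrast, a weight update on the same connection rescales the contribution of every input value passing through it, which is one way to read the contrast the abstract draws between the two model families.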

Paper Structure (25 sections, 1 theorem, 10 equations, 7 figures, 4 tables)

1. Introduction
2. Methodology
2.1. Kolmogorov-Arnold Network
2.2. CF-KAN
Notation.
Model architecture.
Optimization.
Application to continual learning.
Interpretability.
3. Experiments
3.1. Experimental Settings
Datasets.
Evaluation protocol.
Competitors.
Implementation details.
...and 10 more sections

Key Result

Theorem 1

Let $f$ be a multivariate continuous function on a bounded domain. Then $f$ can be represented as a finite composition of continuous functions of a single variable and the two-argument operation of addition. Specifically, for a smooth function $f : [0, 1]^n \to \mathbb{R}$, it holds that $f(\mathbf{x}) = \sum_{q=1}^{2n+1} \Phi_q\big( \sum_{p=1}^{n} \phi_{q,p}(x_p) \big)$, where $\phi_{q,p} : [0, 1] \to \mathbb{R}$ and $\Phi_q : \mathbb{R} \to \mathbb{R}$ are continuous functions.
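
As a small worked illustration of the structure $f(\mathbf{x}) = \sum_{q=1}^{2n+1} \Phi_q\big( \sum_{p=1}^{n} \phi_{q,p}(x_p) \big)$, the sketch below evaluates that form for $n = 2$ and $f(x_1, x_2) = x_1 x_2$, using the identity $x_1 x_2 = \tfrac{1}{4}(x_1 + x_2)^2 - \tfrac{1}{4}(x_1 - x_2)^2$. The helper `ka_evaluate` and the particular choice of inner and outer functions are assumptions for illustration; they have the shape required by the theorem (with the unused three of the $2n+1 = 5$ outer terms set to zero) but are not the functions constructed in its proof.

```python
from typing import Callable, Sequence

Univariate = Callable[[float], float]


def ka_evaluate(x: Sequence[float],
                outer: Sequence[Univariate],
                inner: Sequence[Sequence[Univariate]]) -> float:
    """Evaluate sum_q outer[q]( sum_p inner[q][p](x[p]) )."""
    n = len(x)
    assert len(outer) == len(inner) == 2 * n + 1
    total = 0.0
    for Phi_q, phis_q in zip(outer, inner):
        # inner sum over the n single-variable functions feeding outer term q
        total += Phi_q(sum(phi(x_p) for phi, x_p in zip(phis_q, x)))
    return total


identity = lambda t: t
negate = lambda t: -t
zero = lambda t: 0.0

# Phi_1(u) = u^2 / 4, Phi_2(u) = -u^2 / 4, the remaining outer terms are zero.
outer = [lambda u: u * u / 4, lambda u: -u * u / 4, zero, zero, zero]
# phi_{1,1} = phi_{1,2} = phi_{2,1} = identity, phi_{2,2} = negation.
inner = [[identity, identity], [identity, negate],
         [zero, zero], [zero, zero], [zero, zero]]

value = ka_evaluate([0.3, 0.7], outer, inner)  # close to 0.3 * 0.7 = 0.21
```

A KAN replaces these hand-picked univariate functions with learnable ones, placed on the edges of the network.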

Figures (7)

Figure 1: Comparison of CF-KAN and benchmark methods in terms of accuracy, training time, and memory consumption on the Anime dataset.
Figure 2: The schematic overview of CF-KAN.
Figure 3: Heatmap visualization of model parameter variations for both CF-KAN and CF-MLP over the training steps on the MovieLens-1M dataset, when $h$ is set to 10 and 10 items are sampled for visualization. Each entry $(q,p)$ in the heatmap represents the variation of parameters $c_0$'s in $\phi_{q,p}$ of CF-KAN and the variation of parameters in the weight matrix of CF-MLP.
Figure 4: A toy example of interpretations where CF-KAN explains why pizza is recommended to a given user based on their past consumption (i.e., hamburger and Sprite).
Figure 5: Performance comparison of CF-KAN and CF-MLP in terms of R@20 on the ML-1M dataset in the continual learning scenario.
...and 2 more figures

Theorems & Definitions (1)

Theorem 1: KA Representation Theorem
