Beyond KAN: Introducing KarSein for Adaptive High-Order Feature Interaction Modeling in CTR Prediction

Yunxiao Shi, Wujiang Xu, Haimin Zhang, Qiang Wu, Min Xu

TL;DR

KarSein addresses the challenge of modeling high-order feature interactions in CTR prediction without incurring the prohibitive cost of explicit cross-feature enumeration. It extends Kolmogorov–Arnold Networks to vector-valued inputs, introducing a lightweight KarSein Interaction Layer with a pairwise multiplication catalyst, a single spline-based activation per feature, and a residual linear head, while jointly modeling implicit bit-wise interactions in a parallel path. The approach achieves state-of-the-art CTR accuracy across four real-world datasets with dramatically fewer parameters and demonstrates strong interpretability through symbolic regression and structural sparsity. The results establish KarSein as a practical, scalable, and interpretable alternative for high-order interaction modeling in CTR and related tasks.

Abstract

Modeling high-order feature interactions is crucial for click-through rate (CTR) prediction, yet traditional approaches typically predefine a maximum interaction order and exhaustively enumerate feature combinations up to that order. This paradigm depends heavily on prior domain knowledge to delimit the interaction space and incurs substantial computational overhead. As a result, conventional CTR models face a persistent tension between enriching representations with complex high-order interactions and keeping computation tractable. To address this dual challenge, this study introduces the Kolmogorov-Arnold Represented Sparse Efficient Interaction Network (KarSein). Drawing inspiration from the learnable activation mechanism in the Kolmogorov-Arnold Network (KAN), KarSein uses this mechanism to adaptively transform low-order basic features into high-order feature interactions, offering a novel approach to feature interaction modeling. KarSein extends the capabilities of KAN by introducing a more efficient architecture that significantly reduces computational costs while accommodating two-dimensional embedding vectors as feature inputs. Furthermore, it overcomes KAN's inability to spontaneously capture multiplicative relationships among features. Extensive experiments highlight the superiority of KarSein, demonstrating its ability to surpass not only the vanilla implementation of KAN in CTR prediction tasks but also other baseline methods. Remarkably, KarSein achieves exceptional predictive accuracy while maintaining a highly compact parameter size and minimal computational overhead. Moreover, KarSein exhibits strong interpretability and structural sparsity. As the first systematic adaptation of KAN to CTR prediction, KarSein offers a practical, parameter-efficient, and interpretable alternative for modeling complex feature interactions.
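
To make the enumeration cost mentioned in the abstract concrete, the following minimal sketch counts how many distinct feature combinations an explicit model would have to consider when it enumerates every interaction from order 2 up to a predefined maximum order. The function name `num_interactions` and the field count used in the usage note are illustrative assumptions, not values from the paper.

```python
from math import comb


def num_interactions(num_fields: int, max_order: int) -> int:
    """Count feature combinations of order 2..max_order over num_fields fields.

    Illustrative helper (hypothetical name): an order-k interaction is taken
    to be an unordered choice of k distinct fields, so there are C(n, k) of them.
    """
    return sum(comb(num_fields, k) for k in range(2, max_order + 1))


if __name__ == "__main__":
    # The field count 39 is an arbitrary example, not a figure from the paper.
    for order in (2, 3, 4):
        print(f"max order {order}: {num_interactions(39, order)} combinations")
```

Under this counting, each extra order multiplies the number of candidate interactions many times over, which is the overhead that adaptive approaches such as KarSein aim to avoid.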

Paper Structure (45 sections, 1 theorem, 37 equations, 9 figures, 6 tables)

Key Result

Theorem 1

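Given a continuous function $f: [0,1]^m \rightarrow \mathbb{R}$, there exist continuous univariate functions $\phi_{q,p} : [0,1] \rightarrow \mathbb{R}$ and $\Phi_q : \mathbb{R} \rightarrow \mathbb{R}$ such that:

$$f(x_1, \ldots, x_m) = \sum_{q=1}^{2m+1} \Phi_q\left(\sum_{p=1}^{m} \phi_{q,p}(x_p)\right)$$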
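
As a small illustration of the representation form in Theorem 1 (outer univariate functions $\Phi_q$ applied to sums of inner univariate functions $\phi_{q,p}$), the sketch below writes the product $x \cdot y$ in that shape using the identity $xy = \frac{(x+y)^2 - (x-y)^2}{4}$. The function name `kan_style_product` is hypothetical. Only two outer terms are needed here; the remaining terms allowed by the theorem can be taken as zero. This is a hand-built example, not a decomposition learned by KAN or KarSein.

```python
def kan_style_product(x: float, y: float) -> float:
    """Compute x*y as sum_q Phi_q(phi_{q,1}(x) + phi_{q,2}(y)) with two terms."""
    # Inner univariate functions phi_{q,p}: one pair (for x and for y) per q.
    inner = [
        (lambda a: a, lambda b: b),    # q = 1 builds x + y
        (lambda a: a, lambda b: -b),   # q = 2 builds x - y
    ]
    # Outer univariate functions Phi_q.
    outer = [
        lambda u: u * u / 4.0,         # Phi_1(u) =  u^2 / 4
        lambda u: -u * u / 4.0,        # Phi_2(u) = -u^2 / 4
    ]
    return sum(
        Phi(phi_x(x) + phi_y(y))
        for Phi, (phi_x, phi_y) in zip(outer, inner)
    )


if __name__ == "__main__":
    print(kan_style_product(0.3, 0.7))
```

The example shows that a multiplicative interaction can in principle be expressed using only addition and univariate functions, although the required outer functions (here quadratics) still have to be found by the learner.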

Figures (9)

  • Figure 1: Comparison between KarSein and pioneering methods. KarSein models high-order feature interactions simply via activation, without space transformation. It is more intuitively explainable and structurally sparse.
  • Figure 2: An activation function is parameterized as a B-spline.
  • Figure 3: Visualization of KAN for fitting simple second-order feature interactions across three different settings.
  • Figure 4: Visualization of KAN for CTR prediction.
  • Figure 5: Network architectures and visualization of feature interaction modeling in KarSein and KAN. Subfigure (1) illustrates KarSein, where the left branch is the vector-wise architecture for modeling explicit feature interactions and the right branch is the bit-wise architecture for modeling implicit feature interactions. Subfigure (2) shows the KAN baseline. In this toy schematic, the KarSein-explicit branch is configured with an $m$–2–1 architecture, while KarSein-implicit and KAN both use an $mD$–3–1 architecture. For KarSein, we color explicit/implicit feature interactions with a light-to-dark red/green gradient to indicate progressively higher orders. We further annotate the order of explicit feature interactions from order 1 at the input to order 18 at the output, computed by the order-calculation equation as $2 \times 3^2 = 18$.
  • ...and 4 more figures
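
The caption of Figure 2 refers to an activation function parameterized as a B-spline. A minimal sketch of that idea, assuming a uniform knot grid on $[0,1]$ extended on both sides and the standard Cox-de Boor recursion, is given below. The helper names (`uniform_knots`, `bspline_basis`, `spline_activation`), the grid size, and the degree are illustrative assumptions and are not taken from the paper's implementation.

```python
def uniform_knots(grid_size: int, degree: int, lo: float = 0.0, hi: float = 1.0) -> list[float]:
    """Uniform knot vector on [lo, hi], extended by `degree` knots on each side."""
    step = (hi - lo) / grid_size
    return [lo + (j - degree) * step for j in range(grid_size + 2 * degree + 1)]


def bspline_basis(i: int, k: int, t: float, knots: list[float]) -> float:
    """Cox-de Boor recursion: value of the i-th B-spline basis of degree k at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0.0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den != 0.0:
        right = (knots[i + k + 1] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots)
    return left + right


def spline_activation(t: float, coeffs: list[float], knots: list[float], degree: int = 3) -> float:
    """Activation value: a weighted sum of B-spline bases; coeffs are the learnable part."""
    return sum(c * bspline_basis(i, degree, t, knots) for i, c in enumerate(coeffs))


if __name__ == "__main__":
    degree, grid_size = 3, 5                      # assumed settings for illustration
    knots = uniform_knots(grid_size, degree)
    num_basis = len(knots) - degree - 1           # equals grid_size + degree
    coeffs = [0.1 * i for i in range(num_basis)]  # stand-in for learned coefficients
    print(spline_activation(0.37, coeffs, knots, degree))
```

In a trainable layer the coefficients would be updated by gradient descent, so the shape of the activation itself is learned rather than fixed in advance.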

Theorems & Definitions (1)

  • Theorem 1