CG-FKAN: Compressed-Grid Federated Kolmogorov-Arnold Networks for Communication Constrained Environment

Seunghun Yu, Youngjoon Lee, Jinu Gong, Joonhyuk Kang

TL;DR

Federated learning with Kolmogorov-Arnold Networks (KAN) offers interpretability but suffers from the communication overhead induced by grid extension. This paper introduces CG-FKAN, which sparsifies spline coefficients to fit a fixed uplink budget, preserving the informative components of the grid while reducing data transmission. A theoretical bound shows that the sparsification error is controlled relative to the optimal sparsification, and experiments demonstrate up to 13.6% lower RMSE than fixed-grid KAN with substantial communication savings, with performance approaching that of grid-extended KAN. The approach is robust to data heterogeneity and offers a practical, communication-efficient solution for FL with transparent spline-based models.

Abstract

Federated learning (FL), widely used in privacy-critical applications, suffers from limited interpretability, whereas Kolmogorov-Arnold Networks (KAN) address this limitation via learnable spline functions. However, existing FL studies applying KAN overlook the communication overhead introduced by grid extension, which is essential for modeling complex functions. In this letter, we propose CG-FKAN, which compresses extended grids by sparsifying and transmitting only essential coefficients under a communication budget. Experiments show that CG-FKAN achieves up to 13.6% lower RMSE than fixed-grid KAN in communication-constrained settings. In addition, we derive a theoretical upper bound on its approximation error.

Paper Structure

This paper contains 15 sections, 19 equations, 4 figures, 2 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Comparison of the required data between the grid-extended KAN and CG-FKAN in a regression task. The figure shows the required data transmitted by all participating clients in each round, with the bottom bar indicating the initial grid and the stacked bars showing data transmitted to the server as the grid size increases.
  • Figure 2: Overview of CG-FKAN. At round $t$, each client $k$ trains its local KAN model and performs sparsification by dropping coefficients in all spline functions. To this end, it retains only coefficients that minimize pairwise differences from neighboring coefficients, thereby preserving the most informative variations. Only the remaining coefficients are transmitted to the server as the parameter vector ${w}_k^{(t)}$. Then, the server aggregates the sparsified updates from clients to construct the global model ${w}_g^{(t)}$.
  • Figure 3: RMSE results of CG-FKAN compared with baselines across four benchmark functions (Feynman eq. I.30.3, Feynman eq. I.37.4, Bessel, and Legendre functions).
  • Figure 4: Upper: Required bits per round to transmit parameters to the server during regression tasks on the Feynman equation I.30.3. Lower: Approximation error versus sparsity ratio for the case $g=10$ and $o=3$, comparing different sparsification methods.
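
The pipeline described in the captions of Figures 2 and 4 (local sparsification under an uplink budget, then server-side aggregation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names are hypothetical, the score (sum of absolute differences to neighboring coefficients) is only one possible reading of "preserving the most informative variations", dropped coefficients are represented as `None`, the server is assumed to average each coefficient over the clients that sent it, and the bit count assumes 32-bit values with no index overhead. The coefficient count per spline uses the standard B-spline relation of grid size plus spline order.

```python
from typing import List, Optional, Sequence


def coeffs_per_spline(g: int, o: int) -> int:
    """B-spline coefficients for one spline with grid size g and order o."""
    return g + o


def uplink_bits(num_splines: int, kept_per_spline: int, bits_per_value: int = 32) -> int:
    """Assumed uplink cost: kept values only, no index or header overhead."""
    return num_splines * kept_per_spline * bits_per_value


def sparsify(coeffs: Sequence[float], keep: int) -> List[Optional[float]]:
    """Keep the `keep` coefficients with the largest neighbor-difference score.

    Assumed scoring rule (not taken from the paper): a coefficient's score is
    the sum of absolute differences to its left and right neighbors.
    Dropped coefficients are returned as None.
    """
    n = len(coeffs)
    if keep >= n:
        return list(coeffs)
    scores = []
    for i in range(n):
        left = abs(coeffs[i] - coeffs[i - 1]) if i > 0 else 0.0
        right = abs(coeffs[i + 1] - coeffs[i]) if i < n - 1 else 0.0
        scores.append(left + right)
    # sorted() is stable, so ties are broken in favor of the lower index.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else None for i, c in enumerate(coeffs)]


def aggregate(updates: Sequence[Sequence[Optional[float]]]) -> List[Optional[float]]:
    """Average each coefficient over the clients that transmitted it.

    Assumed server rule: positions no client transmitted stay None.
    """
    n = len(updates[0])
    result: List[Optional[float]] = []
    for j in range(n):
        values = [u[j] for u in updates if u[j] is not None]
        result.append(sum(values) / len(values) if values else None)
    return result


if __name__ == "__main__":
    full = coeffs_per_spline(g=10, o=3)       # setting from the Figure 4 caption
    client_a = sparsify([0.0, 0.0, 1.0, 0.0, 0.0], keep=3)
    client_b = sparsify([0.0, 0.5, 0.5, 0.5, 0.0], keep=3)
    print(full, client_a, client_b, aggregate([client_a, client_b]))
    print(uplink_bits(num_splines=4, kept_per_spline=3))
```

Under these assumptions the per-spline uplink cost depends only on the number of kept coefficients, so extending the grid on the client does not increase the transmitted size as long as `keep` is held fixed.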