A Genetic Algorithm-Based Approach for Automated Optimization of Kolmogorov-Arnold Networks in Classification Tasks

Quan Long, Bin Wang, Bing Xue, Mengjie Zhang

TL;DR

This work introduces GA-KAN, a genetic algorithm-based framework to automatically optimize Kolmogorov-Arnold Networks for classification, incorporating sparse connectivity and grid-value search to replace manual pruning and tuning. It presents a novel encoding/decoding scheme with a degradation mechanism and zero masks to enable variable-depth KANs, and evaluates fitness via LBFGS training, yielding accurate and highly interpretable models. Across five UCI datasets and toy benchmarks, GA-KAN achieves competitive or superior accuracy and AUC while substantially reducing parameter counts, and provides symbolic formulas and feature attribution to enhance transparency. The approach reduces human design effort and demonstrates potential for neural architecture search in interpretable, parameter-efficient KANs, with future work aimed at regression tasks and scalable deployment.

Abstract

To address the issue of interpretability in multilayer perceptrons (MLPs), Kolmogorov-Arnold Networks (KANs) were introduced in 2024. However, optimizing KAN structures is labor-intensive, typically requiring manual intervention and parameter tuning. This paper proposes GA-KAN, a genetic algorithm-based approach that automates the optimization of KANs and requires no human intervention in the design process. To the best of our knowledge, this is the first time that evolutionary computation has been explored to optimize KANs automatically. Inspired by the effectiveness of sparse connectivity in reducing the number of parameters in MLPs, GA-KAN also explores sparse connectivity to tackle the large parameter spaces of KANs. GA-KAN is validated on two toy datasets, achieving optimal results without the manual tuning required by the original KAN. GA-KAN also demonstrates superior performance across five classification datasets, outperforming traditional methods on all of them and providing interpretable symbolic formulae for the Wine and Iris datasets, thereby enhancing model transparency. In addition, GA-KAN significantly reduces the number of parameters relative to the standard KAN on all five datasets. The core contributions of GA-KAN are automated optimization, a new encoding strategy, and a new decoding process, which together improve accuracy and interpretability and reduce the number of parameters.

Paper Structure

This paper contains 21 sections, 12 equations, 10 figures, 5 tables, 2 algorithms.

Figures (10)

  • Figure 1: The overall framework of the proposed method, where $t$ denotes the generation, $P_{t}$ represents the parent population, and $Q_{t}$ represents the offspring population. The next parent population, $P_{t+1}$, is selected based on a combination of $P_{t}$ and $Q_{t}$. The dashed box indicates an element, while the solid box highlights a key step in GA.
  • Figure 2: An example of the encoded chromosome. The connections between lower-layer and upper-layer neurons are encoded using 0s and 1s, where 0 indicates no connection and 1 indicates a connection. The encoding of all neurons is organized into a chromosome. Additionally, 6 fixed bits are used to encode the grid value, and 2 fixed bits are used to encode the network depth.
  • Figure 3: An example of the degradation mechanism, a process in which a layer is removed if all of its connections are inactive, reducing the depth of the network.
  • Figure 4: An example of crossover. With a given probability, crossover occurs between neurons located in the same layer and position in the two parent networks, indicated in purple.
  • Figure 5: Results on the two toy datasets.
  • ...and 5 more figures
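
The captions for Figures 2 to 4 describe a binary chromosome (connection bits, 6 grid bits, 2 depth bits), a check for layers whose connections are all inactive, and crossover between neurons at matching positions. The following is a minimal sketch of those ideas, not the authors' implementation. Several details are assumptions made only for illustration: the field order `[connection bits | grid bits | depth bits]`, reading the fixed bits as unsigned binary integers offset by 1, and the hypothetical helper names `decode`, `inactive_layers`, and `neuron_crossover`.

```python
import random
from typing import List, Tuple

GRID_BITS = 6   # fixed bits for the grid value (Figure 2 caption)
DEPTH_BITS = 2  # fixed bits for the network depth (Figure 2 caption)

Mask = List[List[int]]  # one row of 0/1 bits per upper-layer neuron


def bits_to_int(bits: List[int]) -> int:
    """Read a list of 0/1 values as an unsigned binary integer."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def decode(chromosome: List[int], layer_sizes: List[int]) -> Tuple[List[Mask], int, int]:
    """Split a flat chromosome into per-layer masks, grid value, and depth.

    Assumed layout: [connection bits | 6 grid bits | 2 depth bits].
    """
    pairs = list(zip(layer_sizes, layer_sizes[1:]))
    expected = sum(n_in * n_out for n_in, n_out in pairs) + GRID_BITS + DEPTH_BITS
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    masks, pos = [], 0
    for n_in, n_out in pairs:
        # Each upper-layer neuron owns n_in bits, one per lower-layer neuron.
        masks.append([chromosome[pos + r * n_in: pos + (r + 1) * n_in]
                      for r in range(n_out)])
        pos += n_in * n_out
    grid = bits_to_int(chromosome[pos:pos + GRID_BITS]) + 1   # assumed offset
    depth = bits_to_int(chromosome[pos + GRID_BITS:]) + 1     # assumed offset
    return masks, grid, depth


def inactive_layers(masks: List[Mask]) -> List[int]:
    """Indices of layers whose connections are all 0 (degradation candidates)."""
    return [i for i, mask in enumerate(masks)
            if not any(any(row) for row in mask)]


def neuron_crossover(parent_a: List[Mask], parent_b: List[Mask],
                     rate: float, rng: random.Random) -> Tuple[List[Mask], List[Mask]]:
    """Swap neurons at the same layer and position with probability `rate`."""
    child_a = [[row[:] for row in mask] for mask in parent_a]
    child_b = [[row[:] for row in mask] for mask in parent_b]
    for layer in range(len(child_a)):
        for neuron in range(len(child_a[layer])):
            if rng.random() < rate:
                child_a[layer][neuron], child_b[layer][neuron] = (
                    child_b[layer][neuron], child_a[layer][neuron])
    return child_a, child_b
```

For example, under these assumptions `decode([1,0,0,1, 0,0, 0,0,0,1,0,1, 0,1], [2, 2, 1])` yields two masks, a grid value of 6, and a depth of 2, and `inactive_layers` flags the second mask because all of its bits are 0. How the flagged layer is actually removed, and how the depth bits interact with the masks, is left out here because the captions do not specify it.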