Human-in-the-Loop Feature Selection Using Interpretable Kolmogorov-Arnold Network-based Double Deep Q-Network

Md Abrar Jahin, M. F. Mridha, Nilanjan Dey, Md. Jakir Hossen

TL;DR

We address per-instance feature selection in high-dimensional data by integrating simulated human feedback into a DDQN framework with a KAN head. The approach pairs Beta-distribution-based stochastic feature gating with a differentiable, interpretable KAN and compares it against a standard MLP baseline, achieving up to $93\%$ test accuracy on MNIST and $83\%$ on FashionMNIST with $8\times8$ inputs while using four times fewer hidden neurons than the MLP. Interpretability is enhanced through pruning, visualization, and symbolification of learned activations, with instance-wise feature subsets providing case-specific explanations. The method scales to CIFAR-10/100, improves macro F1 and calibration, and keeps latency under $1~\mathrm{ms}$, making it suitable for real-time, adaptive decision-making with minimal human oversight. Overall, HITL-KAN-DDQN offers a scalable, interpretable framework for dynamic feature selection that balances accuracy, efficiency, and explainability in resource-constrained settings.

Abstract

Feature selection is critical for improving the performance and interpretability of machine learning models, particularly in high-dimensional spaces where complex feature interactions can reduce accuracy and increase computational demands. Existing approaches often rely on static feature subsets or manual intervention, limiting adaptability and scalability. Dynamic, per-instance feature selection and model-specific interpretability in reinforcement learning remain underexplored. This study proposes a human-in-the-loop (HITL) feature selection framework integrated into a Double Deep Q-Network (DDQN) using a Kolmogorov-Arnold Network (KAN). Our approach leverages simulated human feedback and stochastic sampling from a Beta distribution to iteratively refine feature subsets per data instance, improving flexibility in feature selection. The KAN-DDQN achieved test accuracies of 93% on MNIST and 83% on FashionMNIST, outperforming conventional MLP-DDQN models by up to 9%. The KAN-based model provided high interpretability via symbolic representation while using 4 times fewer hidden-layer neurons than the MLP. By comparison, the models without feature selection achieved test accuracies of only 58% on MNIST and 64% on FashionMNIST, highlighting significant gains with our framework. We further validate scalability on CIFAR-10 and CIFAR-100, achieving up to 30% relative macro F1 improvement on MNIST and 5% on CIFAR-10, while reducing calibration error by 25%. Complexity analysis confirms real-time feasibility with latency below 1 ms and parameter counts under 0.02M. Pruning and visualization further enhanced model transparency by elucidating decision pathways. These findings present a scalable, interpretable solution for feature selection that is suitable for applications requiring real-time, adaptive decision-making with minimal human oversight.
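
The Beta-based per-instance gating described in the abstract can be illustrated with a minimal sketch. The abstract only states that Beta sampling is used to refine feature subsets per instance, so everything concrete here is an assumption made for illustration: the function name `beta_gate`, the per-feature `alpha`/`beta` parameters, and the hard `threshold` that turns samples into a binary mask.

```python
import numpy as np


def beta_gate(x, alpha, beta, threshold=0.5, rng=None):
    """Sample one gate value per feature from Beta(alpha, beta) and mask x.

    Hypothetical helper: the parameterization and the hard threshold are
    illustrative assumptions, not the paper's exact procedure.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = rng.beta(alpha, beta)               # per-feature samples in [0, 1]
    mask = (probs > threshold).astype(x.dtype)  # 1 keeps a feature, 0 drops it
    return x * mask, mask, probs


rng = np.random.default_rng(0)
x = rng.random(64)           # one flattened 8x8 input, as in the MNIST setup
alpha = np.full(64, 2.0)     # assumed Beta shape parameters per feature
beta = np.full(64, 2.0)
selected, mask, probs = beta_gate(x, alpha, beta, rng=rng)
print(int(mask.sum()), "of", mask.size, "features kept for this instance")
```

Because the gate is resampled for every input, two instances can keep different feature subsets, which is what makes the selection instance-wise rather than static.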

Paper Structure

This paper contains 54 sections, 25 equations, 8 figures, 7 tables, 2 algorithms.

Figures (8)

  • Figure 1: System overview. The feature-selection head (FSNet) maps an input image to per-feature probabilities via a differentiable stochastic gate aligned to simulated feedback. The selected features are fed to a DDQN head (KAN or MLP). Replay-based training updates the Q-network, with periodic target sync. The gate induces instance-wise sparsity and exposes feature-level rationale, while the KAN head provides model-specific interpretability.
  • Figure 2: Sample preprocessed $8 \times 8$-pixel MNIST (left) and FashionMNIST (right) images with their corresponding feedback maps ($\sigma = 5.0$).
  • Figure 3: (a) Training accuracy per epoch comparison of KAN and MLP-based DDQN on MNIST (left) and FashionMNIST (right), and (b) confusion matrix of KAN-DDQN on MNIST (left) and FashionMNIST (right).
  • Figure 4: (a) Pruned KAN architecture after removing neurons with both low incoming and outgoing activation magnitudes, exposing a minimal functional skeleton; (b) feature importances extracted from auto-symbolic forms of learned splines, indicating how often input features appear in simplified expressions; (c) early training trajectory (first five steps), where the policy concentrates mass on discriminative inputs before spreading to additional features. Together, these link sparsity, symbolic structure, and policy behavior.
  • Figure 5: Activation functions for middle neurons in the KAN agent's hidden layer on the MNIST dataset. Each subplot shows how a specific neuron transforms inputs via the spline activation, illustrating various activation behaviors across the layer.
  • ...and 3 more figures
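
The Figure 1 caption mentions replay-based Q-network updates with a periodically synced target network. As a rough sketch of what that involves, the standard Double DQN target lets the online network choose the next action and the target network evaluate it. The function name, array shapes, and toy numbers below are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Standard Double DQN bootstrap targets for a replay batch.

    q_online_next, q_target_next: (batch, n_actions) Q-values for next states.
    rewards, dones: (batch,) arrays; dones is 1.0 where the episode ended.
    """
    # Action selection uses the online network...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...while action evaluation uses the (periodically synced) target network.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * next_values


# Toy batch of two transitions with two actions each (made-up numbers).
q_online_next = np.array([[1.0, 2.0], [0.5, 0.1]])
q_target_next = np.array([[10.0, 20.0], [30.0, 40.0]])
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
targets = double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.5)
print(targets)  # [11.  0.]
```

Decoupling selection from evaluation in this way is what distinguishes Double DQN from plain DQN, where the target network both picks and scores the next action.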