Human-in-the-Loop Feature Selection Using Interpretable Kolmogorov-Arnold Network-based Double Deep Q-Network
Md Abrar Jahin, M. F. Mridha, Nilanjan Dey, Md. Jakir Hossen
TL;DR
We address per-instance feature selection in high-dimensional data by integrating simulated human-in-the-loop (HITL) feedback into a Double Deep Q-Network (DDQN) with a Kolmogorov-Arnold Network (KAN) head. The approach pairs Beta-distribution-based stochastic feature gating with a differentiable, interpretable KAN and is compared against a standard MLP baseline. It achieves up to $93\%$ test accuracy on MNIST and $83\%$ on FashionMNIST with $8\times8$ inputs while using four times fewer hidden neurons than the MLP baseline. Interpretability is enhanced through pruning, visualization, and symbolification of learned activations, and the instance-wise feature subsets provide case-specific explanations. The method scales to CIFAR-10/100, improves macro F1 and calibration, and keeps latency under $1~\mathrm{ms}$, making it suitable for real-time, adaptive decision-making with minimal human oversight. Overall, HITL-KAN-DDQN offers a scalable, interpretable framework for dynamic feature selection that balances accuracy, efficiency, and explainability in resource-constrained settings.
Abstract
Feature selection is critical for improving the performance and interpretability of machine learning models, particularly in high-dimensional spaces where complex feature interactions can reduce accuracy and increase computational demands. Existing approaches often rely on static feature subsets or manual intervention, limiting adaptability and scalability. Dynamic, per-instance feature selection and model-specific interpretability in reinforcement learning remain underexplored. This study proposes a human-in-the-loop (HITL) feature selection framework integrated into a Double Deep Q-Network (DDQN) using a Kolmogorov-Arnold Network (KAN). Our approach leverages simulated human feedback and stochastic sampling from a Beta distribution to iteratively refine the feature subset for each data instance, improving flexibility in feature selection. The KAN-DDQN achieved test accuracies of 93% on MNIST and 83% on FashionMNIST, outperforming conventional MLP-DDQN models by up to 9%. The KAN-based model provided high interpretability via symbolic representation while using 4 times fewer neurons in the hidden layer than the MLPs. By comparison, the models without feature selection achieved test accuracies of only 58% on MNIST and 64% on FashionMNIST, highlighting significant gains with our framework. We further validate scalability on CIFAR-10 and CIFAR-100. The framework achieves up to 30% relative macro F1 improvement on MNIST and 5% on CIFAR-10, while reducing calibration error by 25%. Complexity analysis confirms real-time feasibility, with latency below 1 ms and parameter counts under 0.02M. Pruning and visualization further enhanced model transparency by elucidating decision pathways. These findings present a scalable, interpretable solution for feature selection that is suitable for applications requiring real-time, adaptive decision-making with minimal human oversight.
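To make the idea of Beta-based, per-instance feature gating concrete, the following is a minimal sketch and not the authors' implementation. It assumes each feature has its own Beta shape parameters, that a feature is kept when its sampled score exceeds a fixed threshold, and that simulated feedback is a binary relevance vector added to the shape parameters. The names `beta_feature_gate`, `update_from_feedback`, and `threshold`, as well as the update rule itself, are hypothetical choices made for illustration.

```python
import numpy as np

def beta_feature_gate(x, alpha, beta, rng, threshold=0.5):
    """Gate one instance with a binary mask drawn from Beta-distributed scores."""
    # One keep score in (0, 1) per feature; alpha and beta are per-feature shapes.
    scores = rng.beta(alpha, beta, size=x.shape)
    # Assumption: keep a feature when its sampled score exceeds the threshold.
    mask = (scores > threshold).astype(x.dtype)
    return x * mask, mask

def update_from_feedback(alpha, beta, feedback):
    """Hypothetical update: feedback is 1 for 'relevant', 0 for 'irrelevant'."""
    # Relevant features shift mass toward high scores, irrelevant toward low.
    return alpha + feedback, beta + (1.0 - feedback)

rng = np.random.default_rng(0)
x = rng.random(64)             # stand-in for a flattened 8x8 input
alpha = np.full(64, 2.0)       # illustrative initial shape parameters
beta = np.full(64, 2.0)

gated, mask = beta_feature_gate(x, alpha, beta, rng)
print("features kept:", int(mask.sum()), "of", mask.size)
```

Because the mask is resampled for every input, two instances can retain different feature subsets, which is what distinguishes this style of gating from a static, dataset-level selection. In a full pipeline the gated vector would be passed to the Q-network, and the feedback-driven update would run between training iterations.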
