Strategic inputs: feature selection from game-theoretic perspective
Chi Zhao, Jing Liu, Elena Parilina
TL;DR
The paper tackles the computational burden of feature selection on large-scale tabular data by introducing an end-to-end framework built on a two-stage pipeline: diversity-based data sampling to reduce evaluation cost, and a cooperative-game-theoretic approach to quantify feature importance without retraining. It adapts ShapG and Permutation Feature Importance (PFI) within a graph-based interaction model and moves importance evaluation to prediction time, leveraging CIS/Shapley-based allocations to select a robust feature subset. Empirical evaluation on 10 TabM benchmark datasets shows that the method substantially reduces training time while preserving predictive performance, with ShapG favoring classification speed and PFI offering stability in regression tasks. The work highlights a practical, scalable approach to feature selection for large tabular datasets and outlines avenues for broader comparisons and applications.
Abstract
The exponential growth of data volumes has led to escalating computational costs in machine learning model training. At the same time, many features fail to contribute positively to model performance while consuming substantial computational resources. This paper presents an end-to-end feature selection framework for tabular data based on game theory. We formulate feature selection as a cooperative game in which features are modeled as players and their importance is determined by evaluating synergistic interactions and marginal contributions. The proposed framework comprises four core components: sample selection, game-theoretic feature importance evaluation, redundant feature elimination, and optimized model training. Experimental results demonstrate that the proposed method achieves a substantial reduction in computation while preserving predictive performance, thereby offering an efficient solution to the computational challenges of large-scale machine learning. The source code is available at https://github.com/vectorsss/strategy_inputs.
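
To make the "features as players" idea concrete, the following minimal sketch computes exact Shapley values for a toy cooperative game and drops features whose allocation is zero. It is an illustration only, not the paper's ShapG or CIS procedure: the feature names `a`, `b`, `c`, the characteristic function `toy_value`, and the selection threshold are all hypothetical, and in practice the value of a coalition would come from a model's score on that feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player for characteristic function `value`.

    `value` maps a frozenset of players to a number. Exponential in the
    number of players, so only suitable for tiny illustrative games.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Probability weight of a coalition of size k preceding p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical score: a and b are useful and synergistic, c is redundant."""
    score = 0.0
    if "a" in coalition:
        score += 0.6
    if "b" in coalition:
        score += 0.3
    if "a" in coalition and "b" in coalition:
        score += 0.1  # synergy only realized when both are present
    return score


features = ["a", "b", "c"]
phi = shapley_values(features, toy_value)

# Keep features with a strictly positive allocation (hypothetical threshold).
selected = [f for f in features if phi[f] > 1e-9]
```

In this toy game the synergy term is split evenly between `a` and `b`, while `c` never changes any coalition's value and so receives zero and is eliminated. The allocations also sum to the value of the full feature set, which is the efficiency property of the Shapley value.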
