Kernel Banzhaf: A Fast and Robust Estimator for Banzhaf Values
Yurong Liu, R. Teal Witter, Flip Korn, Tarfah Alrashed, Dimitris Paparas, Christopher Musco, Juliana Freire
TL;DR
Kernel Banzhaf is a regression-based estimator for Banzhaf feature attributions that avoids the exponential cost of exact computation. It formulates Banzhaf values as the solution to a linear regression problem and solves a small least-squares instance built from paired samples, achieving higher accuracy and sample efficiency than Monte Carlo-based methods, with strong theoretical guarantees. The approach extends to probabilistic values and performs robustly across eight datasets, covering both tree-based and neural models, under noise and adversarial perturbations. The result is a scalable, theoretically grounded tool for reliable feature attribution in complex models.
Abstract
Banzhaf values provide a popular, interpretable alternative to the widely used Shapley values for quantifying the importance of features in machine learning models. Like Shapley values, Banzhaf values take time exponential in the number of features to compute exactly, necessitating the use of efficient estimators. Existing estimators, however, are limited to Monte Carlo sampling methods. In this work, we introduce Kernel Banzhaf, the first regression-based estimator for Banzhaf values. Our approach leverages a novel regression formulation whose exact solution corresponds to the exact Banzhaf values. Inspired by the success of Kernel SHAP for Shapley values, Kernel Banzhaf efficiently solves a sampled instance of this regression problem. Through empirical evaluations across eight datasets, we find that Kernel Banzhaf significantly outperforms existing Monte Carlo methods in terms of accuracy, sample efficiency, robustness to noise, and feature ranking recovery. Finally, we complement our experimental evaluation with strong theoretical guarantees on Kernel Banzhaf's performance.
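To make the regression view concrete, the following is a minimal sketch, not the authors' implementation. It assumes one standard way to write such a regression: each subset S becomes a row with entry +1/2 for features in S and -1/2 for features outside S, and the target is the value v(S). Over all 2^n subsets these columns are orthogonal, so the least-squares solution reduces to the usual Banzhaf formula. The sampled variant draws subsets uniformly and adds each subset's complement, which is one plausible reading of the paired sampling mentioned in the TL;DR. The names `toy_game`, `exact_banzhaf`, `regression_banzhaf`, and `paired_samples` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

# Hypothetical toy value function on n = 8 features: additive weights plus
# one interaction between features 0 and 1. Not taken from the paper.
WEIGHTS = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0])

def toy_game(z):
    """Value of the subset encoded by the 0/1 indicator vector z."""
    return float(WEIGHTS @ z + 2.0 * z[0] * z[1])

def exact_banzhaf(value_fn, n):
    """Brute force: phi_i = (sum_{S with i} v(S) - sum_{S without i} v(S)) / 2^(n-1)."""
    phi = np.zeros(n)
    for mask in itertools.product([0, 1], repeat=n):
        z = np.array(mask)
        v = value_fn(z)
        # v(S) counts positively for features in S and negatively otherwise.
        phi += np.where(z == 1, v, -v)
    return phi / 2 ** (n - 1)

def regression_banzhaf(value_fn, subsets):
    """Least squares with features +1/2 (in subset) and -1/2 (not in subset)."""
    Z = np.asarray(subsets, dtype=float)
    A = Z - 0.5                                  # assumed +-1/2 encoding
    b = np.array([value_fn(z) for z in Z])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

def paired_samples(n, num_pairs, rng):
    """Uniformly random subsets, each followed later by its complement."""
    Z = rng.integers(0, 2, size=(num_pairs, n))
    return np.vstack([Z, 1 - Z])

n = len(WEIGHTS)
exact = exact_banzhaf(toy_game, n)

# Regression over all 2^n subsets: matches the brute-force values.
all_subsets = list(itertools.product([0, 1], repeat=n))
full = regression_banzhaf(toy_game, all_subsets)

# Sampled regression: 40 pairs = 80 evaluations instead of 256.
rng = np.random.default_rng(0)
approx = regression_banzhaf(toy_game, paired_samples(n, 40, rng))

print("exact :", np.round(exact, 3))
print("full  :", np.round(full, 3))
print("approx:", np.round(approx, 3))
```

In this sketch, pairing each subset with its complement makes every column of the sampled design matrix sum to zero, so a constant offset in the value function does not bias the estimate. A purely additive value function is therefore recovered exactly whenever the sampled matrix has full column rank. For games with interactions, such as `toy_game`, the sampled solution is only an approximation of the exact values.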
