Computing Strategic Responses to Non-Linear Classifiers
Jack Geary, Boyan Gao, Henry Gouk
TL;DR
This work tackles strategic classification for non-linear classifiers by introducing a Lagrangian-dual (LD) formulation for computing agents' best responses, that is, the manipulated inputs they submit to the classifier. The method reproduces linear best responses exactly and offers efficient, scalable approximations for non-linear models, which makes it usable for both evaluation and training. Empirical results on toy and real data (including twin moons and GiveMeSomeCredit) show improved strategic robustness with the LD approach over gradient-based alternatives, at some cost in accuracy on unperturbed inputs. Overall, the paper provides a principled framework for designing strategically robust non-linear classifiers in sociotechnical settings.
Abstract
We consider the problem of strategic classification, where the act of deploying a classifier leads to strategic behaviour that induces a distribution shift on subsequent observations. Current approaches to learning classifiers in strategic settings focus primarily on the linear setting, but in many cases non-linear classifiers are more suitable. A central obstacle to progress for non-linear classifiers is the inability to compute best responses in these settings. We present a novel method for computing the best response by optimising the Lagrangian dual of the agents' objective. We demonstrate that our method reproduces best responses in linear settings, and we identify key weaknesses in existing approaches. We present further results demonstrating that our method can be straightforwardly applied to non-linear classifier settings, where it is useful for both evaluation and training.
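To make the idea of a dual-based best response concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a formulation the abstract does not specify: an agent with features x seeks the cheapest point z that the classifier accepts (f(z) >= 0) under a squared-Euclidean cost, and manipulates only if that cost is within a budget. The function name `dual_best_response`, the cost, the budget rule, and all step sizes are illustrative assumptions. The Lagrangian L(z, lam) = 0.5*||z - x||^2 - lam*f(z) is minimised over z by gradient descent and maximised over lam >= 0 by projected gradient ascent. For a linear classifier the result can be compared against the closed-form projection onto the decision boundary.

```python
import numpy as np

def dual_best_response(x, f, grad_f, budget, lr_z=0.5, lr_lam=0.1,
                       outer_steps=200, inner_steps=50):
    """Approximate argmin_z 0.5*||z - x||^2 s.t. f(z) >= 0 via dual ascent.

    Assumed setting (not taken from the paper): squared-Euclidean cost and
    a hard budget on the cost the agent is willing to pay.
    """
    x = np.asarray(x, dtype=float)
    if f(x) >= 0:
        return x.copy()  # already accepted, no reason to move
    z, lam = x.copy(), 0.0
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            # Gradient of L(z, lam) with respect to z.
            z = z - lr_z * ((z - x) - lam * grad_f(z))
        # Projected ascent step on the multiplier (constraint is f(z) >= 0).
        lam = max(0.0, lam - lr_lam * f(z))
    cost = 0.5 * np.sum((z - x) ** 2)
    return z if cost <= budget else x.copy()

# Linear classifier f(z) = w.z + b (illustrative values).
w = np.array([1.0, 2.0])
b = -5.0
f = lambda z: float(w @ z + b)
grad_f = lambda z: w

x = np.array([0.0, 0.0])

# Closed-form best response for the linear case: project x onto the boundary.
lam_star = max(0.0, -(w @ x + b) / (w @ w))
closed_form = x + lam_star * w

z_move = dual_best_response(x, f, grad_f, budget=3.0)  # cost 2.5 is affordable
z_stay = dual_best_response(x, f, grad_f, budget=1.0)  # cost 2.5 is too high
```

Because `f` and `grad_f` are passed as callables, a non-linear score function can be substituted, although the step sizes above were chosen for this linear example and would likely need tuning; convergence is not guaranteed in general.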
