Strategic Classification with Randomised Classifiers

Jack Geary, Henry Gouk

TL;DR

This work extends strategic classification to randomised learners by placing a distribution $Q$ over classifiers and using a Gibbs-style test-time decision. It proves that the optimal randomised classifier can strictly outperform the best deterministic one under mild sufficient conditions, while never being worse, and shows that SERM generalises to randomised classes with excess-risk bounds that match the i.i.d. rate via Rademacher complexity. The paper also connects these results to prior strategic classification literature, clarifying when randomisation provides benefits and outlining practical considerations and open questions for algorithmic training. Overall, randomisation emerges as a theoretically sound and potentially robust approach to mitigating gaming in strategic settings, with comparable data requirements to deterministic learning.
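The Gibbs-style test-time decision mentioned above can be illustrated with a minimal sketch: a classifier is drawn from $Q$ each time a point is classified, so an agent only knows the probability of a positive outcome, not the realised decision boundary. The class name `GibbsClassifier`, the scalar features, and the two threshold classifiers are hypothetical choices made for illustration; this is not the paper's implementation.

```python
import random
from typing import Callable, Optional, Sequence

# A deterministic classifier maps a scalar feature to a label in {-1, +1}.
# (Scalar features are an assumption made to keep the sketch short.)
Classifier = Callable[[float], int]


class GibbsClassifier:
    """Randomised classifier: samples f ~ Q afresh for every prediction."""

    def __init__(self, classifiers: Sequence[Classifier],
                 weights: Sequence[float], seed: Optional[int] = None):
        if len(classifiers) != len(weights):
            raise ValueError("need exactly one weight per classifier")
        if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must form a probability distribution")
        self.classifiers = list(classifiers)
        self.weights = list(weights)
        self._rng = random.Random(seed)

    def predict(self, x: float) -> int:
        # Draw one classifier according to Q, then apply it to x.
        f = self._rng.choices(self.classifiers, weights=self.weights, k=1)[0]
        return f(x)

    def positive_probability(self, x: float) -> float:
        # Probability under Q that x receives the label +1.
        return sum(w for f, w in zip(self.classifiers, self.weights)
                   if f(x) == 1)


# Hypothetical example: a uniform Q over two threshold classifiers.
f = lambda x: 1 if x >= 0.4 else -1
f_prime = lambda x: 1 if x >= 0.6 else -1
q = GibbsClassifier([f, f_prime], [0.5, 0.5], seed=0)

print(q.positive_probability(0.5))  # 0.5: only f accepts this point
```

In this toy setting, a point moved to just past one threshold is accepted only half the time, which is the kind of reduced utility from gaming that the figure caption describes for a uniform distribution over $f$ and $f^\prime$.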

Abstract

We consider the problem of strategic classification, where a learner must build a model to classify agents based on features that have been strategically modified. Previous work in this area has concentrated on the case when the learner is restricted to deterministic classifiers. In contrast, we perform a theoretical analysis of an extension to this setting that allows the learner to produce a randomised classifier. We show that, under certain conditions, the optimal randomised classifier can achieve better accuracy than the optimal deterministic classifier, but under no conditions can it be worse. When a finite set of training data is available, we show that the excess risk of Strategic Empirical Risk Minimisation over the class of randomised classifiers is bounded in a similar manner to the deterministic case. In both the deterministic and randomised cases, the risk of the classifier produced by the learner converges to that of the corresponding optimal classifier as the volume of available training data grows. Moreover, this convergence happens at the same rate as in the i.i.d. case. Our findings are compared with previous theoretical work analysing the problem of strategic classification. We conclude that randomisation has the potential to alleviate some issues that could be faced in practice without introducing any substantial downsides.
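
The TL;DR attributes the excess-risk bounds to Rademacher complexity. For reference, the textbook empirical Rademacher complexity of a function class $\mathcal{G}$ on a sample $S = (z_1, \dots, z_n)$ is shown below. This standard form is an assumption here: the paper's own Definition 1 is not reproduced in this summary and may differ in notation or normalisation.

$$\hat{\mathfrak{R}}_S(\mathcal{G}) = \mathbb{E}_{\sigma}\left[\sup_{g \in \mathcal{G}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, g(z_i)\right], \qquad \sigma_1, \dots, \sigma_n \text{ i.i.d. uniform on } \{-1, +1\}.$$

Bounds of this type typically control the gap between empirical and true risk uniformly over the class, which is how a convergence rate for an empirical risk minimiser is usually obtained.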

Paper Structure

This paper contains 15 sections, 12 theorems, 66 equations, 1 figure.

Key Result

Theorem 1

If $R_\Delta^\ast > 0$ and there exist $f, f^\prime \in {\mathcal{F}}^\ast$ such that [...] and [...], then, so long as at least one of the inequalities is strict, we have [...]

Figures (1)

  • Figure 1: Comparing gaming behaviour for two deterministic classifiers, $f$ and $f^\prime$, and a randomised classifier defined as a uniform distribution over $f$ and $f^\prime$. Points to be classified are in a circular positive class region (green), surrounded by a negative class disc (red). Classes are uniformly distributed ($P(y=-1)=P(y=1)$) and data are uniformly distributed within each region. (Left) Quadratic classifiers, $f,f^\prime \in {\mathcal{F}}^\ast$. (Middle) Highlighting $G_{f}$ (blue), the region around $f$ where gaming is possible; subfigures depict $G_{f}, G_{f'}$ for $f, f' \in {\mathcal{F}}^\ast$. (Right) Highlighting the region where gaming is possible for the randomised classifier. Reduced opacity indicates reduced utility from gaming due to randomisation. The red and green cross-hatched areas identify $\{x \in E_{f}, y=-1\} \cup \{x \in E_{f'}, y=-1\}$ and $\{x\in E_{f}, y=1\} \cup \{x \in E_{f'}, y=1\}$ respectively.

Theorems & Definitions (19)

  • Theorem 1
  • Definition 1: Rademacher Complexity
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5: sundaram2023
  • Theorem 6: rosenfeld2023
  • Corollary 1
  • proof
  • Theorem 6
  • ...and 9 more