Incentivizing Desirable Effort Profiles in Strategic Classification: The Role of Causality and Uncertainty
Valia Efthymiou, Chara Podimata, Diptangshu Sen, Juba Ziani
TL;DR
This paper studies strategic classification in settings where features interact causally, and introduces $\beta$-desirability to quantify how much agent effort is invested in desirable features. It develops a causal framework built on a contribution matrix $\mathbb{C}$ and analyzes agent best responses in two regimes. Under complete information, the agent's problem is a convex optimization for $\ell_p$ costs, and the classifier design problem is convex when a single feature is to be incentivized. Under incomplete information, Gaussian priors yield tractable cases of partial uncertainty, but the problem is non-convex under full uncertainty. The work provides structural results for $\ell_1$ and $\ell_p$ costs, convexification tools to bound unwanted effort, and semi-closed-form solutions for $\ell_2$ costs under partial information, complemented by a cardiovascular-disease risk study demonstrating how to incentivize desirable modifications under uncertainty. The findings offer practical guidance for designing classifiers that promote real, desirable changes while accounting for causality and information asymmetry, with implications for healthcare, finance, and risk management.
Abstract
We study strategic classification in binary decision-making settings where agents can modify their features in order to improve their classification outcomes. Importantly, our work considers the causal structure across different features, acknowledging that effort in a given feature may affect other features. The main goal of our work is to understand \emph{when and how much agent effort is invested towards desirable features}, and how this is influenced by the deployed classifier, the causal structure of the agent's features, their ability to modify them, and the information available to the agent about the classifier and the feature causal graph. In the complete information case, when agents know the classifier and the causal structure of the problem, we derive conditions ensuring that rational agents focus on features favored by the principal. We show that designing classifiers to induce desirable behavior is generally non-convex, though tractable in special cases. We also extend our analysis to settings where agents have incomplete information about the classifier or the causal graph. While optimal effort selection is again a non-convex problem under general uncertainty, we highlight special cases of partial uncertainty where this selection problem becomes tractable. Our results indicate that uncertainty drives agents to favor features with higher expected importance and lower variance, potentially misaligning with principal preferences. Finally, numerical experiments based on a cardiovascular disease risk study illustrate how to incentivize desirable modifications under uncertainty.
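
To make the role of the causal structure concrete, the following is a minimal sketch of a complete-information best response under an $\ell_2$ cost. It is an illustration under stated assumptions, not the paper's exact model: we assume a linear classifier with weights `w`, assume that effort `e` changes the features by `C @ e` (so `C[i, j]` is the effect of effort in feature `j` on feature `i`), assume the agent must raise its score by a positive amount `gap`, and ignore any non-negativity constraint on effort. The function name `l2_best_response` and the effort-share quantity are hypothetical; the latter is only a rough proxy, not the paper's definition of $\beta$-desirability.

```python
import numpy as np

def l2_best_response(w, C, gap):
    """Minimum-l2-norm effort e satisfying w @ (C @ e) >= gap, for gap > 0.

    Assumed setup (not taken from the paper): linear score, effort maps to
    feature changes through C, and effort is unconstrained in sign.
    """
    g = C.T @ w                  # score gain per unit of effort in each feature
    return gap * g / (g @ g)     # closed form: effort is proportional to g

# Hypothetical example: the classifier puts zero weight on feature 3,
# but effort in feature 3 causally raises feature 1.
w = np.array([1.0, 0.5, 0.0])
C = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
gap = 1.0

e = l2_best_response(w, C, gap)

# Share of total effort spent on the (assumed) desirable feature set.
desirable = [2]
share = np.abs(e[desirable]).sum() / np.abs(e).sum()
```

In this toy instance the agent spends a quarter of its effort on the third feature even though the classifier assigns it zero weight, because that effort propagates to the first feature through `C`. All entries of `C.T @ w` are non-negative here, so the ignored non-negativity constraint would not bind.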
