The foundations of cost-sensitive causal classification
Wouter Verbeke, Diego Olaya, Jeroen Berrevoets, Sam Verboven, Sebastián Maldonado
TL;DR
The paper addresses the gap between cost-sensitive and causal classification by introducing a unified evaluation framework that uses a set of matrices (confusion $\textbf{CF}$, cost-benefit $\textbf{CB}$, and the effect $E$) to quantify performance. It extends the framework to causal, double-binary settings with individual treatment effects and defines causal variants of classical measures (e.g., Qini, liftup) alongside new cost-sensitive causal profits such as the maximum and expected maximum causal profit ($\dot{MP}$, $\dot{EMP}$). The authors demonstrate how to instantiate the framework in business contexts like customer retention and response uplift, yielding actionable formulas for profit per instance and optimal treatment thresholds. This unification enables practitioners to optimize interventions under cost and benefit uncertainty, paving the way for more profit-aware, data-driven decision-making.
Abstract
Classification is a well-studied machine learning task that concerns the assignment of instances to a set of outcomes. Classification models support the optimization of managerial decision-making across a variety of operational business processes. For instance, customer churn prediction models are adopted to increase the efficiency of retention campaigns by optimizing the selection of customers that are to be targeted. Cost-sensitive and causal classification methods have independently been proposed to improve the performance of classification models. The former considers the benefits and costs of correct and incorrect classifications, such as the benefit of a retained customer, whereas the latter estimates the causal effect of an action, such as a retention campaign, on the outcome of interest. This study integrates cost-sensitive and causal classification by elaborating a unifying evaluation framework. The framework encompasses a range of existing and novel performance measures for evaluating both causal and conventional classification models in a cost-sensitive as well as a cost-insensitive manner. We prove that conventional classification is a specific case of causal classification in terms of a range of performance measures when the number of actions is equal to one. The framework is shown to instantiate to application-specific cost-sensitive performance measures that have recently been proposed for evaluating customer retention and response uplift models, and it allows profitability to be maximized when adopting a causal classification model for optimizing decision-making. The proposed framework paves the way toward the development of cost-sensitive causal learning methods and opens a range of opportunities for improving data-driven business decision-making.
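To make the cost-sensitive side of this concrete, the sketch below shows one common way to score a conventional (non-causal) classifier: weight each confusion-matrix cell by its benefit or cost and sum. It is an illustration only. The function name `average_profit`, the matrix layout, and all numbers are hypothetical, and the paper's own definitions of $\textbf{CF}$ and $\textbf{CB}$ may differ in layout or normalization.

```python
import numpy as np


def average_profit(cf_counts, cb):
    """Average profit per instance for a conventional binary classifier.

    Assumed layout (hypothetical, not taken from the paper):
    rows = predicted class (positive, negative),
    columns = actual class (positive, negative).
    cf_counts holds instance counts; cb holds the benefit (positive)
    or cost (negative) attached to each cell.
    """
    cf = np.asarray(cf_counts, dtype=float)
    cb = np.asarray(cb, dtype=float)
    if cf.shape != cb.shape:
        raise ValueError("confusion and cost-benefit matrices must have the same shape")
    # Convert counts to proportions, then weight each cell by its benefit or cost.
    return float(np.sum(cf / cf.sum() * cb))


# Hypothetical retention campaign: targeting a true churner yields 90,
# targeting a non-churner costs 10, and not targeting costs or yields nothing.
cf_counts = [[50, 10],
             [30, 910]]
cb = [[90.0, -10.0],
      [0.0, 0.0]]

profit = average_profit(cf_counts, cb)
print(f"average profit per instance: {profit:.2f}")
```

The causal measures described in the abstract additionally account for the effect of the action on the outcome, which this conventional sketch does not attempt to model.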
