Counterfactual Explanations for Linear Optimization
Jannis Kurtz, Ş. İlker Birbil, Dick den Hertog
TL;DR
The paper develops a formal framework that translates counterfactual explanations (CEs) to linear optimization. It defines a mutable parameter space $\mathcal{H}$ and a favored solution space $\mathcal{D}(\hat{x})$, and introduces three CE types: weak, strong, and relative. It derives reformulations for each type (WCEP/WCEP', SCEP/SCEP', RCEP) and shows that relative CEs can often be computed by exploiting hidden convexity, in time of the same order of magnitude as solving the original LP. In numerical experiments on the Diet problem and NETLIB instances, the authors show that relative CEs can be computed in milliseconds, whereas weak and strong CEs can be slower or numerically unstable, which motivates a practical preference for relative CEs in large-scale settings. The work provides actionable interpretability tools for optimization decisions, highlights their practical feasibility, and identifies directions for future work: extending CE concepts to non-linear and integer optimization, and designing algorithm-specific CEs.
Abstract
The concept of counterfactual explanations (CE) has emerged as one of the important concepts for understanding the inner workings of complex AI systems. In this paper, we translate the idea of CEs to linear optimization and propose, motivate, and analyze three different types of CEs: strong, weak, and relative. While deriving strong and weak CEs appears to be computationally intractable, we show that calculating relative CEs can be done efficiently. By detecting and exploiting the hidden convex structure of the optimization problem that arises in the latter case, we show that relative CEs can be obtained in time of the same order of magnitude as solving the original linear optimization problem. This is confirmed by an extensive numerical study on the NETLIB library.
