Simulating classification models to evaluate Predict-Then-Optimize methods
Pieter Smet
TL;DR
The paper develops and validates simulation-based methods to assess how classification prediction errors affect Predict-Then-Optimize decisions, extending prior binary-classifier approaches to multiclass settings. Using a single-machine scheduling problem, it shows that the relationship between predictive accuracy and decision quality is nuanced and highly instance-dependent, and demonstrates that simulation can estimate performance requirements without training numerous models. The findings support using simulation to guide model selection and performance targets before deployment, while highlighting the role of error structure. The work also outlines future directions, including improved multiclass simulation and integration with decision-focused learning paradigms.
Abstract
Uncertainty in optimization is often represented as stochastic parameters in the optimization model. In Predict-Then-Optimize approaches, predictions of a machine learning model are used as values for such parameters, effectively transforming the stochastic optimization problem into a deterministic one. This two-stage framework is built on the assumption that more accurate predictions result in solutions that are closer to the actual optimal solution. However, providing evidence for this assumption in the context of complex, constrained optimization problems is challenging and often overlooked in the literature. Simulating predictions of machine learning models offers a way to (experimentally) analyze how prediction error impacts solution quality without the need to train real models. Complementing an algorithm from the literature for simulating binary classification, we introduce a new algorithm for simulating predictions of multiclass classifiers. We conduct a computational study to evaluate the performance of these algorithms, and show that classifier performance can be simulated with reasonable accuracy, although some variability is observed. Additionally, we apply these algorithms to assess the performance of a Predict-Then-Optimize algorithm for a machine scheduling problem. The experiments demonstrate that the relationship between prediction error and how close solutions are to the actual optimum is non-trivial, highlighting important considerations for the design and evaluation of decision-making systems based on machine learning predictions.
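To make the idea of simulating classifier predictions concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the only performance target is overall accuracy and that errors are spread uniformly over the wrong classes; the names `simulate_predictions` and `accuracy` are hypothetical. Given true labels, it corrupts a fixed share of them so that the simulated "classifier" hits the requested accuracy without any model being trained.

```python
import random


def simulate_predictions(y_true, target_accuracy, n_classes, seed=0):
    """Return simulated labels whose accuracy matches target_accuracy
    as closely as rounding allows.

    Assumption (not from the paper): misclassified samples are chosen
    uniformly at random and receive a uniformly random wrong class.
    """
    if n_classes < 2:
        raise ValueError("need at least two classes to simulate errors")
    rng = random.Random(seed)
    n = len(y_true)
    # Number of samples that must be wrong to reach the target accuracy.
    n_wrong = round((1.0 - target_accuracy) * n)
    wrong_idx = set(rng.sample(range(n), n_wrong))

    y_pred = []
    for i, label in enumerate(y_true):
        if i in wrong_idx:
            # Pick any class other than the true one.
            alternatives = [c for c in range(n_classes) if c != label]
            y_pred.append(rng.choice(alternatives))
        else:
            y_pred.append(label)
    return y_pred


def accuracy(y_true, y_pred):
    """Fraction of positions where prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    y_true = [i % 3 for i in range(100)]
    y_pred = simulate_predictions(y_true, target_accuracy=0.8, n_classes=3)
    print(accuracy(y_true, y_pred))
```

In a Predict-Then-Optimize experiment, the simulated labels would take the place of real model output as parameters of the optimization model, and the resulting solution could be compared against the one obtained with the true labels. Because uniform errors ignore error structure, a sketch like this can only approximate how a real classifier's mistakes would affect decision quality.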
