Digging for Decision Trees: A Case Study in Strategy Sampling and Learning
Carlos E. Budde, Pedro R. D'Argenio, Arnd Hartmanns
TL;DR
This work tackles optimizing open-pit mine transport under stochastic timing by modeling the system as Markov automata ($MA$) and seeking the maximum time-bounded expected reward. It shows discretisation-based probabilistic model checking is infeasible for this objective and proposes statistical model checking (SMC) with lightweight strategy sampling (LSS) and table-based Q-learning (QL), augmented with partial observability and a dtControl-driven decision-tree interface for explainability. Experiments on a new open-pit mining case study show that LSS generally outperforms QL, that feature selection improves results, and that the resulting decision trees provide interpretable guidance for operators; QL often struggles with large models and memory. The contributions include (i) a new MA-based mining case study, (ii) practical tool extensions enabling partial observability, constant-memory strategy extraction, and explainable trees, and (iii) a benchmark demonstrating the trade-offs between LSS and QL for MA optimization.
Abstract
We introduce a formal model of transportation in an open-pit mine for the purpose of optimising the mine's operations. The model is a network of Markov automata (MA); the optimisation goal corresponds to maximising a time-bounded expected reward property. Today's model checking algorithms exacerbate the state space explosion problem by applying a discretisation approach to such properties on MA. We show that model checking is infeasible even for small mine instances. Instead, we propose statistical model checking with lightweight strategy sampling or table-based Q-learning over untimed strategies as an alternative to approach the optimisation task, using the Modest Toolset's modes tool. We add support for partial observability to modes so that strategies can be based on carefully selected model features, and we implement a connection from modes to the dtControl tool to convert sampled or learned strategies into decision trees. We experimentally evaluate the adequacy of our new tooling on the open-pit mine case study. Our experiments demonstrate the limitations of Q-learning, the impact of feature selection, and the usefulness of decision trees as an explainable representation.
