Learning Optimal Signal Temporal Logic Decision Trees for Classification: A Max-Flow MILP Formulation
Kaier Liang, Gustavo A. Cardona, Disha Kamale, Cristian-Ioan Vasile
TL;DR
The paper tackles the challenge of extracting interpretable temporal properties from labeled time-series by learning STL-based decision trees. It introduces a max-flow encoded MILP framework that jointly infers tree structure and PSTL primitives, with robustness- and flow-based constraints ensuring correct classification and interpretable formulae. Key contributions include a global optimization approach that reduces primitive redundancy, a flow-based formulation with complexity control via a regularization parameter, and demonstration on naval surveillance, trace datasets, and second-level STL primitives. The results show high classification accuracy and interpretable STL rules, highlighting the method's potential for scalable, interpretable temporal behavior analysis in real-world settings.
Abstract
This paper presents a novel framework for inferring timed temporal logic properties from data. The dataset comprises pairs of finite-time system traces and corresponding labels, denoting whether the traces demonstrate specific desired behaviors, e.g. whether the ship follows a safe route or not. Our proposed approach leverages decision-tree-based methods to infer Signal Temporal Logic classifiers using primitive formulae. We formulate the inference process as a mixed integer linear programming optimization problem, recursively generating constraints to determine both data classification and tree structure. Applying a max-flow algorithm on the resultant tree transforms the problem into a global optimization challenge, leading to improved classification rates compared to prior methodologies. Moreover, we introduce a technique to reduce the number of constraints by exploiting the symmetry inherent in STL primitives, which enhances the algorithm's time performance and interpretability. To assess our algorithm's effectiveness and classification performance, we conduct three case studies involving two-class, multi-class, and complex formula classification scenarios.
