Conformal Prediction for Signal Temporal Logic Inference
Danyang Li, Yixuan Wang, Matthew Cleaveland, Mingyu Cai, Roberto Tron
TL;DR
This work addresses the lack of formal confidence guarantees in Signal Temporal Logic (STL) inference by introducing TLICP, an end-to-end differentiable conformal prediction (CP) framework for learning STL formulas from time-series data. It defines a robustness-based, unit-invariant nonconformity score and embeds a smooth CP layer into training, paired with a single-term loss that optimizes both accuracy and CP efficiency; after training, exact CP provides formal guarantees on unseen data. The approach yields tighter, more reliable STL specifications and smaller prediction sets without requiring extensive hyperparameter tuning, and it outperforms state-of-the-art baselines on benchmark tasks. The method is relevant to safety-critical domains where interpretable temporal rules with quantifiable uncertainty are essential.
Abstract
Signal Temporal Logic (STL) inference seeks to extract human-interpretable rules from time-series data, but existing methods lack formal confidence guarantees for the inferred rules. Conformal prediction (CP) is a technique that can provide statistical correctness guarantees, but it is typically applied as a post-training wrapper that does not improve model learning. We instead propose an end-to-end differentiable CP framework for STL inference that enhances both the reliability and the interpretability of the resulting formulas. We introduce a robustness-based nonconformity score, embed a smooth CP layer directly into training, and employ a new loss function that simultaneously optimizes inference accuracy and CP prediction sets with a single term. Following training, an exact CP procedure delivers statistical guarantees for the learned STL formulas. Experiments on benchmark time-series tasks show that our approach reduces uncertainty in predictions (i.e., it achieves high coverage while reducing prediction set size) and improves accuracy (i.e., it reduces the number of misclassifications when using a fixed threshold) over state-of-the-art baselines.
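
To make the post-training step concrete, the following is a minimal sketch of standard split conformal prediction applied to a binary STL classifier. It is not the paper's implementation. The score `-label * robustness`, the function names, and the calibration values are all assumptions chosen for illustration; the abstract does not give the exact form of the robustness-based score. Labels are taken to be +1 (formula satisfied) and -1 (formula violated).

```python
import math

def nonconformity(robustness, label):
    """Hypothetical score: low when the sign of the STL robustness
    agrees with the candidate label (+1 or -1), high otherwise."""
    return -label * robustness

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile used in split CP."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic
    if k > n:
        return math.inf  # too few calibration points: keep every label
    return sorted(scores)[k - 1]

def prediction_set(robustness, q):
    """All labels whose score does not exceed the calibrated threshold."""
    return [y for y in (-1, 1) if nonconformity(robustness, y) <= q]

# Illustrative calibration data: (robustness of the learned formula, true label).
calibration = [(2.0, 1), (1.5, 1), (-1.0, -1), (-2.5, -1), (0.3, 1),
               (-0.2, 1), (0.8, -1), (-1.8, -1), (3.0, 1)]
cal_scores = [nonconformity(r, y) for r, y in calibration]
q = conformal_quantile(cal_scores, alpha=0.25)

confident_pos = prediction_set(1.2, q)   # robustness far from zero
ambiguous = prediction_set(0.1, q)       # robustness near zero
confident_neg = prediction_set(-0.9, q)
```

Under the usual exchangeability assumption between calibration and test data, sets built this way contain the true label with probability at least 1 - alpha. With these toy values, signals whose robustness is far from zero receive a single label, while a signal with robustness near zero receives both labels, which is the kind of prediction-set size the abstract refers to when it discusses reducing uncertainty.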
