Table of Contents
Fetching ...

Scalable Anytime Algorithms for Learning Fragments of Linear Temporal Logic

Ritam Raha, Rajarshi Roy, Nathanaël Fijalkow, Daniel Neider

TL;DR

This work tackles the problem of learning explainable LTLf formulas from finite traces by introducing a scalable anytime algorithm that is guaranteed to produce a separating formula, even under limited computational budgets. It decomposes the search space into directed LTLf formulas (dLTL) and combines them with a Boolean set cover-based method to form separating formulas, enabling incremental refinements and approximate solutions. The authors prove that any LTLf formula in the fragment without Until can be expressed as a Boolean combination of directed formulas, and demonstrate the approach with SCARLET, which outperforms state-of-the-art tools on synthetic benchmarks and maintains robustness under noise. The results show strong practical impact for scalable specification mining and explainable trace classification, though the current method omits the Until operator and nesting of F/G, outlining clear directions for future work.

Abstract

Linear temporal logic (LTL) is a specification language for finite sequences (called traces) widely used in program verification, motion planning in robotics, process mining, and many other areas. We consider the problem of learning LTL formulas for classifying traces; despite a growing interest of the research community, existing solutions suffer from two limitations: they do not scale beyond small formulas, and they may exhaust computational resources without returning any result. We introduce a new algorithm addressing both issues: our algorithm is able to construct formulas an order of magnitude larger than previous methods, and it is anytime, meaning that it in most cases successfully outputs a formula, albeit possibly not of minimal size. We evaluate the performances of our algorithm using an open source implementation against publicly available benchmarks.

Scalable Anytime Algorithms for Learning Fragments of Linear Temporal Logic

TL;DR

This work tackles the problem of learning explainable LTLf formulas from finite traces by introducing a scalable anytime algorithm that is guaranteed to produce a separating formula, even under limited computational budgets. It decomposes the search space into directed LTLf formulas (dLTL) and combines them with a Boolean set cover-based method to form separating formulas, enabling incremental refinements and approximate solutions. The authors prove that any LTLf formula in the fragment without Until can be expressed as a Boolean combination of directed formulas, and demonstrate the approach with SCARLET, which outperforms state-of-the-art tools on synthetic benchmarks and maintains robustness under noise. The results show strong practical impact for scalable specification mining and explainable trace classification, though the current method omits the Until operator and nesting of F/G, outlining clear directions for future work.

Abstract

Linear temporal logic (LTL) is a specification language for finite sequences (called traces) widely used in program verification, motion planning in robotics, process mining, and many other areas. We consider the problem of learning LTL formulas for classifying traces; despite a growing interest of the research community, existing solutions suffer from two limitations: they do not scale beyond small formulas, and they may exhaust computational resources without returning any result. We introduce a new algorithm addressing both issues: our algorithm is able to construct formulas an order of magnitude larger than previous methods, and it is anytime, meaning that it in most cases successfully outputs a formula, albeit possibly not of minimal size. We evaluate the performances of our algorithm using an open source implementation against publicly available benchmarks.

Paper Structure

This paper contains 12 sections, 5 theorems, 12 equations, 5 figures, 1 table, 4 algorithms.

Key Result

Theorem 1

Every formula of $\textbf{LTLf}\,(\mathop{\mathrm{\mathbf{F}}}\nolimits,\mathop{\mathrm{\mathbf{X}}}\nolimits,\wedge,\vee)$ is equivalent to a Boolean combination of directed formulas. Equivalently, every formula of $\textbf{LTLf}\,(\mathop{\mathrm{\mathbf{G}}}\nolimits,\mathop{\mathrm{\mathbf{X}}}\

Figures (5)

  • Figure 1: The Boolean set cover problem: the formulas $\varphi_1, \varphi_2$, and $\varphi_3$ satisfy the words encircled in the corresponding area; in this instance $(\varphi_1 \wedge \varphi_2) \vee \varphi_3$ is a separating formula.
  • Figure 2: Comparison of SCARLET, FLIE and SYSLITEL on synthetic benchmarks. In Figure \ref{['subfig:runtime-comp']}, all times are in seconds and 'TO' indicates timeouts. The size of bubbles in the figure indicate the number of samples for each datapoint.
  • Figure 3: Comparison of SCARLET, FLIE and SYSLITEL on existing benchmarks. In Figure \ref{['subfig:time-comp-flie-benchmarks']}, all times are in seconds and 'TO' indicates timeouts. The size of bubbles indicate the number of samples for each datapoint.
  • Figure 4: Comparison of SCARLET, FLIE and SYSLITEL on synthetic benchmarks. In Figures \ref{['subfig:sample-scale']} and \ref{['subfig:length-scale']}, all times are in seconds and 'TO' indicates timeouts.
  • Figure 5: Comparison of SCARLET and MaxSAT on synthetic noisy benchmarks for different noise thresholds. All times are in seconds, and 'TO' indicates timeouts. The scatter plots in Figure \ref{['subfig:noisyruntime-comp']} represent the comparison of time taken to solve each sample by both tools for three different noise levels. The cactus plot in Figure \ref{['subfig:noisy-cactus']} represents the total number of samples solved within a given time-point by both the tools, and steeper lines represent better performance.

Theorems & Definitions (8)

  • Theorem 1
  • Lemma 1
  • proof : Proof of Lemma \ref{['lem:X']}
  • Lemma 2
  • proof : Proof of Lemma \ref{['lem:disjunction']}
  • proof : Proof of Theorem \ref{['thm:fragment-equivalence']}
  • Lemma 3
  • Theorem 2