Table of Contents
Fetching ...

Teaching Temporal Logics to Neural Networks

Christopher Hahn, Frederik Schmitt, Jens U. Kreber, Markus N. Rabe, Bernd Finkbeiner

TL;DR

The paper investigates end-to-end neural learning for symbolic verification by training a Transformer to predict satisfying traces for linear-time temporal logic (LTL) formulas. It uses traces generated by the spot tool as training targets and demonstrates strong semantic understanding, generalizing to much longer formulas and to formulas where the classical solver times out. The results show high semantic accuracy even when syntactic outputs diverge from the training generator, and they extend the approach to propositional logic, suggesting broad applicability of neural-symbolic learning in formal methods. The work highlights the potential for neural predictions to augment, not replace, classical verification pipelines.

Abstract

We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics. In this work we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer on the problem to directly predict a solution, i.e. a trace, to a given LTL formula. The training data is generated with classical solvers, which, however, only provide one of many possible solutions to each formula. We demonstrate that it is sufficient to train on those particular solutions to formulas, and that Transformers can predict solutions even to formulas from benchmarks from the literature on which the classical solver timed out. Transformers also generalize to the semantics of the logics: while they often deviate from the solutions found by the classical solvers, they still predict correct solutions to most formulas.

Teaching Temporal Logics to Neural Networks

TL;DR

The paper investigates end-to-end neural learning for symbolic verification by training a Transformer to predict satisfying traces for linear-time temporal logic (LTL) formulas. It uses traces generated by the spot tool as training targets and demonstrates strong semantic understanding, generalizing to much longer formulas and to formulas where the classical solver times out. The results show high semantic accuracy even when syntactic outputs diverge from the training generator, and they extend the approach to propositional logic, suggesting broad applicability of neural-symbolic learning in formal methods. The work highlights the potential for neural predictions to augment, not replace, classical verification pipelines.

Abstract

We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics. In this work we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer on the problem to directly predict a solution, i.e. a trace, to a given LTL formula. The training data is generated with classical solvers, which, however, only provide one of many possible solutions to each formula. We demonstrate that it is sufficient to train on those particular solutions to formulas, and that Transformers can predict solutions even to formulas from benchmarks from the literature on which the classical solver timed out. Transformers also generalize to the semantics of the logics: while they often deviate from the solutions found by the classical solvers, they still predict correct solutions to most formulas.

Paper Structure

This paper contains 22 sections, 2 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Performance of our best models trained on practical pattern formulas. The x-axis shows the formula size. Syntactic accuracy, i.e., where the Transformer agrees with the generator are displayed in dark green. Instances where the Transformer deviates from the generators output but still provides correct output are displayed in light green; incorrect predictions in orange.
  • Figure 2: Overview of our main experimental results: the performance of our best performing models on our different data sets. The percentage of a dark green bar refers to the syntactic accuracy, the percentage of a light green bar to the semantic accuracy without the syntactic accuracy, and the incorrect predictions are visualized in orange.
  • Figure 3: Predictions of our best performing model, trained on $\mathit{LTLUnsolved254}$, on $5704$ specification patterns for which spot timed out ($> 60s$). Semantic accuracy is displayed in green; incorrect traces in orange; syntactically invalid traces in red.
  • Figure 4: Syntactic and semantic accuracy of our best performing model (only trained on $\mathit{LTLRandom35}$) on $\mathit{LTLRandom50}$. Dark green is syntactically correct; light green is semantically correct, orange is incorrect.
  • Figure 5: Size distributions in the $\mathit{LTLPattern126}$ test set: on the x-axis is the size of the formulas; on the y-axis the number of formulas.
  • ...and 5 more figures