Table of Contents
Fetching ...

POLIMON: Checking Temporal Properties over Out-of-order Streams at Runtime

Felix Klaedtke

TL;DR

POLÍMON tackles runtime verification for distributed systems where events may arrive out of order, checking specifications written in $MTL$ and $MTL$^\downarrow$ (MTL with the freeze quantifier) against streaming observations. It introduces a three-valued, knowledge-gap–aware semantics over observations and a graph-based, pipelined monitor that updates verdicts as soon as information becomes available, using wall-clock time rather than a fully ordered timestamp. The key contributions include a scalable, multi-core Go implementation with state export/import capabilities and explicit gap-closing mechanisms, a rigorous soundness/completeness framework, and an extensive experimental evaluation showing favorable performance relative to prior work and clarifying the trade-offs when data values are involved. The approach enables real-time verification in unreliable, distributed environments (e.g., SDN or microservice architectures) and highlights practical limits and directions for improving throughput for data-rich specifications.

Abstract

This paper presents the monitoring tool POLIMON for checking system behavior at runtime against specifications expressed as formulas in the real-time logic MTL or its extension with the freeze quantifier. The tool's distinguishing feature is that POLIMON can receive messages describing the system events out of order. Furthermore, since POLIMON processes received messages immediately, it outputs verdicts promptly when a message's described system event leads to a violation of the specification. This makes the tool well suited, e.g., for verifying the behavior of distributed systems with unreliable channels at runtime.

POLIMON: Checking Temporal Properties over Out-of-order Streams at Runtime

TL;DR

POLÍMON tackles runtime verification for distributed systems where events may arrive out of order, checking specifications written in and ^\downarrow$ (MTL with the freeze quantifier) against streaming observations. It introduces a three-valued, knowledge-gap–aware semantics over observations and a graph-based, pipelined monitor that updates verdicts as soon as information becomes available, using wall-clock time rather than a fully ordered timestamp. The key contributions include a scalable, multi-core Go implementation with state export/import capabilities and explicit gap-closing mechanisms, a rigorous soundness/completeness framework, and an extensive experimental evaluation showing favorable performance relative to prior work and clarifying the trade-offs when data values are involved. The approach enables real-time verification in unreliable, distributed environments (e.g., SDN or microservice architectures) and highlights practical limits and directions for improving throughput for data-rich specifications.

Abstract

This paper presents the monitoring tool POLIMON for checking system behavior at runtime against specifications expressed as formulas in the real-time logic MTL or its extension with the freeze quantifier. The tool's distinguishing feature is that POLIMON can receive messages describing the system events out of order. Furthermore, since POLIMON processes received messages immediately, it outputs verdicts promptly when a message's described system event leads to a violation of the specification. This makes the tool well suited, e.g., for verifying the behavior of distributed systems with unreliable channels at runtime.
Paper Structure (11 sections, 9 equations, 6 figures)

This paper contains 11 sections, 9 equations, 6 figures.

Figures (6)

  • Figure 1: Tool overview.
  • Figure 2: Graph-based monitoring data structure.
  • Figure 3: Running times for the Timescales pattern family past-always comprising three formulas on ordered event streams with $10^6$ time points (left) and running times for the pattern past-always-after on out-of-order event streams with $10^5$ time points (right).
  • Figure 4: Formulas used in the experimental evaluation.
  • Figure 5: Running times over event streams covering a time span of 60 seconds. Each data point shows the average over five event streams together with the minimum and maximum, which are often very close to the average. Note that for MTL, the event rates are 10 times higher than for MTL$^\downarrow$.
  • ...and 1 more figures