Table of Contents
Fetching ...

Counting and Sampling Traces in Regular Languages

Alexis de Colnet, Kuldeep S. Meel, Umang Mathur

TL;DR

The paper tackles the problem of counting and sampling Mazurkiewicz traces touched by a regular language, under a fixed independence relation. It proves #P-hardness for exact trace counting even for DFAs and provides an FPRAS and an FPAUS whose runtimes depend on the alphabet width ω. The authors introduce canonical runs, a prefix-validator automaton, and an S-process to manage dependencies, enabling principled approximate counting and near-uniform trace sampling. These results enable more informed bounded model checking and trace-aware testing by quantifying reduced state spaces and enabling unbiased exploration.

Abstract

In this work, we study the problems of counting and sampling Mazurkiewicz traces that a regular language touches. Fix an alphabet $Σ$ and an independence relation $\mathbb{I} \subseteq Σ\times Σ$. The input consists of a regular language $L \subseteq Σ^*$, given by a finite automaton with $m$ states, and a natural number $n$ (in unary). For the counting problem, the goal is to compute the number of Mazurkiewicz traces (induced by $\mathbb{I}$) that intersect the $n^\text{th}$ slice $L_n = L \cap Σ^n$, i.e., traces that admit at least one linearization in $L_n$. For the sampling problem, the goal is to output a trace drawn from a distribution that is approximately uniform over all such traces. These tasks are motivated by bounded model checking with partial-order reduction, where an \emph{a priori} estimate of the reduced state space is valuable, and by testing methods for concurrent programs that use partial-order-aware random exploration. We first show that the counting problem is #P-hard even when $L$ is accepted by a deterministic automaton, in sharp contrast to counting words of a DFA, which is polynomial-time solvable. We then prove that the problem lies in #P for both NFAs and DFAs, irrespective of whether $L$ is trace-closed. Our main algorithmic contributions are a \emph{fully polynomial-time randomized approximation scheme} (FPRAS) that, with high probability, approximates the desired count within a prescribed accuracy, and a \emph{fully polynomial-time almost uniform sampler} (FPAUS) that generates traces whose distribution is provably close to uniform.

Counting and Sampling Traces in Regular Languages

TL;DR

The paper tackles the problem of counting and sampling Mazurkiewicz traces touched by a regular language, under a fixed independence relation. It proves #P-hardness for exact trace counting even for DFAs and provides an FPRAS and an FPAUS whose runtimes depend on the alphabet width ω. The authors introduce canonical runs, a prefix-validator automaton, and an S-process to manage dependencies, enabling principled approximate counting and near-uniform trace sampling. These results enable more informed bounded model checking and trace-aware testing by quantifying reduced state spaces and enabling unbiased exploration.

Abstract

In this work, we study the problems of counting and sampling Mazurkiewicz traces that a regular language touches. Fix an alphabet and an independence relation . The input consists of a regular language , given by a finite automaton with states, and a natural number (in unary). For the counting problem, the goal is to compute the number of Mazurkiewicz traces (induced by ) that intersect the slice , i.e., traces that admit at least one linearization in . For the sampling problem, the goal is to output a trace drawn from a distribution that is approximately uniform over all such traces. These tasks are motivated by bounded model checking with partial-order reduction, where an \emph{a priori} estimate of the reduced state space is valuable, and by testing methods for concurrent programs that use partial-order-aware random exploration. We first show that the counting problem is #P-hard even when is accepted by a deterministic automaton, in sharp contrast to counting words of a DFA, which is polynomial-time solvable. We then prove that the problem lies in #P for both NFAs and DFAs, irrespective of whether is trace-closed. Our main algorithmic contributions are a \emph{fully polynomial-time randomized approximation scheme} (FPRAS) that, with high probability, approximates the desired count within a prescribed accuracy, and a \emph{fully polynomial-time almost uniform sampler} (FPAUS) that generates traces whose distribution is provably close to uniform.

Paper Structure

This paper contains 21 sections, 39 theorems, 48 equations, 3 figures, 8 algorithms.

Key Result

Proposition 2.1

Given an NFA $\mathcal{A} = (\mathcal{Q},\Sigma,\mathcal{T},q_{\mathsf{I}},F)$ and a positive integer $n$, the unrolled automaton $\mathcal{A}^\textsf{u} = (\mathcal{Q}^\textsf{u},\Sigma,\mathcal{T}^\textsf{u},q_{\mathsf{I}}^\textsf{u},F^\textsf{u})$ can be constructed in time $O(n|\mathcal{T}|)$.

Figures (3)

  • Figure 1: Modeling a program using its control flow automaton
  • Figure 2: Automaton for Example 2
  • Figure 3: Divergence states

Theorems & Definitions (49)

  • Proposition 2.1
  • Lemma 2.1
  • Proposition 2.2: diekert1995book
  • Proposition 2.3
  • Theorem 2.4: Bertoni1989
  • Theorem 3.1
  • Theorem 3.2
  • Definition 4.1: DAG-prefix
  • Definition 4.2: Border
  • Definition 4.3: $u$-Prefix and $u$-Residual
  • ...and 39 more