A faster FPRAS for #NFA

Kuldeep S. Meel, Sourav Chakraborty, Umang Mathur

TL;DR

The paper tackles the #NFA counting problem, seeking a practical FPRAS. It introduces a faster approach that decouples per-state sampling from the automaton size and leverages a layered unrolling of the NFA, distribution entanglement analysis, and a self-reducible union sampler. The main contributions are AppUnion for efficient union-size estimation and a recursive Sampling subroutine, culminating in an overall runtime of $\tilde{O}((m^2 n^{10} + m^3 n^6) \frac{1}{\epsilon^4} \log^2(1/\delta))$ with per-state sampling $\tilde{O}(\frac{n^4}{\epsilon^2})$, independent of $m$. This advances the practical applicability of approximate #NFA tools across domains including probabilistic query evaluation and regular path queries. The methods rely on strong counting-sampling duality and a novel distribution-entanglement framework to ensure tight probabilistic guarantees.

Abstract

Given a non-deterministic finite automaton (NFA) $A$ with $m$ states, and a natural number $n$ (presented in unary), the #NFA problem asks to determine the size of the set $L(A_n)$ of words of length $n$ accepted by $A$. While the corresponding decision problem of checking the emptiness of $L(A_n)$ is solvable in polynomial time, the #NFA problem is known to be #P-hard. Recently, the long-standing open question -- whether there is an FPRAS (fully polynomial time randomized approximation scheme) for #NFA -- was resolved in \cite{ACJR19}. The FPRAS due to \cite{ACJR19} relies on the interreducibility of counting and sampling, and computes, for each pair of state $q$ and natural number $i \leq n$, a set of $O(\frac{m^7 n^7}{\epsilon^7})$ many uniformly chosen samples from the set of words of length $i$ that have a run ending at $q$ ($\epsilon$ is the error tolerance parameter of the FPRAS). This informative measure -- the number of samples maintained per state and length -- also affects the overall time complexity with a quadratic dependence. Given the prohibitively high time complexity, in terms of each of the input parameters, of the FPRAS due to \cite{ACJR19}, and considering the widespread application of approximate counting (and sampling) in various tasks in Computer Science, a natural question arises: Is there a faster FPRAS for #NFA that can pave the way for the practical implementation of approximate #NFA tools? In this work, we demonstrate that significant improvements in time complexity are achievable. Specifically, we have reduced the number of samples required for each state to be independent of $m$, with significantly less dependence on $n$ and $\epsilon$, maintaining only $\widetilde{O}(\frac{n^4}{\epsilon^2})$ samples per state.
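
To make the quantity $|L(A_n)|$ concrete, the following is a minimal sketch of an exact counter. It is not the paper's method: it determinizes the NFA on the fly and can take time exponential in $m$, which is the cost an FPRAS is meant to avoid. The function name `count_accepted`, the dictionary encoding of the transition relation, and the example automaton are assumptions made for illustration.

```python
from collections import defaultdict


def count_accepted(delta, initial, accepting, alphabet, n):
    """Exactly count the words of length n accepted by an NFA.

    delta maps (state, symbol) to the set of successor states
    (hypothetical encoding). For each reachable subset of states we
    track how many words of the current length lead to exactly that
    subset, i.e. an on-the-fly subset construction.
    """
    layer = {frozenset([initial]): 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for subset, count in layer.items():
            for a in alphabet:
                succ = frozenset(
                    t for s in subset for t in delta.get((s, a), ())
                )
                if succ:  # words with no run are dropped
                    nxt[succ] += count
        layer = nxt
    # A word is accepted if the subset it reaches contains an accepting state.
    return sum(c for subset, c in layer.items() if subset & accepting)


# Illustrative NFA over {0, 1} accepting words that contain "11".
EXAMPLE_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}

if __name__ == "__main__":
    print(count_accepted(EXAMPLE_DELTA, "q0", {"q2"}, "01", 3))
```

For this example automaton and $n = 3$ the accepted words are 011, 110 and 111. The number of tracked subsets can grow up to $2^m$, which is why exact counting of this kind does not scale.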

Paper Structure (18 sections, 5 theorems, 27 equations, 1 figure, 4 algorithms)

Key Result

Theorem 1

Let $\epsilon, \delta > 0$, and let $\Omega$ be some set. Let $T_1, T_2, \ldots, T_k \subseteq \Omega$ be sets with membership oracles $O_1, \ldots, O_k$ respectively. Let $\epsilon_\textsf{sz} \geq 0$ and let $\textsf{sz}_1, \ldots, \textsf{sz}_k \in \mathbb{N}$ be such that for every $i \leq k$, w The algorithm makes $O\left(k\cdot (1+\epsilon_{\textsf{sz}})^2 \cdot \frac{1}{\epsilon^2} \cdot \l
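
The setting of Theorem 1 (sets $T_1, \ldots, T_k$ with membership oracles and size estimates) resembles the classic Karp-Luby union estimator, sketched below. This is not the paper's AppUnion: the sketch assumes exact sizes and a fresh uniform sampler for each set, whereas the theorem allows sizes that are only approximately known (the $\epsilon_\textsf{sz}$ parameter). The names `karp_luby_union` and `from_explicit` are hypothetical.

```python
import random


def from_explicit(elements):
    """Wrap an explicit set as a (size, sampler, membership oracle) triple."""
    items = sorted(elements)
    members = set(items)
    return (len(items), lambda rng: rng.choice(items), lambda x: x in members)


def karp_luby_union(sets, trials, rng):
    """Estimate |T_1 ∪ ... ∪ T_k| from sizes, samplers and oracles.

    Each trial draws a pair (i, x) uniformly among all pairs with
    x in T_i, and counts it only when i is the first index whose set
    contains x. Exactly one such pair exists per element of the union,
    so the hit rate estimates |union| / sum of sizes.
    """
    weights = [size for size, _, _ in sets]
    total = sum(weights)
    if total == 0:
        return 0.0
    hits = 0
    for _ in range(trials):
        i = rng.choices(range(len(sets)), weights=weights)[0]
        x = sets[i][1](rng)
        if not any(sets[j][2](x) for j in range(i)):
            hits += 1
    return total * hits / trials


if __name__ == "__main__":
    example = [
        from_explicit(range(0, 60)),
        from_explicit(range(40, 100)),
        from_explicit(range(90, 120)),
    ]
    print(karp_luby_union(example, 20000, random.Random(0)))
```

In the example the true union has 120 elements while the sizes sum to 150, so roughly 80% of trials should count as hits. The number of oracle calls per trial is at most $k$, which is where a factor of $k$ in the query bound of such estimators comes from.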

Figures (1)

  • Figure 1: Algorithm Template: FPRAS for #NFA

Theorems & Definitions (10)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 4
  • Lemma 5
  • Claim 5
  • Claim 6
  • Claim 6
  • Claim 6
  • Claim 6