A faster FPRAS for #NFA

Kuldeep S. Meel; Sourav Chakraborty; Umang Mathur

TL;DR

The paper tackles the #NFA counting problem, seeking an FPRAS fast enough to be practical. Its approach makes the number of samples maintained per state independent of the automaton size $m$, using a layered unrolling of the NFA, an analysis based on distribution entanglement, and a self-reducible union sampler. The main components are AppUnion, which estimates the size of a union of sets, and a recursive Sampling subroutine. The overall runtime is $\tilde{O}((m^2 n^{10} + m^3 n^6) \frac{1}{\epsilon^4} \log^2(1/\delta))$, with $\tilde{O}(\frac{n^4}{\epsilon^2})$ samples maintained per state, independent of $m$. This improves the practical applicability of approximate #NFA tools in domains including probabilistic query evaluation and regular path queries. The methods rely on the interreducibility of counting and sampling and on a new distribution-entanglement framework for the probabilistic guarantees.
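The layered unrolling mentioned above can be illustrated on a tiny instance. The following is a minimal sketch, not the paper's algorithm: it stores, for every level and state, the exact set of words that have a run ending at that state, so each set is a union over predecessor states. This takes exponential space in general, which is what the FPRAS avoids by keeping only samples and size estimates. The function names and the example automaton are hypothetical.

```python
def words_per_state(transitions, initial, n):
    """layers[l][q] = set of length-l words with a run from `initial` ending at q.

    `transitions` maps (state, symbol) -> set of successor states (assumed encoding).
    """
    layers = [{initial: {""}}]
    for _ in range(n):
        nxt = {}
        for p, words in layers[-1].items():
            for (src, sym), targets in transitions.items():
                if src != p:
                    continue
                for q in targets:
                    # The set for q at the next level is the union, over all
                    # predecessors p and symbols, of (words at p) extended by sym.
                    nxt.setdefault(q, set()).update(w + sym for w in words)
        layers.append(nxt)
    return layers


def count_accepted(transitions, initial, accepting, n):
    """Exact number of length-n words accepted (exponential cost in general)."""
    last = words_per_state(transitions, initial, n)[n]
    accepted = set()
    for q in accepting:
        accepted |= last.get(q, set())
    return len(accepted)


# Hypothetical NFA over {0, 1} accepting words that contain "11".
transitions = {
    ("a", "0"): {"a"},
    ("a", "1"): {"a", "b"},
    ("b", "1"): {"c"},
    ("c", "0"): {"c"},
    ("c", "1"): {"c"},
}

if __name__ == "__main__":
    print(count_accepted(transitions, "a", {"c"}, 4))
```

Using sets rather than run counts matters here: the example NFA is ambiguous, so counting accepting runs would overcount words.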

Abstract

Given a non-deterministic finite automaton (NFA) $A$ with $m$ states, and a natural number $n$ (presented in unary), the #NFA problem asks to determine the size of the set $L(A_n)$ of words of length $n$ accepted by $A$. While the corresponding decision problem of checking the emptiness of $L(A_n)$ is solvable in polynomial time, the #NFA problem is known to be #P-hard. Recently, the long-standing open question -- whether there is an FPRAS (fully polynomial time randomized approximation scheme) for #NFA -- was resolved in \cite{ACJR19}. The FPRAS due to \cite{ACJR19} relies on the interreducibility of counting and sampling, and computes, for each pair of state $q$ and natural number $i \leq n$, a set of $O(\frac{m^7 n^7}{\epsilon^7})$ many uniformly chosen samples from the set of words of length $i$ that have a run ending at $q$ ($\epsilon$ is the error tolerance parameter of the FPRAS). This informative measure -- the number of samples maintained per state and length -- also affects the overall time complexity with a quadratic dependence. Given the prohibitively high time complexity, in terms of each of the input parameters, of the FPRAS due to \cite{ACJR19}, and considering the widespread application of approximate counting (and sampling) in various tasks in Computer Science, a natural question arises: Is there a faster FPRAS for #NFA that can pave the way for the practical implementation of approximate #NFA tools? In this work, we demonstrate that significant improvements in time complexity are achievable. Specifically, we reduce the number of samples required for each state to be independent of $m$, with significantly less dependence on $n$ and $\epsilon$, maintaining only $\widetilde{O}(\frac{n^4}{\epsilon^2})$ samples per state.
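To give a sense of scale, the two per-state sample bounds quoted above can be compared directly. This is illustrative arithmetic only: it ignores the constants and polylogarithmic factors hidden by $O$ and $\widetilde{O}$, and the parameter values are chosen arbitrarily.

```latex
\[
\frac{m^7 n^7 / \epsilon^7}{n^4 / \epsilon^2} \;=\; \frac{m^7 n^3}{\epsilon^5},
\qquad\text{e.g. } m = n = 10,\ \epsilon = 0.1:\quad
\frac{10^7 \cdot 10^3}{10^{-5}} = 10^{15}.
\]
```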

Paper Structure (18 sections, 5 theorems, 27 equations, 1 figure, 4 algorithms)

Introduction
Overview of our FPRAS
Preliminaries
A New Notation: Distribution Entanglement
A Faster FPRAS
Approximating Union of Sets
Sampling Subroutine
Main Algorithm
Technical Analysis
Correctness of Algorithm \ref{algo:approx-delphic}
Correctness of Algorithm \ref{algo:sampling}
Correctness of Algorithm \ref{algo:main}
Proof of Theorem \ref{thm:main}
Conclusions and Future Work
Details from Section \ref{sec:mainalgo}
...and 3 more sections

Key Result

Theorem 1

Let $\epsilon, \delta > 0$, and let $\Omega$ be some set. Let $T_1, T_2, \ldots, T_k \subseteq \Omega$ be sets with membership oracles $O_1, \ldots, O_k$ respectively. Let $\epsilon_\textsf{sz} \geq 0$ and let $\textsf{sz}_1, \ldots, \textsf{sz}_k \in \mathbb{N}$ be such that for every $i \leq k$, w[…] The algorithm makes $O\left(k\cdot (1+\epsilon_{\textsf{sz}})^2 \cdot \frac{1}{\epsilon^2} \cdot \ldots\right)$ […]
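The setting of Theorem 1 (sets accessed through membership oracles, plus a size estimate for each set) is the one in which the classical Karp-Luby Monte Carlo estimator for union size operates. The sketch below shows that classical estimator, not the paper's AppUnion, whose details are not reproduced here. It assumes exact sizes (that is, $\epsilon_\textsf{sz} = 0$), uses explicit Python sets as stand-ins for the oracles and for uniform samplers, and assumes at least one set is non-empty. The function name and example sets are hypothetical.

```python
import random


def estimate_union_size(sets, sizes, trials, rng):
    """Karp-Luby style estimate of |T_1 ∪ ... ∪ T_k|.

    sets   : list of Python sets (stand-ins for oracles and samplers)
    sizes  : sizes[i] is the (here exact) size of sets[i]
    trials : number of Monte Carlo trials
    rng    : a random.Random instance
    """
    total = sum(sizes)
    pools = [sorted(s) for s in sets]          # lets us draw a uniform element of T_i
    oracles = [s.__contains__ for s in sets]   # membership oracles O_i
    hits = 0
    for _ in range(trials):
        # Pick a set with probability proportional to its size, then a uniform
        # element of it, so every (i, x) pair with x in T_i is equally likely.
        i = rng.choices(range(len(sets)), weights=sizes)[0]
        x = rng.choice(pools[i])
        # Count x only if T_i is the first set containing it, so each element
        # of the union is counted for exactly one index.
        if not any(oracles[j](x) for j in range(i)):
            hits += 1
    return total * hits / trials


# Hypothetical example: three overlapping ranges whose union is {0, ..., 119}.
sets = [set(range(0, 60)), set(range(40, 100)), set(range(90, 120))]
est = estimate_union_size(sets, [len(s) for s in sets], 20000, random.Random(0))

if __name__ == "__main__":
    print(est)  # should be close to the true union size, 120
```

With exact sizes the estimator is unbiased, since the expected fraction of hits is the union size divided by the sum of the set sizes. With approximate sizes the selection step is only approximately proportional, which is the role $\epsilon_\textsf{sz}$ plays in the theorem statement.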

Figures (1)

Figure 1: Algorithm Template: FPRAS for #NFA

Theorems & Definitions (10)

Theorem 1
Theorem 2
Theorem 3
Lemma 4
Lemma 5
Claim 5
Claim 6
Claim 6
Claim 6
Claim 6
