Engineering an Efficient Approximate DNF-Counter
Mate Soos, Uddalok Sarkar, Divesh Aggarwal, Sourav Chakraborty, Kuldeep S. Meel, Maciej Obremski
TL;DR
This work addresses the problem of approximately counting the solutions of DNF formulas (#DNF), a problem that is #P-complete to solve exactly. The authors introduce pepin, a practically efficient FPRAS that replaces the theoretical Binomial-based sampling of prior streaming approaches with a Poisson-based, lazy sampling scheme, augmented by engineering optimizations. They prove correctness with Chernoff-type guarantees and derive a tight time bound. In extensive experiments, pepin runs up to 40x faster than previous state-of-the-art methods, and its observed error is substantially lower than the theoretical bound. The results have practical implications for probabilistic databases and network reliability analysis, where they enable scalable and reliable count estimation on large DNF formulas.
Abstract
Model counting is a fundamental problem in many practical applications, including query evaluation in probabilistic databases and failure-probability estimation of networks. In this work, we focus on the variant of this problem in which the underlying formula is expressed in Disjunctive Normal Form (DNF); this variant is known as #DNF. The problem has been shown to be #P-complete, so solving it exactly is often intractable. Much research has therefore focused on obtaining approximate solutions, particularly in the form of $(\varepsilon, \delta)$ approximations. The primary contribution of this paper is pepin, a new approximate #DNF counter that significantly outperforms prior state-of-the-art approaches. Our work builds on a recent breakthrough on the union of sets in the streaming model. We demonstrate the effectiveness of our approach through extensive experiments and show that it provides an affirmative answer to the challenge of efficiently computing #DNF.
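To make the problem statement concrete, the following minimal Python sketch counts the solutions of a small DNF formula exactly by enumeration, and then approximately with the classic Karp-Luby Monte Carlo estimator. It is not pepin's Poisson-based lazy sampling scheme. The function names, the signed-integer cube encoding, and the example formula are assumptions made purely for illustration. An $(\varepsilon, \delta)$ approximation is an estimate that lies within a factor $(1 \pm \varepsilon)$ of the true count with probability at least $1 - \delta$.

```python
import itertools
import random


def satisfies(assignment, cube):
    """True if the assignment satisfies every literal of the cube.

    assignment[i] holds the value of variable i + 1; a literal v > 0 means
    "variable v is true" and -v means "variable v is false" (assumed encoding).
    """
    return all(assignment[abs(lit) - 1] == (lit > 0) for lit in cube)


def count_dnf_exact(cubes, n_vars):
    """Exact #DNF by enumerating all 2^n assignments (tiny inputs only)."""
    return sum(
        any(satisfies(a, cube) for cube in cubes)
        for a in itertools.product([False, True], repeat=n_vars)
    )


def count_dnf_karp_luby(cubes, n_vars, samples, rng):
    """Karp-Luby style estimate of #DNF.

    Assumes each cube is consistent and mentions a variable at most once,
    so a cube with k literals has exactly 2^(n_vars - k) solutions.
    """
    weights = [2 ** (n_vars - len(cube)) for cube in cubes]
    total = sum(weights)
    hits = 0
    for _ in range(samples):
        # Pick a cube with probability proportional to its solution count.
        i = rng.choices(range(len(cubes)), weights=weights)[0]
        # Draw a uniform solution of that cube: random values, then fix its literals.
        a = [rng.random() < 0.5 for _ in range(n_vars)]
        for lit in cubes[i]:
            a[abs(lit) - 1] = lit > 0
        # Count the sample only if cube i is the first cube it satisfies,
        # so every solution of the formula is counted under exactly one cube.
        if not any(satisfies(a, cube) for cube in cubes[:i]):
            hits += 1
    return total * hits / samples


# (x1 AND x2) OR (NOT x1 AND x3) OR (x2 AND x3) over three variables.
cubes = [[1, 2], [-1, 3], [2, 3]]
n_vars = 3
exact = count_dnf_exact(cubes, n_vars)
estimate = count_dnf_karp_luby(cubes, n_vars, samples=20000, rng=random.Random(0))
print(exact, estimate)
```

Enumeration takes time exponential in the number of variables, which is why approximate counters are of interest; the sampling estimator instead needs a number of samples that depends on the number of cubes and on the requested accuracy.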
