Tractable Circuits in Database Theory
Antoine Amarilli, Florent Capelli
TL;DR
This paper surveys how tractable circuit classes from knowledge compilation can be used to solve a broad range of database tasks. It shows how Boolean provenance and answer functions for MSO and CQ/UCQ queries can be represented by DNNF/SDNNF circuits satisfying determinism and structuredness conditions, which enables PTIME probabilistic query evaluation (PQE), approximate PQE via an FPRAS, and efficient enumeration and ranking. It also connects these Boolean representations to multivalued representations via factorized databases, which provide circuit-based encodings of query answers and provenance, and it describes practical toolchains and algorithmic strategies (DPLL-based and bottom-up compilers). The work highlights two main directions: using circuits to obtain a unified, modular tractability framework across diverse tasks, and using factorized circuits to represent large answer sets succinctly. It concludes with open questions such as the intensional-extensional conjecture for UCQs, incremental maintenance, and the extension of circuit methods to new database paradigms.
Abstract
This work reviews how database theory uses tractable circuit classes from knowledge compilation. We present relevant query evaluation tasks and notions of tractable circuits. We then show how these tractable circuits can be used to address database tasks. We first focus on Boolean provenance and its applications to aggregation tasks, in particular probabilistic query evaluation. We study these for Monadic Second-Order (MSO) queries on trees, and for safe Conjunctive Queries (CQs) and Unions of Conjunctive Queries (UCQs). We also study circuit representations of query answers and their applications to enumeration tasks, both in the Boolean setting (for MSO) and in the multivalued setting (for CQs and UCQs).
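To illustrate why tractable circuits help with probabilistic query evaluation, the following minimal Python sketch computes the probability of a Boolean provenance circuit in one bottom-up pass. It assumes the circuit is in negation normal form, that AND gates are decomposable (their children mention disjoint variables), and that OR gates are deterministic (their children are mutually exclusive). Under these assumptions, AND becomes a product and OR becomes a sum. The class names, the `probability` function, and the example circuit are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str  # a Boolean variable, e.g. the presence of one database tuple


@dataclass(frozen=True)
class Not:
    child: Var  # negation is applied to variables only (negation normal form)


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]  # assumed decomposable: disjoint variable sets


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]  # assumed deterministic: mutually exclusive


Node = Union[Var, Not, And, Or]


def probability(node: Node, prob: Dict[str, float],
                cache: Optional[Dict[int, float]] = None) -> float:
    """Probability that the circuit is true when each variable x is
    independently true with probability prob[x]. Correct only under the
    decomposability and determinism assumptions stated above."""
    if cache is None:
        cache = {}
    key = id(node)  # memoize shared gates so each gate is evaluated once
    if key in cache:
        return cache[key]
    if isinstance(node, Var):
        p = prob[node.name]
    elif isinstance(node, Not):
        p = 1.0 - prob[node.child.name]
    elif isinstance(node, And):
        p = 1.0
        for child in node.children:  # independent children: multiply
            p *= probability(child, prob, cache)
    elif isinstance(node, Or):
        # mutually exclusive children: add
        p = sum(probability(child, prob, cache) for child in node.children)
    else:
        raise TypeError(f"unknown gate type: {type(node).__name__}")
    cache[key] = p
    return p


# Hypothetical provenance circuit: (x AND y) OR (NOT x AND z).
# The two disjuncts disagree on x, so the OR gate is deterministic.
x, y, z = Var("x"), Var("y"), Var("z")
circuit = Or((And((x, y)), And((Not(x), z))))
p = probability(circuit, {"x": 0.5, "y": 0.4, "z": 0.2})
print(p)
```

The pass visits each gate once, so the running time is linear in the circuit size, up to the cost of arithmetic. On a circuit that violates either assumption, the function still returns a number, but that number is in general not the correct probability.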
