Hardness of Random Reordered Encodings of Parity for Resolution and CDCL
Leroy Chew, Alexis de Colnet, Friedrich Slivovsky, Stefan Szeider
TL;DR
Parity reasoning is notoriously hard for CDCL and resolution when encoded with random variable orders. The authors prove that the formulas $\mathsf{rPar}(n,\sigma)$ and a related family $\mathsf{rAddPar}(a,b)$ are Tseitin formulas whose Tseitin graphs have $tw = \Omega(n)$ with high probability, implying exponential resolution refutations. They also show that DRAT$^-$ and Extended Resolution proofs exist with $O(n\log n)$ lines, while Gaussian-elimination-inspired reasoning captures practical upper bounds. Experiments corroborate the link between treewidth and CDCL solving time, with running time growing exponentially in $tw$, whereas XOR-aware solvers can handle many instances quickly.
Abstract
Parity reasoning is challenging for Conflict-Driven Clause Learning (CDCL) SAT solvers. This has been observed even for simple formulas encoding two contradictory parity constraints with different variable orders (Chew and Heule 2020). We provide an analytical explanation for their hardness by showing that they require exponential resolution refutations with high probability when the variable order is chosen at random. We obtain this result by proving that these formulas, which are known to be Tseitin formulas, have Tseitin graphs of linear treewidth with high probability. Since such Tseitin formulas require exponential resolution proofs, our result follows. We generalize this argument to a new class of formulas that capture a basic form of parity reasoning involving a sum of two random parity constraints with random orders. Even when the variable order for the sum is chosen favorably, these formulas remain hard for resolution. In contrast, we prove that they have short DRAT refutations. We show experimentally that the running time of CDCL SAT solvers on both classes of formulas grows exponentially with their treewidth.
