TurboSAT: Gradient-Guided Boolean Satisfiability Accelerated on GPU-CPU Hybrid System
Steve Dai, Cunxi Yu, Kalyan Krishnamani, Brucek Khailany
TL;DR
This work tackles accelerating Boolean SAT solving on heterogeneous GPU-CPU systems by reformulating SAT as a differentiable binarized matrix-multiplication task. It introduces a binary-minary matrix encoding with $P \in \{0,1\}^{C \times 2V}$ and $A \in \{0,1\}^{2V \times N}$, producing $R=PA$ and a differentiable $S$ via a soft minimum to drive a loss $\mathcal{L} = -\sum_i S_i$, with gradients flowing through a STE-enabled binarization of $A_{real}$. The solver runs a gradient-guided exploration on the GPU to propose promising partial assignments and then uses CDCL-based refinement on the CPU to complete solutions, seeded by high-confidence variables extracted from gradient magnitudes. Empirically, on satisfiable benchmarks from the 2024 SAT Competition, the approach achieves substantial speedups over CPU-only solvers (up to over $200\times$) and an average speedup of about $27.3\times$, validating the viability and value of GPU-accelerated differentiable SAT on hybrid hardware.
Abstract
While accelerated computing has transformed many domains of computing, its impact on logical reasoning, specifically Boolean satisfiability (SAT), remains limited. State-of-the-art SAT solvers rely heavily on inherently sequential conflict-driven search algorithms that offer powerful heuristics but limit the amount of parallelism that could otherwise enable significantly more scalable SAT solving. Inspired by neural network training, we formulate the SAT problem as a binarized matrix-matrix multiplication layer that could be optimized using a differentiable objective function. Enabled by this encoding, we combine the strengths of parallel differentiable optimization and sequential search to accelerate SAT on a hybrid GPU-CPU system. In this system, the GPUs leverage parallel differentiable solving to rapidly evaluate SAT clauses and use gradients to stochastically explore the solution space and optimize variable assignments. Promising partial assignments generated by the GPUs are post-processed on many CPU threads which exploit conflict-driven sequential search to further traverse the solution subspaces and identify complete assignments. Prototyping the hybrid solver on an NVIDIA DGX GB200 node, our solver achieves runtime speedups up to over 200x when compared to a state-of-the-art CPU-based solver on public satisfiable benchmark problems from the SAT Competition.
