Table of Contents
Fetching ...

Neural Approaches to SAT Solving: Design Choices and Interpretability

David Mojžíšek, Jan Hůla, Ziwei Li, Ziyu Zhou, Mikoláš Janota

TL;DR

This work examines neural SAT solving with graph neural networks, focusing on how design choices affect generalization to unseen instances. It shows that a variable-clause graph (VCG) with recurrent updates, combined with a closest assignment training objective, achieves strong SAT assignment accuracy and scalable inference via extra iterations or resampling. The authors extend the base model with a diffusion-based assignment generator, connecting the approach to continuous relaxation views of MaxSAT and enabling effective test-time adaptation, especially when interleaved with unit propagation. Analyses of embedding spaces and optimization trajectories reveal that the GNNimplicitly performs a continuous relaxation-like optimization, offering an interpretable lens on neural reasoning for combinatorial tasks and guiding future hybrid solver designs.

Abstract

In this contribution, we provide a comprehensive evaluation of graph neural networks applied to Boolean satisfiability problems, accompanied by an intuitive explanation of the mechanisms enabling the model to generalize to different instances. We introduce several training improvements, particularly a novel closest assignment supervision method that dynamically adapts to the model's current state, significantly enhancing performance on problems with larger solution spaces. Our experiments demonstrate the suitability of variable-clause graph representations with recurrent neural network updates, which achieve good accuracy on SAT assignment prediction while reducing computational demands. We extend the base graph neural network into a diffusion model that facilitates incremental sampling and can be effectively combined with classical techniques like unit propagation. Through analysis of embedding space patterns and optimization trajectories, we show how these networks implicitly perform a process very similar to continuous relaxations of MaxSAT, offering an interpretable view of their reasoning process. This understanding guides our design choices and explains the ability of recurrent architectures to scale effectively at inference time beyond their training distribution, which we demonstrate with test-time scaling experiments.

Neural Approaches to SAT Solving: Design Choices and Interpretability

TL;DR

This work examines neural SAT solving with graph neural networks, focusing on how design choices affect generalization to unseen instances. It shows that a variable-clause graph (VCG) with recurrent updates, combined with a closest assignment training objective, achieves strong SAT assignment accuracy and scalable inference via extra iterations or resampling. The authors extend the base model with a diffusion-based assignment generator, connecting the approach to continuous relaxation views of MaxSAT and enabling effective test-time adaptation, especially when interleaved with unit propagation. Analyses of embedding spaces and optimization trajectories reveal that the GNNimplicitly performs a continuous relaxation-like optimization, offering an interpretable lens on neural reasoning for combinatorial tasks and guiding future hybrid solver designs.

Abstract

In this contribution, we provide a comprehensive evaluation of graph neural networks applied to Boolean satisfiability problems, accompanied by an intuitive explanation of the mechanisms enabling the model to generalize to different instances. We introduce several training improvements, particularly a novel closest assignment supervision method that dynamically adapts to the model's current state, significantly enhancing performance on problems with larger solution spaces. Our experiments demonstrate the suitability of variable-clause graph representations with recurrent neural network updates, which achieve good accuracy on SAT assignment prediction while reducing computational demands. We extend the base graph neural network into a diffusion model that facilitates incremental sampling and can be effectively combined with classical techniques like unit propagation. Through analysis of embedding space patterns and optimization trajectories, we show how these networks implicitly perform a process very similar to continuous relaxations of MaxSAT, offering an interpretable view of their reasoning process. This understanding guides our design choices and explains the ability of recurrent architectures to scale effectively at inference time beyond their training distribution, which we demonstrate with test-time scaling experiments.

Paper Structure

This paper contains 67 sections, 17 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: LCG and VCG of the CNF formula $(\overline{x}_1 \lor x_2) \land (x_2 \lor \overline{x}_3) \land (x_1 \lor x_3)$.
  • Figure 2: Percentage of SAT instances solved as message passing iterations increase for a model trained on SR40 with SAT+UNSAT closest assignment supervision. Left: Performance on SR100, showing rapid initial improvement. Right: Comparison across benchmarks, demonstrating effectiveness decreases with problem size but benefits from additional iterations, highlighting the recurrent architecture's inference-time scaling capability.
  • Figure 3: Average gap (unsatisfied clauses) reduction with increasing message passing iterations for a model trained on SR40 with SAT+UNSAT closest assignment supervision. Left: Comparison across benchmarks showing extremely rapid gap reduction in early iterations for all problem sizes, with all benchmarks achieving remarkably low average gaps despite varying SAT-solving performance. Right: Individual instance trajectories revealing different convergence patterns between SAT and UNSAT instances, with occasional fluctuations suggesting potential benefit from monitoring solution quality during inference.
  • Figure 4: Performance heatmaps for a model trained on SR40 with SAT+UNSAT closest assignment supervision, showing how metrics improve with both increased iterations (columns) and resampling attempts (rows). Testing on SR40 (top), SR100 (middle), and 3SAT100 (bottom) demonstrates significant gains from both scaling dimensions—e.g., SR40 decision accuracy improves from 84% (1 sample, 25 iterations) to 93% (5 samples, 125 iterations). This two-dimensional inference-time scaling capability is consistent across benchmarks but with decreasing returns on larger problems.
  • Figure 5: Evolution of variable embeddings during message passing iterations for a satisfiable SR40 instance. The visualization shows 2D projections at different stages (Initial through Iteration 25), colored k-means algorithm in each iteration (green/red). Initially random, embeddings gradually organize into two distinct clusters often corresponding to optimal variable assignments. This clustering behavior was observed across different model architectures and training objectives—notably, even models trained solely for SAT/UNSAT classification (without explicit assignment supervision) develop this embedding separation. This phenomenon supports our interpretation that GNNs implicitly perform continuous optimization similar to SDP relaxation for SAT problems.
  • ...and 2 more figures