Table of Contents
Fetching ...

GraSS: Combining Graph Neural Networks with Expert Knowledge for SAT Solver Selection

Zhanguang Zhang, Didier Chetelat, Joseph Cotnareanu, Amur Ghose, Wenyi Xiao, Hui-Ling Zhen, Yingxue Zhang, Jianye Hao, Mark Coates, Mingxuan Yuan

TL;DR

GraSS presents a novel graph neural network framework for SAT solver selection by representing CNF instances as literal-clause graphs, enriching nodes with domain-specific features, and adding clause positional encodings to capture input-order effects. The heterogeneous GNN operates on multiple edge types and optimizes a runtime-aware loss to directly minimize solve times, outperforming seven competitive solvers on industrial and competition benchmarks. Ablation studies confirm the value of the tripartite representation, clause encodings, and feature design, while highlighting GraSS's robustness to mis-predictions and its potential to set a new standard for SAT solver selection. This approach enables more efficient solver portfolios in real-world circuit design and large-scale SAT tasks.

Abstract

Boolean satisfiability (SAT) problems are routinely solved by SAT solvers in real-life applications, yet solving time can vary drastically between solvers for the same instance. This has motivated research into machine learning models that can predict, for a given SAT instance, which solver to select among several options. Existing SAT solver selection methods all rely on some hand-picked instance features, which are costly to compute and ignore the structural information in SAT graphs. In this paper we present GraSS, a novel approach for automatic SAT solver selection based on tripartite graph representations of instances and a heterogeneous graph neural network (GNN) model. While GNNs have been previously adopted in other SAT-related tasks, they do not incorporate any domain-specific knowledge and ignore the runtime variation introduced by different clause orders. We enrich the graph representation with domain-specific decisions, such as novel node feature design, positional encodings for clauses in the graph, a GNN architecture tailored to our tripartite graphs and a runtime-sensitive loss function. Through extensive experiments, we demonstrate that this combination of raw representations and domain-specific choices leads to improvements in runtime for a pool of seven state-of-the-art solvers on both an industrial circuit design benchmark, and on instances from the 20-year Anniversary Track of the 2022 SAT Competition.

GraSS: Combining Graph Neural Networks with Expert Knowledge for SAT Solver Selection

TL;DR

GraSS presents a novel graph neural network framework for SAT solver selection by representing CNF instances as literal-clause graphs, enriching nodes with domain-specific features, and adding clause positional encodings to capture input-order effects. The heterogeneous GNN operates on multiple edge types and optimizes a runtime-aware loss to directly minimize solve times, outperforming seven competitive solvers on industrial and competition benchmarks. Ablation studies confirm the value of the tripartite representation, clause encodings, and feature design, while highlighting GraSS's robustness to mis-predictions and its potential to set a new standard for SAT solver selection. This approach enables more efficient solver portfolios in real-world circuit design and large-scale SAT tasks.

Abstract

Boolean satisfiability (SAT) problems are routinely solved by SAT solvers in real-life applications, yet solving time can vary drastically between solvers for the same instance. This has motivated research into machine learning models that can predict, for a given SAT instance, which solver to select among several options. Existing SAT solver selection methods all rely on some hand-picked instance features, which are costly to compute and ignore the structural information in SAT graphs. In this paper we present GraSS, a novel approach for automatic SAT solver selection based on tripartite graph representations of instances and a heterogeneous graph neural network (GNN) model. While GNNs have been previously adopted in other SAT-related tasks, they do not incorporate any domain-specific knowledge and ignore the runtime variation introduced by different clause orders. We enrich the graph representation with domain-specific decisions, such as novel node feature design, positional encodings for clauses in the graph, a GNN architecture tailored to our tripartite graphs and a runtime-sensitive loss function. Through extensive experiments, we demonstrate that this combination of raw representations and domain-specific choices leads to improvements in runtime for a pool of seven state-of-the-art solvers on both an industrial circuit design benchmark, and on instances from the 20-year Anniversary Track of the 2022 SAT Competition.
Paper Structure (32 sections, 5 equations, 4 figures, 7 tables)

This paper contains 32 sections, 5 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: The workflow of our method. SAT instances are represented as literal-clause graphs with hand-designed attributes. Rounds of heterogeneous graph convolutions are applied, which modify the attributes. The attributes of the clause and variable nodes are then averaged, before being fed to a linear layer followed by a softmax over the various solvers. The convolutions and the linear layer are trained to minimize by gradient descent a runtime-sensitive classification loss computed from runtimes collected on training SAT instances.
  • Figure 2: Literal-clause graph representation of a SAT instance used in this work. The instance is converted to CNF form, and nodes represent positive literals, negative literals and clauses. Edges are drawn between clauses and literal nodes if the literal participate in the clause, and edges are also drawn between positive and negative nodes of the same variable. The nodes are endowed with feature vectors described in Appendix \ref{['sec:node-features']}.
  • Figure 3: Runtime variation introduced by permutation. Thirty SAT instances are randomly sampled from SAT Competition data, and the order of variables (left) and clauses (right) are shuffled twenty times. The instances are then solved by Kissat 3.0 with a 5,000s cutoff, and the collected runtimes are plotted in ascending order of mean runtime.
  • Figure 4: Cost of wrong prediction: the runtime difference between predicted solver and the optimal solver, when the selector has made a mistake. Average across five folds are shown, with standard deviation as error bar. Lower is better.