Towards Comprehensive Sampling of SMT Solutions
Shuangyu Lyu, Chuan Luo, Ruizhi Shi, Wei Wu, Chanjuan Liu, Chunming Hu
TL;DR
PanSampler addresses the SMT sampling problem for the BV, ABV, and AUFBV logics by targeting high abstract syntax tree (AST) coverage with a small solution set. It introduces three innovations (a diversity-aware SMT algorithm, AST-guided scoring, and post-sampling optimization) within an iterative framework that combines a lazy SMT solver with local search. Empirical results on large- and medium-scale benchmarks show that PanSampler achieves higher coverage with fewer solutions and greater efficiency than state-of-the-art competitors. It also extends to real-world software testing, where it improves fault-detection rates while reducing the number of test cases. The work advances SMT sampling by directly optimizing coverage efficiency, yielding practical benefits for software testing and hardware verification on challenging, large-scale formulas.
Abstract
This work focuses on effectively generating diverse solutions for satisfiability modulo theories (SMT) formulas over the theories of bit-vectors, arrays, and uninterpreted functions, a critical task in software and hardware testing. Diverse SMT solutions help uncover faults and detect safety violations during verification and testing. This motivates the SMT sampling problem: constructing a small number of solutions that together achieve comprehensive coverage of the constraint space. While high coverage is crucial for exploring system behaviors, keeping the number of solutions small is equally important, because excessive solutions increase testing time and resource usage. In this work, we introduce PanSampler, a novel SMT sampler that achieves high coverage with a small number of solutions. It incorporates three novel techniques: a diversity-aware SMT algorithm, an abstract syntax tree (AST)-guided scoring function, and a post-sampling optimization technique. PanSampler iteratively samples solutions, evaluates candidates, and employs local search to refine solutions. Extensive experiments on practical benchmarks demonstrate that PanSampler is significantly more capable of reaching high target coverage and requires fewer solutions than current samplers to achieve the same coverage level. Furthermore, our empirical evaluation on practical subjects collected from real-world software systems shows that PanSampler achieves higher fault detection capability and reduces the number of test cases required to reach the same fault detection effectiveness by 32.6% to 76.4%, a substantial improvement in testing efficiency. PanSampler thus advances SMT sampling and reduces the cost of software testing and hardware verification.
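
The sample, score, and prune loop described in the abstract can be illustrated with a toy sketch. This is not PanSampler's algorithm. It assumes a hypothetical coverage notion in which each (AST node, bit position, bit value) triple counts as one coverage target, replaces the SMT solver with brute-force enumeration over two 4-bit variables, and uses a random candidate pool in place of the diversity-aware solver and local search. All names (NODES, satisfies, coverage, sample, prune) are invented for illustration.

```python
import random

WIDTH = 4                      # assumed bit width of the toy variables
MASK = (1 << WIDTH) - 1

# Hypothetical "AST nodes" of a toy formula: each maps (x, y) to a bit-vector value.
NODES = [
    lambda x, y: x,
    lambda x, y: y,
    lambda x, y: (x + y) & MASK,
    lambda x, y: x ^ y,
]

def satisfies(x, y):
    """Toy constraint standing in for an SMT formula."""
    return ((x + y) & MASK) > 3 and x != y

def coverage(sol):
    """Set of (node, bit, value) triples exercised by one solution."""
    x, y = sol
    return {(i, b, (node(x, y) >> b) & 1)
            for i, node in enumerate(NODES) for b in range(WIDTH)}

def sample(pool_size=8, seed=0):
    """Greedily add the candidate with the highest coverage gain."""
    rng = random.Random(seed)
    # Brute-force enumeration replaces the SMT solver in this sketch.
    solutions = [(x, y) for x in range(1 << WIDTH)
                 for y in range(1 << WIDTH) if satisfies(x, y)]
    chosen, covered = [], set()
    while True:
        pool = rng.sample(solutions, pool_size)
        best = max(pool, key=lambda s: len(coverage(s) - covered))
        if not coverage(best) - covered:
            break              # no candidate in this pool adds coverage
        chosen.append(best)
        covered |= coverage(best)
    return chosen, covered

def prune(chosen):
    """Post-sampling step: drop solutions whose coverage the rest already provide."""
    kept = list(chosen)
    for s in list(kept):
        rest = [t for t in kept if t != s]
        if coverage(s) <= set().union(*(coverage(t) for t in rest)):
            kept = rest
    return kept

if __name__ == "__main__":
    chosen, covered = sample()
    pruned = prune(chosen)
    print(len(chosen), "sampled,", len(pruned), "kept,", len(covered), "targets covered")
```

The greedy step favors candidates that cover targets not yet reached, which keeps the solution set small, and the pruning pass removes any solution made redundant by later picks without lowering coverage.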
