Efficient Heuristics and Exact Methods for Pairwise Interaction Sampling
Sándor P. Fekete, Phillip Keldenich, Dominik Krupke, Michael Perk
TL;DR
This work studies pairwise interaction sampling in large configurable software spaces, formulating the $t$-wise interaction sampling problem ($t$-Isp) on CNF formulas and focusing on $t=2$. It proves BH-hardness and places $t$-Isp in $\mathrm{P}^{\mathrm{NP}[\log]}$, using logarithmically many SAT calls, while also presenting practical methods that scale to hundreds of millions of feasible interactions. The core contributions are an efficient initial heuristic, a suite of lower-bound tools (including a cut-and-price approach), and Sammy, a parallel Large Neighborhood Search (LNS) framework that combines a core SAT model with multiple incremental strategies to compute provably optimal samples for large benchmarks. Extensive experiments show improvements in both solution quality and runtime, including provable optimality on the four largest PSPL Scalability Challenge instances and gains over SampLNS on a broad benchmark, which underlines the practical relevance for software testing in highly configurable systems. The work also provides a detailed preprocessing and implementation toolkit (universe reduction, universe pruning, and advanced LNS components) that enables scalable exact and near-exact solving of $2$-Isp in realistic settings.
Abstract
We consider a class of optimization problems that are fundamental to testing in modern configurable software systems, e.g., in automotive industries. In pairwise interaction sampling, we are given a (potentially very large) configuration space, in which each dimension corresponds to a possible Boolean feature of a software system; valid configurations are the satisfying assignments of a given propositional formula $\varphi$. The objective is to find a minimum-sized family of configurations, such that each pair of features is jointly tested at least once. Due to its relevance in Software Engineering, this problem has been studied extensively for over 20 years. In addition to new theoretical insights (we prove BH-hardness), we provide a broad spectrum of key contributions on the practical side that allow substantial progress in practical performance. Remarkably, we are able to solve the largest instances we found in published benchmark sets (with about 500,000,000 feasible interactions) to provable optimality. Previous approaches were not even able to compute feasible solutions for these instances.
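To make the problem statement concrete, the following is a minimal brute-force sketch of pairwise interaction sampling on a toy formula. It is not the method of this paper, and it only works for a handful of features. The function names, the signed-integer clause encoding, and the example formula are illustrative assumptions. An interaction is taken here to be a pair of literals on two distinct features, and it is feasible if some valid configuration contains both.

```python
from itertools import combinations, product


def satisfies(assignment, cnf):
    """True if the assignment (tuple of bools, index i = feature i+1)
    satisfies every clause; clauses are lists of signed feature indices."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def valid_configurations(n, cnf):
    """Enumerate all satisfying assignments over n features (exponential)."""
    return [a for a in product([False, True], repeat=n) if satisfies(a, cnf)]


def interactions(assignment):
    """All pairs of literals on distinct features that hold in the assignment."""
    lits = [i + 1 if value else -(i + 1) for i, value in enumerate(assignment)]
    return set(combinations(lits, 2))


def feasible_interactions(n, cnf):
    """Pairs of literals that occur together in at least one valid configuration."""
    feasible = set()
    for a in valid_configurations(n, cnf):
        feasible |= interactions(a)
    return feasible


def covers(sample, n, cnf):
    """True if the sample covers every feasible pairwise interaction."""
    covered = set()
    for a in sample:
        covered |= interactions(a)
    return feasible_interactions(n, cnf) <= covered


def minimum_sample(n, cnf):
    """Smallest covering family of valid configurations, by exhaustive search."""
    configs = valid_configurations(n, cnf)
    target = feasible_interactions(n, cnf)
    for k in range(1, len(configs) + 1):
        for sample in combinations(configs, k):
            if set().union(*(interactions(a) for a in sample)) >= target:
                return list(sample)
    return []


# Toy formula (assumed for illustration): feature 1 requires feature 2.
toy_cnf = [[-1, 2]]
toy_sample = minimum_sample(3, toy_cnf)
```

For this toy formula over three features, 11 of the 12 literal pairs are feasible (only "feature 1 on, feature 2 off" is excluded), and the exhaustive search returns a sample of 5 configurations. Both the enumeration of configurations and the search over subsets are exponential, which is why instances with hundreds of millions of feasible interactions call for the heuristic, lower-bound, and LNS techniques summarized above.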
