PALM: Path-aware LLM-based Test Generation with Comprehension
Yaoxuan Wu, Xiaojie Zhou, Ahmad Humayun, Muhammad Ali Gulzar, Miryung Kim
TL;DR
PALM addresses the gap between symbolic path enumeration and LLM-based test generation by constructing path-specific executable variants with embedded assertions and using LLMs to generate targeted inputs. It combines AST-level path extraction, function inlining, variable renaming, and constant propagation to create self-contained path variants, then validates LLM-generated tests by runtime execution on those variants. An interactive frontend visualizes the symbolic execution tree and path coverage, enabling users to inspect and refine tests for specific paths. Evaluations on 124 HumanEval-Java programs show PALM achieves substantial gains in path coverage over direct LLM prompting, while outperforming Symbolic PathFinder in scenarios with external API calls; a within-subject user study indicates PALM improves users' understanding of coverage and path-to-test alignment. The work demonstrates that integrating symbolic path enumeration with LLM-driven test generation and interactive visualization can enhance path-aware testing and reduce coverage gaps caused by API modeling limits.
Abstract
Symbolic execution is a widely used technique for test generation, offering systematic exploration of program paths through constraint solving. However, it is fundamentally limited by how well the target code, including library functions, can be modeled as symbolic constraints and by the capability of the underlying constraint solvers. As a result, many paths involving complex features remain unanalyzed or insufficiently modeled. Recent advances in large language models (LLMs) have shown promise in generating diverse and valid test inputs. Yet LLMs lack mechanisms for systematically enumerating program paths and often fail to cover subtle corner cases. We observe that directly prompting an LLM with the full program leads to missed coverage of interesting paths. In this paper, we present PALM, a test generation system that combines symbolic path enumeration with LLM-assisted test generation. PALM statically enumerates possible paths through AST-level analysis and transforms each path into an executable variant with embedded assertions that specify the target path. Constructing program variants that the LLM can interpret avoids the need to translate path constraints into SMT formulas. Importantly, PALM provides an interactive frontend that visualizes path coverage alongside the generated tests, organizing tests by the specific paths they exercise. A user study with 12 participants demonstrates that PALM's frontend helps users better understand path coverage and identify which paths are actually exercised by PALM-generated tests, by verifying and visualizing the tests' path profiles.
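To make the idea of a path-specific variant concrete, the following is a minimal sketch and not PALM's actual output. The function `classify`, the helper `checkPath`, and the class name `PathVariantSketch` are hypothetical names introduced only for illustration. The sketch assumes the target path takes the true branch of `x > 0` and the false branch of `x % 2 == 0`; each branch is replaced by a check of the condition that path requires, so a candidate input can be validated simply by running the variant. An explicit check is used in place of Java's `assert` statement so the sketch does not depend on the JVM being started with `-ea`.

```java
// Hypothetical sketch of a path-specific variant; names are illustrative only.
class PathVariantSketch {

    // Hypothetical function under test with three paths.
    static String classify(int x) {
        if (x > 0) {
            if (x % 2 == 0) {
                return "positive-even";
            }
            return "positive-odd";
        }
        return "non-positive";
    }

    // Assumption: an explicit check stands in for an embedded assertion,
    // so the behavior does not depend on the -ea JVM flag.
    static void checkPath(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("input left the target path at: " + description);
        }
    }

    // Variant for one path: (x > 0) is true, (x % 2 == 0) is false.
    // Branches are replaced by checks on the conditions this path requires.
    static String classifyPathVariant(int x) {
        checkPath(x > 0, "x > 0");
        checkPath(!(x % 2 == 0), "!(x % 2 == 0)");
        return "positive-odd";
    }

    // Validates a candidate input by executing the variant:
    // true if every embedded check holds, false otherwise.
    static boolean followsTargetPath(int candidate) {
        try {
            classifyPathVariant(candidate);
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] candidates = {7, 4, -3};
        for (int c : candidates) {
            System.out.println(c + " follows target path: " + followsTargetPath(c));
        }
    }
}
```

In this sketch, a candidate such as 7 passes both checks, while 4 and -3 are rejected, which mirrors how an LLM-proposed input could be accepted or discarded for a given path without invoking a constraint solver.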
