A Contemporary Survey of Large Language Model Assisted Program Analysis
Jiayimei Wang, Tao Ni, Wei-Bin Lee, Qingchuan Zhao
TL;DR
The paper surveys how Large Language Models (LLMs) are applied to program analysis, covering static, dynamic, and hybrid approaches to vulnerability and malware detection, program verification, and test generation. It highlights representative methods, such as LLift, IRIS, CoqPilot, LaM4Inv, CHEMFuzz, PentestGPT, and AutoSafeCoder, to illustrate how LLMs augment traditional static and dynamic analyses and enable more scalable, explainable reasoning. Its main contributions are a comprehensive taxonomy, a cross-domain synthesis of LLM-assisted techniques, and a discussion of challenges such as token limits, non-determinism, and hallucinations, along with future directions that emphasize deeper integration with formal methods and hybrid analysis frameworks. The work aims to guide security researchers and developers in deploying domain-specific LLMs to improve vulnerability detection, verification, and automated testing in real-world software systems, while outlining practical constraints and research opportunities for next-generation program-analysis tools.
Abstract
The increasing complexity of software systems has driven significant advancements in program analysis, as traditional methods are unable to meet the demands of modern software development. To address these limitations, deep learning techniques, particularly Large Language Models (LLMs), have gained attention due to their context-aware capabilities in code comprehension. Recognizing the potential of LLMs, researchers have extensively explored their application in program analysis since their introduction. Although surveys on LLM applications in cybersecurity exist, comprehensive reviews specifically addressing their role in program analysis remain scarce. In this survey, we systematically review the application of LLMs in program analysis, categorizing the existing work into static analysis, dynamic analysis, and hybrid approaches. Moreover, by examining and synthesizing recent studies, we identify future directions and challenges in the field. This survey aims to demonstrate the potential of LLMs in advancing program analysis practices and to offer actionable insights for security researchers seeking to enhance detection frameworks or develop domain-specific models.
