PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code

Ziyou Jiang, Lin Shi, Guowei Yang, Qing Wang

TL;DR

PatUntrack generates patch examples from issue reports (IRs) that lack tracked insecure code. Its auto-prompted LLM pipeline (i) extracts and completes a Vulnerability-Triggering Path (VTP) from the IR text, (ii) corrects hallucinations using external golden knowledge (VulCoK), and (iii) jointly generates Top-$K$ insecure code and patch examples guided by patch-type prediction. Experiments on 5,465 vulnerable IRs show improvements over baselines in both insecure code and patch generation, including +14.6% (Fix@10) on average over traditional LLM baselines in patch example generation. In a practical check on 76 newly disclosed vulnerable IRs, 27 of the 37 replies from IR authors confirmed that the generated patch examples were useful. The results indicate that IR textual descriptions can be leveraged to produce actionable patch examples, potentially accelerating vulnerability patching in open-source software (OSS). The work also contributes a complete data-and-code release and lays the groundwork for extending VTP-based guidance to broader automatic patch-generation tools.

Abstract

Security patches are essential for enhancing the stability and robustness of projects in the software community. While vulnerabilities are officially expected to be patched before being disclosed, patching vulnerabilities is complicated and remains a struggle for many organizations. To patch vulnerabilities, security practitioners typically track vulnerable issue reports (IRs) and analyze the relevant insecure code to generate potential patches. However, the relevant insecure code may not be explicitly specified, and practitioners then cannot track it in the repositories, which limits their ability to generate patches. In such cases, providing examples of insecure code and the corresponding patches would help security developers locate and fix the insecure code. In this paper, we propose PatUntrack to automatically generate patch examples from IRs without tracked insecure code. It auto-prompts Large Language Models (LLMs) to make them applicable to vulnerability analysis. It first generates a completed description of the Vulnerability-Triggering Path (VTP) from vulnerable IRs. Then, it corrects hallucinations in the VTP description with external golden knowledge. Finally, it generates Top-K pairs of Insecure Code and Patch Example based on the corrected VTP description. To evaluate the performance, we conducted experiments on 5,465 vulnerable IRs. The experimental results show that PatUntrack obtains the highest performance and improves on the traditional LLM baselines by +14.6% (Fix@10) on average in patch example generation. Furthermore, PatUntrack was applied to generate patch examples for 76 newly disclosed vulnerable IRs. Of the 37 replies from the authors of these IRs, 27 confirmed the usefulness of the patch examples generated by PatUntrack, indicating that they can benefit from these examples when patching the vulnerabilities.
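The three stages named in the abstract can be read as a simple chain of model calls. The following is a minimal sketch of that control flow only, not the authors' implementation: the `LLM` callable type, the prompt wording, the `PatchExample` class, and every function name are hypothetical, and the sketch generates each insecure-code/patch pair sequentially as a simplification.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class PatchExample:
    insecure_code: str
    patch: str


def generate_vtp(llm: LLM, issue_report: str) -> str:
    """Stage 1: ask the model for a completed VTP description of the IR."""
    return llm(
        "Describe the vulnerability-triggering path for this issue report:\n"
        + issue_report
    )


def correct_vtp(llm: LLM, vtp: str, knowledge: List[str]) -> str:
    """Stage 2: revise the VTP against external reference knowledge."""
    facts = "\n".join(f"- {fact}" for fact in knowledge)
    return llm(
        "Revise this VTP so it agrees with the facts below.\n"
        f"VTP: {vtp}\nFacts:\n{facts}"
    )


def generate_pairs(llm: LLM, vtp: str, k: int) -> List[PatchExample]:
    """Stage 3: produce k (insecure code, patch) pairs from the corrected VTP."""
    pairs = []
    for i in range(k):
        insecure = llm(
            f"Candidate {i + 1}: write insecure code matching this VTP:\n{vtp}"
        )
        patch = llm(f"Write a patch for this insecure code:\n{insecure}")
        pairs.append(PatchExample(insecure, patch))
    return pairs


def run_pipeline(
    llm: LLM, issue_report: str, knowledge: List[str], k: int = 3
) -> List[PatchExample]:
    """Chain the three stages in the order the abstract lists them."""
    vtp = generate_vtp(llm, issue_report)
    vtp = correct_vtp(llm, vtp, knowledge)
    return generate_pairs(llm, vtp, k)


def echo_llm(prompt: str) -> str:
    # Stand-in model for demonstration: returns the first line of the prompt.
    return prompt.splitlines()[0]


examples = run_pipeline(
    echo_llm,
    "Command injection via an unescaped hostname argument",
    ["Arguments passed to a shell must be escaped"],
    k=2,
)
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a real client can be wrapped in a function with the same signature and substituted for `echo_llm`.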

Paper Structure (33 sections, 4 equations, 9 figures, 7 tables, 1 algorithm)

Figures (9)

  • Figure 1: The preliminary study to analyze the time cost of patching and the exploited ratio of the vulnerable IRs.
  • Figure 2: The vulnerable IR and the generated insecure code & patch example (skoranga/node-dns-sync/issues/1).
  • Figure 3: The structure of PatUntrack.
  • Figure 4: The prompt format $P_0$ in PatUntrack.
  • Figure 5: The logic flow of VulCoK.
  • ...and 4 more figures