PredicateFix: Repairing Static Analysis Alerts with Bridging Predicates
Yuan-An Xiao, Weixuan Wang, Dong Liu, Junwei Zhou, Shengyu Cheng, Yingfei Xiong
TL;DR
PredicateFix addresses the brittleness of LLM-based repair of static-analysis alerts by using bridging predicates from analysis rules to identify high-quality key examples in a clean code corpus. It builds a retrieval-augmented repair pipeline that collects a corpus, identifies and prioritizes key examples, and prompts LLMs with these targeted demonstrations. Evaluation across six LLMs and two analyzers (CodeQL and GoInsight) shows that it increases the number of correct repairs by up to 69.3% and outperforms several baselines, including history- and similarity-based retrieval, with gains that hold across programming languages. The approach enables more reliable automated repair of rare or project-specific alerts, and the work provides a cross-language CVE dataset to support future research.
Abstract
Fixing static analysis alerts in source code with Large Language Models (LLMs) is becoming increasingly popular. However, LLMs often hallucinate and perform poorly on complex and less common alerts. Retrieval-augmented generation (RAG) aims to solve this problem by providing the model with a relevant example, but the examples retrieved by existing approaches are often of unsatisfactory quality. To address this challenge, we utilize the predicates in the analysis rule: they serve as a bridge between the alert and the relevant code snippets in a clean code corpus, which we call key examples. Based on this insight, we propose an algorithm that automatically retrieves key examples for an alert, and we build PredicateFix as a RAG pipeline to fix alerts from two static code analyzers: CodeQL and GoInsight. Evaluation with multiple LLMs shows that PredicateFix increases the number of correct repairs by 27.1% to 69.3%, significantly outperforming other baseline RAG approaches.
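The retrieval idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes, purely for illustration, that a bridging predicate can be approximated by a Python function over a snippet's text, whereas real CodeQL or GoInsight predicates are evaluated by the analyzer itself. All names here (Snippet, calls_sanitizer, uses_parameterized_query, find_key_examples, build_prompt) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Snippet:
    path: str
    code: str


# Hypothetical stand-ins for predicates taken from an analysis rule.
# A real analyzer evaluates these on program structure, not on raw text.
def calls_sanitizer(code: str) -> bool:
    return "html.escape(" in code


def uses_parameterized_query(code: str) -> bool:
    return "execute(" in code and "%s" in code and ", (" in code


BRIDGING_PREDICATES: Dict[str, Callable[[str], bool]] = {
    "calls_sanitizer": calls_sanitizer,
    "uses_parameterized_query": uses_parameterized_query,
}


def find_key_examples(
    corpus: List[Snippet],
    predicates: Dict[str, Callable[[str], bool]],
    top_k: int = 2,
) -> List[Snippet]:
    """Keep snippets on which at least one predicate holds, and rank
    them by how many predicates they satisfy (an assumed priority)."""
    scored = []
    for snippet in corpus:
        hits = sum(1 for pred in predicates.values() if pred(snippet.code))
        if hits > 0:
            scored.append((hits, snippet))
    # sorted() is stable, so ties keep their corpus order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:top_k]]


def build_prompt(alert: str, flagged_code: str, examples: List[Snippet]) -> str:
    """Assemble a repair prompt that shows the key examples first."""
    parts = [f"Alert: {alert}", "Relevant examples from clean code:"]
    for ex in examples:
        parts.append(f"# {ex.path}\n{ex.code}")
    parts.append("Code to fix:\n" + flagged_code)
    return "\n\n".join(parts)


# A tiny made-up "clean corpus" for demonstration only.
CORPUS = [
    Snippet("a.py", "print(user_input)"),
    Snippet("b.py", "out = html.escape(user_input)"),
    Snippet(
        "c.py",
        "cur.execute('SELECT * FROM t WHERE id = %s', (uid,))\n"
        "name = html.escape(n)",
    ),
]

examples = find_key_examples(CORPUS, BRIDGING_PREDICATES)
prompt = build_prompt("Reflected XSS", "page.write(user_input)", examples)
```

In this sketch the snippet that satisfies no predicate is never shown to the model, which mirrors the abstract's claim that predicates, rather than textual similarity, decide which examples are relevant. The ranking by number of satisfied predicates is an assumption of the sketch, not a description of the paper's prioritization.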
