RefleXGen: The unexamined code is not worth using

Bin Wang, Hui Li, AoFan Liu, BoTao Yang, Ao Yang, YiLu Zhong, Weixiang Huang, Yanping Zhang, Runhuai Huang, Weimin Zeng

TL;DR

This work tackles the security of AI-generated code by introducing RefleXGen, a self-reflective framework that leverages Retrieval-Augmented Generation to iteratively improve code safety without fine-tuning or dataset creation. By maintaining a dynamic security knowledge base built from the model's reflections and secure snippets, RefleXGen guides subsequent code generation cycles toward safer outputs. Experimental results across GPT-3.5 Turbo, GPT-4o, CodeQwen, and Gemini show meaningful improvements in code security, demonstrating the practicality of self-reflection as a resource-efficient strategy for secure code generation. The approach highlights the potential of reflective mechanisms to autonomously enhance code safety in real-world deployments.

Abstract

Security in code generation remains a pivotal challenge when applying large language models (LLMs). This paper introduces RefleXGen, an innovative method that significantly enhances code security by integrating Retrieval-Augmented Generation (RAG) techniques with guided self-reflection mechanisms inherent in LLMs. Unlike traditional approaches that rely on fine-tuning LLMs or developing specialized secure code datasets - processes that can be resource-intensive - RefleXGen iteratively optimizes the code generation process through self-assessment and reflection without the need for extensive resources. Within this framework, the model continuously accumulates and refines its knowledge base, thereby progressively improving the security of the generated code. Experimental results demonstrate that RefleXGen substantially enhances code security across multiple models, achieving a 13.6% improvement with GPT-3.5 Turbo, a 6.7% improvement with GPT-4o, a 4.5% improvement with CodeQwen, and a 5.8% improvement with Gemini. Our findings highlight that improving the quality of model self-reflection constitutes an effective and practical strategy for strengthening the security of AI-generated code.
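The retrieval step mentioned in the abstract can be pictured with a small sketch. This is not the authors' implementation: the entry schema (`KnowledgeEntry`), the word-overlap ranking, and the prompt layout in `build_prompt` are all hypothetical choices made here for illustration, and a real system would more likely rank entries by embedding similarity.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeEntry:
    """One stored reflection and the secure snippet it produced (hypothetical schema)."""
    task: str
    reflection: str
    secure_code: str


@dataclass
class SecurityKnowledgeBase:
    """Toy security knowledge base; ranking by word overlap is an assumption."""
    entries: list = field(default_factory=list)

    def add(self, entry):
        self.entries.append(entry)

    def retrieve(self, query, k=2):
        """Return up to k entries whose task text shares the most words with the query."""
        query_words = set(query.lower().split())
        scored = []
        for entry in self.entries:
            overlap = len(query_words & set(entry.task.lower().split()))
            if overlap > 0:
                scored.append((overlap, entry))
        # Highest overlap first; ties keep insertion order because sort is stable.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for _, entry in scored[:k]]


def build_prompt(task, kb, k=2):
    """Prepend retrieved reflections and secure snippets to the task description."""
    lines = []
    for entry in kb.retrieve(task, k=k):
        lines.append("Past reflection: " + entry.reflection)
        lines.append("Secure example: " + entry.secure_code)
    lines.append("Task: " + task)
    return "\n".join(lines)


kb = SecurityKnowledgeBase()
kb.add(KnowledgeEntry(
    task="build a sql query from user input",
    reflection="use parameterized queries instead of string formatting",
    secure_code='cursor.execute("SELECT * FROM users WHERE name = ?", (name,))',
))
kb.add(KnowledgeEntry(
    task="hash a password before storing it",
    reflection="use a salted key-derivation function rather than a bare hash",
    secure_code="hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)",
))
prompt = build_prompt("run a sql query for a user", kb, k=1)
```

Because the knowledge base is plain data outside the model, it can grow between generation cycles without changing model weights, which is the property the abstract contrasts with fine-tuning.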

Paper Structure

This paper contains 9 sections, 5 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: The diagram shows the RefleXGen workflow in three stages: ① Initial Code Generation, ② Knowledge-Driven Security Feedback, and ③ Defect Fixing and Knowledge Integration. The process begins by generating initial code. If the model, on reviewing that code, finds security deficiencies, it triggers stage ②, in which it reflects on the code and optimizes it to fix the vulnerabilities. As secure code is produced over repeated cycles, the insights from this reflection are integrated into the security knowledge base, supporting continual improvement.
  • Figure 2: Sec. Rate difference among cases of RefleXGen
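The three stages in the Figure 1 caption can be sketched as a simple control loop. This is a minimal illustration under stated assumptions, not the paper's code: `reflexgen_cycle`, its callback signatures, the `max_rounds` limit, and the rule of storing a fix only when the final code passes reflection are all hypothetical, and the `toy_*` functions are string-matching stand-ins for LLM calls.

```python
def reflexgen_cycle(task, generate, reflect, repair, knowledge, max_rounds=3):
    """Run one task through the three stages from Figure 1 (hypothetical interface).

    generate(task, hints) -> code string
    reflect(code, hints)  -> list of issue strings (empty means none found)
    repair(code, issues)  -> revised code string
    knowledge             -> list of (task, issues, fixed_code), mutated in place
    """
    hints = list(knowledge)
    code = generate(task, hints)              # stage 1: initial code generation
    seen_issues = []
    for _ in range(max_rounds):
        issues = reflect(code, hints)         # stage 2: security feedback
        if not issues:
            break
        seen_issues.extend(issues)
        code = repair(code, issues)           # stage 3: defect fixing
    # Stage 3, knowledge integration: store the fix only if something was
    # repaired and the final code passes reflection (an assumption made here).
    if seen_issues and not reflect(code, hints):
        knowledge.append((task, seen_issues, code))
    return code


# Toy stand-ins for model calls, for illustration only.
def toy_generate(task, hints):
    # Reuse a stored fix when the same task was seen before.
    for past_task, _issues, fixed in hints:
        if past_task == task:
            return fixed
    return 'cursor.execute("SELECT * FROM users WHERE name = \'%s\'" % name)'


def toy_reflect(code, hints):
    return ["SQL built with string formatting"] if "% name" in code else []


def toy_repair(code, issues):
    return 'cursor.execute("SELECT * FROM users WHERE name = ?", (name,))'


knowledge = []
first = reflexgen_cycle("look up a user by name", toy_generate, toy_reflect,
                        toy_repair, knowledge)
second = reflexgen_cycle("look up a user by name", toy_generate, toy_reflect,
                         toy_repair, knowledge)
```

In this toy run the first call repairs the query and records the fix, while the second call reuses the stored fix at stage 1 and skips stages 2 and 3, which mirrors the cyclical improvement the caption describes.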