SCAFFOLD-CEGIS: Preventing Latent Security Degradation in LLM-Driven Iterative Code Refinement

Yi Chen, Yun Bian, Haiquan Wang, Shihao Li, Zhe Cui

TL;DR

This paper proposes SCAFFOLD-CEGIS, a multi-agent collaborative framework that turns security constraints from implicit prompts into explicit, verifiable constraints. It automatically identifies security-critical elements and solidifies them as hard constraints through semantic anchoring, enforces safety monotonicity through four-layer gated verification, and continuously assimilates experience from failures.

Abstract

The application of large language models to code generation has evolved from one-shot generation to iterative refinement, yet the evolution of security throughout iteration remains insufficiently understood. Through comparative experiments on three mainstream LLMs, this paper reveals the iterative refinement paradox: specification drift during multi-objective optimization causes security to degrade gradually over successive iterations. Taking GPT-4o as an example, 43.7% of iteration chains contain more vulnerabilities than the baseline after ten rounds, and cross-model experiments show that this phenomenon is prevalent. Further analysis shows that simply introducing static application security testing (SAST) gating cannot effectively suppress degradation; instead, it increases the latent security degradation rate from 12.5% under the unprotected baseline to 20.8%. The root cause is that static-analysis rules cannot cover structural degradations such as the removal of defensive logic or the weakening of exception handling. To address this problem, we propose the SCAFFOLD-CEGIS framework. Drawing on the counterexample-guided inductive synthesis (CEGIS) paradigm, the framework adopts a multi-agent collaborative architecture that transforms security constraints from implicit prompts into explicit verifiable constraints. It automatically identifies and solidifies security-critical elements as hard constraints through semantic anchoring, enforces safety monotonicity through four-layer gated verification, and continuously assimilates experience from failures. Comparative experiments against six existing defense methods show that the full framework reduces the latent security degradation rate to 2.1% and achieves a safety monotonicity rate of 100%.
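To make the chain-level statistics above concrete, the following is a minimal sketch of how such rates could be computed from per-iteration vulnerability counts. The definitions are assumptions for illustration only: a chain is treated as "degraded" when its final round has more vulnerabilities than its baseline (round 0), and as "monotone" when the count never rises between consecutive rounds. The function names and the sample data are hypothetical and do not come from the paper.

```python
from typing import Callable, Sequence


def ends_worse_than_baseline(chain: Sequence[int]) -> bool:
    """True if the last round has more vulnerabilities than round 0 (assumed definition)."""
    return chain[-1] > chain[0]


def is_monotone_safe(chain: Sequence[int]) -> bool:
    """True if the vulnerability count never increases between consecutive rounds (assumed definition)."""
    return all(later <= earlier for earlier, later in zip(chain, chain[1:]))


def rate(chains: Sequence[Sequence[int]], predicate: Callable[[Sequence[int]], bool]) -> float:
    """Fraction of chains that satisfy the predicate."""
    return sum(1 for chain in chains if predicate(chain)) / len(chains)


# Made-up vulnerability counts per round for four iteration chains.
chains = [
    [2, 2, 1, 1],  # improves, never regresses
    [1, 1, 2, 3],  # ends worse than its baseline
    [0, 1, 0, 0],  # regresses once, then recovers
    [3, 2, 2, 2],  # improves, never regresses
]

degraded_rate = rate(chains, ends_worse_than_baseline)
monotone_rate = rate(chains, is_monotone_safe)
```

The third chain shows why the two measures differ: it ends no worse than its baseline, yet it is not monotone because one intermediate round introduced a vulnerability.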

Paper Structure (24 sections, 9 equations, 3 figures, 6 tables, 1 algorithm)

Figures (3)

  • Figure 1: Overview of the SCAFFOLD-CEGIS architecture. The system follows a CEGIS-style workflow with four collaborative agents: (1) SecurityArchitectAgent mines semantic anchors from code to construct the security specification $\Phi$; (2) ImplementerAgent generates candidate code $P'$ under anchor constraints; (3) GatekeeperAgent validates candidates with four-layer gating; (4) AssimilatorAgent extracts reusable feedback from failures.
  • Figure 2: Vulnerability trajectories under the iterative refinement paradox. (a) Growth in vulnerabilities detected by static analysis, where feature-enhancement prompts degrade most severely; (b) issues detected by LLM semantic review, with counts substantially higher than static-analysis findings.
  • Figure 3: Representative cases of SAST coverage blind spots. Each row shows one code-transition scenario: safe baseline code (left), unsafe modification induced by user request (middle), and outcomes under different detection methods (right). Semgrep does not detect these degradations, while SCAFFOLD-CEGIS prevents them via semantic anchors.
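The four-agent workflow described in the Figure 1 caption can be sketched as a small counterexample-guided loop. This is an illustrative toy under stated assumptions, not the authors' implementation: anchors are modeled as simple substring predicates rather than LLM-mined semantic anchors, the four-layer gating is collapsed into a single anchor check, rejected candidates fall back to the last accepted code, and every name here (`Anchor`, `Spec`, `mine_anchors`, `gatekeeper`, `refine`, `toy_implementer`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Anchor:
    """A security-critical element that every accepted candidate must preserve."""
    name: str
    holds: Callable[[str], bool]


@dataclass
class Spec:
    """Security specification: hard anchors plus lessons learned from rejected candidates."""
    anchors: List[Anchor]
    lessons: List[str] = field(default_factory=list)


def mine_anchors(code: str) -> Spec:
    """Stand-in for SecurityArchitectAgent (assumption: keyword matching, not semantic analysis)."""
    markers = {"input_validation": "validate(", "error_handling": "except "}
    anchors = [
        Anchor(name, lambda candidate, m=marker: m in candidate)
        for name, marker in markers.items()
        if marker in code
    ]
    return Spec(anchors)


def gatekeeper(candidate: str, spec: Spec) -> List[str]:
    """Stand-in for GatekeeperAgent: returns the names of violated anchors (the counterexample)."""
    return [anchor.name for anchor in spec.anchors if not anchor.holds(candidate)]


def refine(code: str, implementer: Callable[[str, Spec], str], max_rounds: int = 3) -> str:
    """CEGIS-style loop: propose, verify, feed the counterexample back, retry."""
    spec = mine_anchors(code)
    for _ in range(max_rounds):
        candidate = implementer(code, spec)
        violations = gatekeeper(candidate, spec)
        if not violations:
            return candidate
        # Stand-in for AssimilatorAgent: turn the failure into reusable feedback.
        spec.lessons.extend(f"preserve {name}" for name in violations)
    return code  # no candidate passed: keep the last accepted version


BASELINE = (
    "def handler(x):\n"
    "    validate(x)\n"
    "    try:\n"
    "        return run(x)\n"
    "    except ValueError:\n"
    "        return None\n"
)


def toy_implementer(code: str, spec: Spec) -> str:
    """Toy ImplementerAgent: drops validation on its first try, keeps it once a lesson exists."""
    if not spec.lessons:
        return code.replace("    validate(x)\n", "") + "# added feature\n"
    return code + "# added feature\n"


result = refine(BASELINE, toy_implementer)
```

In this toy run the first candidate removes the `validate(x)` call, the gate rejects it, and the recorded lesson steers the second candidate to keep the defensive logic. Falling back to the original code when no candidate passes is one simple way to guarantee that an accepted version never loses an anchor.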