CREME: Robustness Enhancement of Code LLMs via Layer-Aware Model Editing

Shuhan Liu, Xing Hu, Kerui Huang, Xiaohu Yang, David Lo, Xin Xia

TL;DR

CREME tackles the instability of code LLMs under natural prompt perturbations by introducing a lightweight, layer-aware model-editing framework. It uses causal tracing to locate robustness-sensitive layers and then performs targeted editing on the MLP output projection to align perturbed prompts with the original prompt’s robust behavior, guided by a joint alignment-preservation loss. The approach yields a 63% improvement in Pass@1 on perturbed prompts with negligible impact on clean inputs and introduces G-RIR as a metric for cross-perturbation generalization. Results across HumanEval and MBPP with CodeLlama and QwenCoder show strong robustness gains and insights into where robustness-related processing occurs within Transformer architectures. CREME thus offers a practical path to more reliable code generation without full retraining or architectural changes, and informs future robustness-focused editing strategies.

Abstract

Large language models (LLMs) have demonstrated impressive capabilities in code generation, where the natural language prompt plays a crucial role in conveying user intent to the model. However, prior studies have shown that LLMs are highly sensitive to prompt perturbations. Minor modifications in wording, syntax, or formatting can significantly reduce the functional correctness of generated code. As perturbations frequently occur in real-world scenarios, improving the robustness of LLMs to prompt perturbations is essential for ensuring reliable performance in practical code generation. In this paper, we introduce CREME (Code Robustness Enhancement via Model Editing), a novel approach that enhances LLM robustness through targeted parameter updates. CREME first identifies robustness-sensitive layers by comparing hidden states between an original prompt and its perturbed variant. Then, it performs lightweight parameter editing at the identified layer to reduce performance degradation. We evaluate CREME on two widely used code generation benchmarks (HumanEval and MBPP) along with their perturbed counterparts. Experimental results show that CREME improves Pass@1 accuracy by 63% on perturbed prompts while maintaining stable performance on clean inputs, with accuracy deviations within 1%. Further analysis reveals that robustness-sensitive layers are primarily concentrated in the middle and deeper layers of the network, and their locations vary across different model architectures. These insights provide a valuable foundation for developing future robustness-oriented editing strategies.
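The layer-identification step described in the abstract can be sketched as a loop over layers that scores each one by how much it restores performance on the perturbed prompt. This is a minimal sketch, not the authors' implementation: `evaluate_with_patch` is a hypothetical callback standing in for "run the model on the perturbed prompt with one layer's hidden states taken from the original prompt, then score the output", and the numbers in the usage lines are made up for illustration.

```python
from typing import Callable, Dict, Tuple


def locate_key_layer(
    num_layers: int,
    evaluate_with_patch: Callable[[int], float],
    perturbed_score: float,
) -> Tuple[int, Dict[int, float]]:
    """Return the layer whose patching recovers the most score, plus all gains."""
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    gains: Dict[int, float] = {}
    for layer in range(num_layers):
        # Score when only this layer's hidden states come from the original prompt
        # (hypothetical callback; the real scoring procedure is not specified here).
        patched_score = evaluate_with_patch(layer)
        gains[layer] = patched_score - perturbed_score
    # max() keeps the first maximum, so ties resolve to the lowest layer index.
    key_layer = max(gains, key=gains.get)
    return key_layer, gains


# Illustrative, made-up scores for a 6-layer toy model (not results from the paper).
toy_scores = {0: 0.40, 1: 0.42, 2: 0.55, 3: 0.70, 4: 0.61, 5: 0.45}
key_layer, gains = locate_key_layer(6, toy_scores.__getitem__, perturbed_score=0.40)
print(key_layer)
```

Passing the evaluation as a callback keeps the selection logic independent of any particular model or hooking mechanism, which would have to be supplied separately.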

Paper Structure

This paper contains 33 sections, 11 equations, 3 figures, 9 tables.

Figures (3)

  • Figure 1: Example of Code Generation using Original and Perturbed Prompts
  • Figure 2: CREME framework: ❶ (left) A slight perturbation is inserted into the original prompt. ❷ (middle) CREME identifies robustness-sensitive key layers by replacing each layer’s hidden states with those from the original prompt and measuring how much the pass rate recovers. ❸ (right) The key layer is fine-tuned with two objectives: a preservation loss (Loss1), which retains behavior on clean inputs, and an alignment loss (Loss2), which enforces consistency between original and perturbed prompts.
  • Figure 3: Key Layer Distribution by Perturbation Type Group
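
The two objectives named in the Figure 2 caption can be illustrated with a toy joint loss. This is a sketch under stated assumptions rather than the paper's formulation: it assumes both terms are mean squared errors over hidden-state vectors, that the alignment target is the unedited model's state for the original prompt, and that the terms are combined with a hypothetical weight `lam`. None of these choices is confirmed by the text above.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    clean_before: Sequence[float],     # unedited model, original prompt
    clean_after: Sequence[float],      # edited model, original prompt
    perturbed_after: Sequence[float],  # edited model, perturbed prompt
    lam: float = 1.0,                  # hypothetical weighting, not from the paper
) -> float:
    # Loss1 (preservation): the edit should not change behavior on clean input.
    preservation = mse(clean_after, clean_before)
    # Loss2 (alignment): the perturbed prompt should look like the original one.
    alignment = mse(perturbed_after, clean_before)
    return preservation + lam * alignment


# Toy vectors chosen so the arithmetic is easy to follow.
loss = joint_loss(
    clean_before=[1.0, 2.0],
    clean_after=[1.0, 2.0],
    perturbed_after=[1.0, 4.0],
    lam=0.5,
)
print(loss)
```

In this toy case the preservation term is zero because the clean representation is unchanged, so the whole loss comes from the weighted alignment term.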