A Semantic-based Optimization Approach for Repairing LLMs: Case Study on Code Generation
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
TL;DR
This paper reframes LM repair as a semantic-based optimization that targets knowledge neurons to fix code-generation failures with minimal data and computation. It introduces Semantic Targeting for Analytical Repair (STAR), which locates buggy neurons via attribution, computes semantic patches through a semantic-basis and least-squares framework, and patches neurons with a gradient-informed, prior-guided optimizer. Across multiple code-generation benchmarks and open-code LMs, STAR is more effective and more efficient than MINT and SGD and causes fewer side effects, especially when it uses layer-wise sparsity patterns. The work advances practical LM repair by using latent-space semantics to steer representations toward ground-truth semantics, and it provides empirical insights into sparsity, generalization, and cumulative repair effects on large code models.
Abstract
Language Models (LMs) are widely used in software engineering for code generation, but they may produce erroneous code. Rather than repairing outputs, a more thorough remedy is to address the underlying model failures. LM repair offers a lightweight solution: it requires minimal data, lowers computational cost, and limits side effects. Unlike full retraining, LM repair applies tailored updates to targeted neurons, which makes it suitable for settings with limited resources, high performance demands, or strict safety requirements. In this paper, we propose Semantic Targeting for Analytical Repair (STAR), a novel semantic-based optimization method for repairing LLMs. STAR realizes the main operations of LM repair within an optimization process: locating "buggy neurons", solving "neuron patches", and patching "buggy neurons". The neuron patches are computed with a semantic-based analytical formula that directly relates changes in logits to the deltas of neurons by steering latent representations. Compared with the prior work on LM repair (MINT) and standard optimization methods (SGD), STAR integrates their strengths while mitigating their limitations. Because it reformulates LM repair as an optimization process, STAR can solve multiple failures together, which significantly improves its practical usefulness. Evaluated on coding tasks with popular code LMs, STAR is more effective than the state of the art and also more efficient. In terms of side effects, namely the balance between generalization and specificity, STAR outperforms prior work by a significant margin. Additionally, we assess the overfitting risk of LM repair as well as its cumulative impact. Finally, we analyze how STAR differs from pipeline-based methods and explain why it performs better and how it mitigates the common limitations of LM repair.
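
To make the idea of relating changes in logits to neuron deltas more concrete, the following is a minimal sketch and not the STAR formula itself, which the abstract does not spell out. It assumes a single linear layer feeding an unembedding matrix, takes "one-hot target minus current probabilities" as the desired logit change, solves for the latent delta by least squares, and maps that delta back to a minimum-norm rank-one weight update. All names (latent_patch, weight_patch, unembed, and so on) and all shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def latent_patch(unembed, hidden, target_id, step=1.0):
    """Least-squares latent delta that pushes logits toward the target token.

    Assumption: the desired logit change is step * (one_hot - probs).
    """
    probs = softmax(unembed @ hidden)
    one_hot = np.zeros_like(probs)
    one_hot[target_id] = 1.0
    delta_logits = step * (one_hot - probs)
    # Solve min ||unembed @ delta_hidden - delta_logits||_2.
    delta_hidden, *_ = np.linalg.lstsq(unembed, delta_logits, rcond=None)
    return delta_hidden

def weight_patch(activation, delta_hidden):
    """Minimum-norm rank-one weight delta dW such that dW @ activation == delta_hidden."""
    return np.outer(delta_hidden, activation) / (activation @ activation)

# Toy setup with random values (hypothetical sizes, not taken from the paper).
rng = np.random.default_rng(0)
vocab, d_model, d_in = 50, 8, 16
unembed = rng.normal(size=(vocab, d_model))   # maps latent state to logits
weight = rng.normal(size=(d_model, d_in))     # the layer being "repaired"
activation = rng.normal(size=d_in)            # input to that layer
hidden = weight @ activation
target_id = 3                                 # index of the ground-truth token

delta_hidden = latent_patch(unembed, hidden, target_id)
delta_weight = weight_patch(activation, delta_hidden)

p_before = softmax(unembed @ hidden)[target_id]
p_after = softmax(unembed @ ((weight + delta_weight) @ activation))[target_id]
print(f"target probability before: {p_before:.4f}, after: {p_after:.4f}")
```

In this toy setting the patched layer reproduces the steered latent state exactly for the given input, and the probability of the target token goes up. A real repair would additionally need to select which neurons to patch and to control side effects on other inputs, which this sketch does not attempt.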
