Intention is All You Need: Refining Your Code from Your Intention
Qi Guo, Xiaofei Xie, Shangqing Liu, Ming Hu, Xiaohong Li, Lei Bu
TL;DR
This work addresses inefficient, error-prone code refinement with an intention-based framework that splits the task into two steps: extracting the reviewer's intention from review comments, and generating the revised code guided by that intention. A hybrid classifier assigns each comment to one of three intention categories (Explicit, Reversion, and General), and each category is handled by category-specific generation with varied prompting strategies. The approach achieves higher accuracy than end-to-end baselines across multiple LLMs. Key findings show 79% accuracy in intention extraction and up to 66% accuracy in revised code generation, with high-quality intentions and retrieval-augmented prompts driving the largest gains. A dataset-cleaning evaluation further shows that intention-guided verification improves data quality, underscoring the practical value of intention-driven code refinement for scalable software development.
Abstract
Code refinement aims to enhance existing code by addressing issues, refactoring, and optimizing to improve quality and meet specific requirements. As software projects scale in size and complexity, the traditional iterative exchange between reviewers and developers becomes increasingly burdensome. While recent deep learning techniques have been explored to accelerate this process, their performance remains limited, primarily because they struggle to accurately understand reviewers' intentions. This paper proposes an intention-based code refinement technique that enhances the conventional comment-to-code process by explicitly extracting reviewer intentions from the comments. Our approach consists of two key phases: Intention Extraction and Intention Guided Revision Generation. Intention Extraction categorizes comments using predefined templates, while Intention Guided Revision Generation employs large language models (LLMs) to generate revised code based on these defined intentions. We design three categories with eight subcategories for comment transformation, and classify comments with a hybrid approach that combines rule-based and LLM-based classifiers. Extensive experiments with five LLMs (GPT4o, GPT3.5, DeepSeekV2, DeepSeek7B, CodeQwen7B) under different prompting settings demonstrate that our approach achieves 79% accuracy in intention extraction and up to 66% accuracy in code refinement generation. Our results highlight the potential of our approach in enhancing data quality and improving the efficiency of code refinement.
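To make the hybrid classification step concrete, the following is a minimal Python sketch of one plausible way to combine a rule-based classifier with an LLM-based fallback. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the regular expressions, the function names (`rule_based_classify`, `classify_intention`), the rules-first ordering, and the `llm_classify` callable are all hypothetical. Only the three category names come from the text above.

```python
import re
from typing import Callable, Optional

# The three top-level intention categories named in the text.
CATEGORIES = ("Explicit", "Reversion", "General")

# Hypothetical patterns for illustration only; these are not the paper's rules.
_REVERSION = re.compile(r"\b(revert|undo|restore|roll\s*back)\b", re.IGNORECASE)
_EXPLICIT = re.compile(r"`[^`]+`|\b(rename|replace|change)\b.+\bto\b", re.IGNORECASE)


def rule_based_classify(comment: str) -> Optional[str]:
    """Return a category when a hand-written rule fires, otherwise None."""
    if _REVERSION.search(comment):
        return "Reversion"
    if _EXPLICIT.search(comment):
        return "Explicit"
    return None


def classify_intention(comment: str, llm_classify: Callable[[str], str]) -> str:
    """Try the rules first; fall back to an LLM-backed classifier (assumed ordering).

    `llm_classify` is a placeholder for any function that sends the comment
    to an LLM and returns a category name as a string.
    """
    label = rule_based_classify(comment)
    if label is None:
        label = llm_classify(comment)
    # Treat any label outside the known set as the catch-all category.
    return label if label in CATEGORIES else "General"
```

In this sketch, a comment such as "Please revert this change" is resolved by the rules alone, while a comment that matches no rule is passed to whatever LLM-backed function the caller supplies. The resulting category could then select a category-specific prompt for revision generation.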
