Knowledge-Guided Prompt Learning for Request Quality Assurance in Public Code Review
Lin Li, Xinchun Yu, Xinyu Chen, Peng Liang
TL;DR
Public Code Review (PCR) tasks have traditionally been tackled from the reviewer side, leaving developer-facing quality assurance underexplored. KP-PCR reframes two PCR subtasks, request necessity prediction and tag recommendation, into a unified prompt-learning framework that combines text prompts, program dependence graph (PDG) code representations, and knowledge-guided soft prefixes from a fine-tuned knowledge model (DeepSeek). The method jointly optimizes a masked-language-model objective across [MASK] positions and uses an answer-engineering module to map predictions to final labels, achieving improvements of 2.3%-8.4% in necessity prediction and 1.4%-6.9% in tag recommendation over strong baselines, while maintaining efficiency with $O(n^2)$ time complexity dominated by code-graph construction. Ablation studies show that PDG-based code representations and knowledge guidance are crucial, and that fine-tuned LLM baselines lag behind KP-PCR, underscoring the value of task-specific prompt learning for domain knowledge in PCR. The approach offers a practical pathway to scalable, developer-centered PCR assistance, with publicly available code and broader implications for domain-specific prompt engineering in software engineering tasks.
Abstract
Public Code Review (PCR) is developed in the Software Question Answering (SQA) community, assisting developers in exploring high-quality and efficient review services. Current methods on PCR mainly focus on the reviewer's perspective, including finding a capable reviewer, predicting comment quality, and recommending or generating review comments. However, little work has studied how to meet the necessity requirements of review requests posted by developers, which can increase the requests' visibility and in turn serves as a prerequisite for better review responses. To this end, we propose Knowledge-guided Prompt learning for Public Code Review (KP-PCR) to achieve developer-based code review request quality assurance (i.e., the request necessity prediction and tag recommendation subtasks). Specifically, we reformulate the two subtasks via 1) text prompt tuning, which converts both of them into a Masked Language Model (MLM) task by constructing prompt templates using hard prompts; and 2) knowledge and code prefix tuning, which introduces knowledge guidance from fine-tuned large language models via soft prompts and uses a program dependence graph to characterize code snippets. Finally, both the request necessity prediction and tag recommendation subtasks output their predicted results through an answer engineering module. In addition, we analyze the time complexity of KP-PCR, whose knowledge guidance is introduced through a lightweight prefix-based operation. Experimental results on the PCR dataset for the period 2011-2023 demonstrate that KP-PCR outperforms baselines by 2.3%-8.4% in request necessity prediction and by 1.4%-6.9% in tag recommendation. The code implementation is released at https://github.com/WUT-IDEA/KP-PCR
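To make the reformulation concrete, the following is a minimal sketch of how a hard prompt template with [MASK] slots and an answer engineering step might fit together. It is not the released implementation: the template wording, the binary label set, the label words, and the names `build_prompt`, `map_answer`, and `NECESSITY_VERBALIZER` are all illustrative assumptions, and the toy scores stand in for the [MASK] probabilities an MLM would produce.

```python
from typing import Dict, List

MASK = "[MASK]"


def build_prompt(title: str, body: str, num_tag_slots: int = 2) -> str:
    """Wrap a review request in a hard prompt with one necessity slot
    and `num_tag_slots` tag slots (template wording is hypothetical)."""
    tag_slots = " ".join([MASK] * num_tag_slots)
    return (
        f"Title: {title} Body: {body} "
        f"This request is {MASK}. "
        f"Its tags are {tag_slots}."
    )


# Hypothetical verbalizer: each final label is tied to a few label words.
NECESSITY_VERBALIZER: Dict[str, List[str]] = {
    "necessary": ["good", "useful"],
    "unnecessary": ["bad", "useless"],
}


def map_answer(mask_scores: Dict[str, float],
               verbalizer: Dict[str, List[str]]) -> str:
    """Answer engineering: average the [MASK] scores of each label's
    words and return the label with the highest average."""
    label_scores = {
        label: sum(mask_scores.get(word, 0.0) for word in words) / len(words)
        for label, words in verbalizer.items()
    }
    return max(label_scores, key=label_scores.get)


# Toy usage: the scores below are made up, not model output.
prompt = build_prompt("Fast CSV parser", "Please review my parser.")
toy_scores = {"good": 0.5, "useful": 0.25, "bad": 0.125, "useless": 0.0}
label = map_answer(toy_scores, NECESSITY_VERBALIZER)
```

In this sketch both subtasks share one prompt, so a single MLM forward pass could fill every [MASK] position; the knowledge-guided and PDG-based soft prefixes described above would be added in the embedding space and are not modeled here.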
