
Prompt-based Code Completion via Multi-Retrieval Augmented Generation

Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

TL;DR

ProCC addresses the limitations of single-perspective retrieval in code completion by introducing a prompt-based multi-retriever system and an adaptive LinUCB-based retrieval selector to choose among lexical, hypothetical, and summarization perspectives. By leveraging prompts to obtain diverse semantic representations without training new encoders, ProCC enhances retrieval coverage and reduces reliance on costly fine-tuning. Empirical results show substantial improvements in Exact Match on both open-source and private-domain benchmarks, and additive gains when augmenting fine-tuned models, while maintaining computational efficiency. This approach offers a practical, scalable path to more coherent, context-aware code completion in real-world settings.

Abstract

Automated code completion, which aims to generate subsequent tokens from unfinished code, has benefited significantly from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) techniques partially address these issues by retrieving relevant code with a separate encoding model, where the retrieved snippet serves as a contextual reference for code completion. However, their retrieval scope is limited to the single perspective defined by the encoding model, which largely overlooks the complexity and diversity inherent in code semantics. To address this limitation, we propose ProCC, a code completion framework that leverages prompt engineering and a contextual multi-armed bandit algorithm to flexibly incorporate and adapt to multiple perspectives of code. ProCC first employs a prompt-based multi-retriever system, which crafts prompt templates to elicit LLM knowledge and capture code semantics from multiple retrieval perspectives. It then adopts an adaptive retrieval selection algorithm that incorporates code similarity into the decision-making process to determine the most suitable retrieval perspective for the LLM to complete the code. Experimental results demonstrate that ProCC outperforms the state-of-the-art code completion technique by 8.6% on our collected open-source benchmark suite and by 10.1% on the private-domain benchmark suite collected from a billion-user e-commerce company, in terms of Exact Match. ProCC also allows augmenting fine-tuned techniques in a plug-and-play manner, yielding a 5.6% improvement over our studied fine-tuned model.
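
To make the adaptive selection step more concrete, the following is a minimal sketch of a disjoint LinUCB selector that picks one of the three retrieval perspectives from a context vector. It is not the authors' implementation: the class names, the contents of the context vector `x` (assumed here to be a small list of code-similarity features), the binary reward, and the exploration weight `alpha` are all illustrative assumptions.

```python
import math

# Arm names follow the three perspectives named in the summary.
PERSPECTIVES = ("lexical", "hypothetical", "summarization")


class LinUCBArm:
    """One arm of disjoint LinUCB: a ridge-regression reward model (hypothetical helper)."""

    def __init__(self, dim):
        # A starts as the identity matrix, so A^{-1} is the identity as well.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def ucb(self, x, alpha):
        # Estimated reward theta^T x plus an exploration bonus alpha * sqrt(x^T A^{-1} x).
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = sum(xi * v for xi, v in zip(x, a_inv_x))
        return mean + alpha * math.sqrt(max(variance, 0.0))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^{-1} after A += x x^T.
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


class RetrievalSelector:
    """Chooses a retrieval perspective for a given context vector."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha
        self.arms = {name: LinUCBArm(dim) for name in PERSPECTIVES}

    def select(self, x):
        return max(PERSPECTIVES, key=lambda name: self.arms[name].ucb(x, self.alpha))

    def update(self, name, x, reward):
        self.arms[name].update(x, reward)
```

In use, one would call `select(x)` for each unfinished snippet, complete the code with the retrieval from the chosen perspective, and then call `update` with a reward such as 1.0 for an exact match and 0.0 otherwise; that reward definition is an assumption made for this sketch.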

Paper Structure (37 sections, 6 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Code completion scenarios demonstrating that the optimal retrieval depends on context. Red indicates misleading information; green indicates a helpful hint.
  • Figure 2: The ProCC framework. The prompt-based multi-retriever system encodes the lexical semantics ①, hypothetical line ②, and code summarization ③ to derive multi-perspective representations. The adaptive retrieval selection algorithm ④ makes decisions based on code semantics similarities and selects the optimal context from retrievals. Finally, the selected code is concatenated with the unfinished code for augmented generation ⑤.
  • Figure 3: Venn diagram of different retrievers. It shows the number of samples that are completed correctly in the open-source benchmark.
  • Figure 4: Fine-tuning vs. ProCC
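
The flow described in the Figure 2 caption (build three perspective queries, retrieve one candidate per perspective, select one, and concatenate it with the unfinished code) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's pipeline: the prompt strings are invented placeholders, token-overlap (Jaccard) similarity stands in for the prompt-derived representations, the `llm` and `choose` callables are hypothetical stand-ins for the language model and the adaptive selector, and placing the retrieved snippet before the unfinished code is an assumed ordering.

```python
from typing import Callable, Dict, List, Set


def tokens(code: str) -> Set[str]:
    # Crude whitespace tokenizer; a stand-in for a learned representation.
    return set(code.replace("(", " ").replace(")", " ").split())


def jaccard(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_queries(unfinished: str, llm: Callable[[str], str]) -> Dict[str, str]:
    # One query per perspective; the prompt wording is a placeholder.
    return {
        "lexical": unfinished,
        "hypothetical": llm("Write the next line of this code:\n" + unfinished),
        "summarization": llm("Summarize this code in one sentence:\n" + unfinished),
    }


def retrieve(query: str, corpus: List[str]) -> str:
    # Return the corpus snippet most similar to the query.
    return max(corpus, key=lambda snippet: jaccard(query, snippet))


def augmented_prompt(
    unfinished: str,
    corpus: List[str],
    llm: Callable[[str], str],
    choose: Callable[[Dict[str, str]], str],
) -> str:
    queries = build_queries(unfinished, llm)
    candidates = {name: retrieve(query, corpus) for name, query in queries.items()}
    chosen = choose(candidates)  # stand-in for the adaptive retrieval selection
    # Assumed ordering: retrieved context first, then the unfinished code.
    return candidates[chosen] + "\n" + unfinished
```

The returned string would then be passed to the completion model; swapping `choose` for a bandit-based selector changes only which candidate is concatenated, not the rest of the flow.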