Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification

Aofan Liu, Shiyuan Song, Haoxuan Li, Cehao Yang, Yiyan Qi

TL;DR

This work targets the gap in repository-level code retrieval under change requests by introducing RepoAlign-Bench, a 52k-instance benchmark spanning direct-function to cross-component modifications, and ReflectCode, a reflection-augmented dual-tower model with LLM-guided verification. ReflectCode combines disentangled code and doc encoders, an adaptive-margin triplet loss, and an adversarial verification loop that refines the top-k candidates, improving Top-5 accuracy by 12.2% and recall by 7.1% over state-of-the-art baselines. The approach uses AST-based context, cross-component dependency modeling, and dynamic negative mining to better capture repository-wide intents, which translates to practical gains in code maintenance tasks. The work also discusses latency concerns and future directions: broader language coverage, semantics-aware dependency modeling, and deployment-ready optimizations for IDE integration.
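To make the adaptive-margin triplet loss concrete, the sketch below computes a triplet loss whose margin varies per training triple. The summary does not give the paper's margin rule, so the rule used here (a larger margin when the negative is less similar to the positive) is an assumption for illustration only, as are the names `adaptive_margin`, `base_margin`, and `scale`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adaptive_margin(positive, negative, base_margin=0.2, scale=0.3):
    # ASSUMPTION (not taken from the paper): the margin grows as the negative
    # becomes less similar to the positive, so hard negatives that closely
    # resemble the target are separated by a smaller required gap.
    return base_margin + scale * (1.0 - cosine(positive, negative))


def triplet_loss(query, positive, negative, base_margin=0.2, scale=0.3):
    """Hinge loss: the query must score the positive above the negative by the margin."""
    margin = adaptive_margin(positive, negative, base_margin, scale)
    return max(0.0, margin - cosine(query, positive) + cosine(query, negative))
```

With a fixed margin, every negative is pushed away by the same amount; a per-triple margin lets dynamically mined negatives contribute differently depending on how close they are to the target code.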

Abstract

The escalating complexity of modern codebases has intensified the need for retrieval systems capable of interpreting cross-component change intents, a capability fundamentally absent in conventional function-level search paradigms. While recent studies have improved the alignment between natural language queries and code snippets, retrieving contextually relevant code for specific change requests remains largely underexplored. To address this gap, we introduce RepoAlign-Bench, the first benchmark specifically designed to evaluate repository-level code retrieval in change-request-driven scenarios, encompassing 52k annotated instances. This benchmark shifts the retrieval paradigm from function-centric matching to holistic repository-level reasoning. Furthermore, we propose ReflectCode, an adversarial, reflection-augmented dual-tower architecture featuring disentangled code_encoder and doc_encoder components. ReflectCode dynamically integrates syntactic patterns, function dependencies, and semantic expansion intents through reflection guided by a large language model. Comprehensive experiments demonstrate that ReflectCode achieves a 12.2% improvement in Top-5 Accuracy and a 7.1% improvement in Recall over state-of-the-art baselines, establishing a new direction for context-aware code retrieval.
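The dual-tower setup and the Top-k Accuracy metric mentioned above can be illustrated with a minimal sketch. The two encoders here are toy bag-of-token functions standing in for the learned code_encoder and doc_encoder; the tokenization rules, the `CORPUS` contents, and the helper names `retrieve_top_k` and `top_k_accuracy` are assumptions made for this example, not the paper's implementation.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "to", "in", "of", "for", "and", "is"}


def doc_encoder(text):
    """Query tower (toy stand-in): bag of lowercase words minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def code_encoder(source):
    """Code tower (toy stand-in): bag of identifier sub-tokens.

    Splits snake_case and camelCase so `itrs_to_tete` yields itrs, to, tete.
    """
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", source)
    return Counter(p.lower() for p in parts)


def cosine(u, v):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def retrieve_top_k(query, corpus, k):
    """Rank every function in `corpus` (name -> source) against the query."""
    q = doc_encoder(query)
    scores = {name: cosine(q, code_encoder(src)) for name, src in corpus.items()}
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


def top_k_accuracy(examples, corpus, k):
    """Fraction of (query, gold_name) pairs whose gold function is in the top k."""
    hits = sum(gold in retrieve_top_k(query, corpus, k) for query, gold in examples)
    return hits / len(examples)


# Hypothetical three-function repository used only for illustration.
CORPUS = {
    "itrs_to_tete": "def itrs_to_tete(itrs_coo, tete_frame): pass",
    "parse_header": "def parse_header(raw_bytes): pass",
    "fit_model": "def fit_model(model, data): pass",
}
```

Keeping the two towers separate means code embeddings can be computed once per repository and reused, while only the change request has to be encoded at query time.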

Paper Structure

This paper contains 41 sections, 9 equations, 2 figures, 7 tables, and 2 algorithms.

Figures (2)

  • Figure 1: Visualization of a code patch in Astropy's modeling module addressing an issue with ITRS. The image highlights the updated implementation of the tete_to_its_mat and itrs_to_tete functions, alongside a description of the issue, the corresponding patch, and parsed information such as functions and classes.
  • Figure 2: The model consists of a generator and discriminator. The generator includes separate encoders for code and documentation, followed by dense blocks, softmax, and a retrieval mechanism for top-k matching. The discriminator incorporates self-reflection and a pairwise evaluator to refine the model's output based on reflective text.
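The retrieve-then-verify flow described for Figure 2 can be sketched as a pairwise re-ranking step over the retriever's top-k candidates. This is a simplified illustration under stated assumptions: the generation of reflective text is omitted, and `keyword_prefer` is a hypothetical keyword heuristic standing in for the LLM-based pairwise evaluator, whose actual prompts and scoring are not given in this summary.

```python
from itertools import combinations


def pairwise_rerank(query, candidates, prefer):
    """Reorder retrieved candidates by their number of pairwise wins.

    candidates: list of (name, source) pairs in the retriever's order.
    prefer(query, src_a, src_b): returns True if src_a should outrank src_b.
    Ties in win count fall back to the retriever's original order.
    """
    wins = {name: 0 for name, _ in candidates}
    for (name_a, src_a), (name_b, src_b) in combinations(candidates, 2):
        if prefer(query, src_a, src_b):
            wins[name_a] += 1
        else:
            wins[name_b] += 1
    original_rank = {name: i for i, (name, _) in enumerate(candidates)}
    return sorted(wins, key=lambda name: (-wins[name], original_rank[name]))


def keyword_prefer(query, src_a, src_b):
    """Hypothetical stand-in for the pairwise evaluator (NOT the paper's LLM judge).

    Prefers the candidate whose source text mentions more of the query terms.
    """
    terms = set(query.lower().split())

    def hits(src):
        return sum(term in src.lower() for term in terms)

    return hits(src_a) >= hits(src_b)


# Hypothetical top-3 list in the order a first-stage retriever returned it.
candidates = [
    ("parse_header", "def parse_header(raw_frame): pass"),
    ("itrs_to_tete", "def itrs_to_tete(itrs_coo, tete_frame): pass"),
    ("fit_model", "def fit_model(model, data): pass"),
]
reranked = pairwise_rerank("itrs frame", candidates, keyword_prefer)
```

In this toy run the verifier promotes `itrs_to_tete` above `parse_header`, which mirrors the role the figure assigns to the discriminator: correcting the generator's ranking rather than retrieving from scratch. Because each pair of candidates needs one evaluator call, the cost grows quadratically in k, which is consistent with the latency concerns noted in the TL;DR.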