Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent

Geert Heyman, Tom Van Cutsem

TL;DR

Annotated code search pairs code snippets with natural language descriptions to better capture the intent of the code. The authors build a domain-specific retrieval framework with separate description-query and code-query embedding models and an ensemble that combines them. On three PACS benchmarks (CoNaLa, StaQC-py, SO-DS), the framework achieves substantial gains over code-only baselines, with absolute improvements of up to 20.6% in mean reciprocal rank (MRR) and notable recall gains. The work highlights the value of descriptions for code search, shows that the Universal Sentence Encoder can be fine-tuned effectively for software-domain similarity, and finds that combining description and code signals yields the strongest performance, while noting challenges such as code evolution and dataset quality.

Abstract

In this work, we propose and study annotated code search: the retrieval of code snippets paired with brief descriptions of their intent using natural language queries. On three benchmark datasets, we investigate how code retrieval systems can be improved by leveraging descriptions to better capture the intents of code snippets. Building on recent progress in transfer learning and natural language processing, we create a domain-specific retrieval model for code annotated with a natural language description. We find that our model yields significantly more relevant search results (with absolute gains up to 20.6% in mean reciprocal rank) compared to state-of-the-art code retrieval methods that do not use descriptions but attempt to compute the intent of snippets solely from unannotated code.
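
Because the headline result is reported in mean reciprocal rank, a minimal sketch of how that metric is typically computed may help. The function name, the snippet identifiers, and the toy rankings below are illustrative assumptions, not taken from the paper; the sketch assumes each query contributes the reciprocal rank of its first relevant snippet, or 0 if none is retrieved.

```python
def mean_reciprocal_rank(rankings, relevant_sets):
    """Average, over queries, of 1/rank of the first relevant result.

    rankings: one ranked list of snippet ids per query (best first).
    relevant_sets: one set of relevant snippet ids per query.
    A query whose ranking contains no relevant snippet contributes 0.
    """
    total = 0.0
    for ranking, relevant in zip(rankings, relevant_sets):
        for rank, snippet_id in enumerate(ranking, start=1):
            if snippet_id in relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data (hypothetical): three queries with their ranked results.
rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
relevant_sets = [{"a"}, {"z"}, {"missing"}]

# Reciprocal ranks are 1, 1/3 and 0, so the mean is 4/9.
mrr = mean_reciprocal_rank(rankings, relevant_sets)
```

An absolute gain in MRR is the difference between two such values, not their ratio.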

Paper Structure

This paper contains 22 sections, 5 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Illustration of how the CoNaLa corpus (yin2018mining) is converted into a benchmark for annotated code search.
  • Figure 2: Illustration of the additional cleaning for the StaQC corpus (yao2018staqc): we strip prompts, filter out code snippets that do not parse, and automatically rewrite questions as descriptions with simple regular expressions.
  • Figure 3: To create the StaQC-py and SO-DS ground truth we make use of Stack Overflow posts that were tagged as duplicates by users.
  • Figure 4: Histograms of the relative word overlap between the queries and the matching snippet descriptions in the PACS test sets.
  • Figure 5: Architectures of query/snippet similarity models presented in Section \ref{sec:retrieval_model}: A) depicts the neural bag-of-words model of queries and snippet descriptions; B) depicts the query and snippet description embedding using USE; C) depicts the NCS architecture for embedding queries and actual code; and D) depicts an ensemble of embedding retrieval models.
  • ...and 1 more figures
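
The ensemble idea in the Figure 5 caption (panel D) can be sketched as follows. This summary does not state how the description and code signals are combined, so the weighted sum of cosine similarities, the `weight` parameter, the function names, and the toy two-dimensional vectors are all assumptions made for illustration. In a real system the vectors would come from the description and code embedding models.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ensemble_rank(query_desc_vec, query_code_vec, snippets, weight=0.5):
    """Rank snippets by a weighted sum of two similarity scores.

    Assumption: the ensemble is a convex combination of the
    query/description similarity and the query/code similarity.
    weight=1.0 uses descriptions only; weight=0.0 uses code only.
    """
    scored = []
    for snippet in snippets:
        desc_sim = cosine(query_desc_vec, snippet["desc_vec"])
        code_sim = cosine(query_code_vec, snippet["code_vec"])
        score = weight * desc_sim + (1.0 - weight) * code_sim
        scored.append((score, snippet["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet_id for _, snippet_id in scored]


# Toy snippets with hypothetical precomputed embeddings.
snippets = [
    {"id": "s1", "desc_vec": [1.0, 0.0], "code_vec": [1.0, 0.0]},
    {"id": "s2", "desc_vec": [1.0, 1.0], "code_vec": [0.0, 1.0]},
    {"id": "s3", "desc_vec": [0.0, 1.0], "code_vec": [1.0, 0.0]},
]

# The same query is embedded once per model (description space, code space).
combined = ensemble_rank([1.0, 0.0], [0.0, 1.0], snippets, weight=0.5)
description_only = ensemble_rank([1.0, 0.0], [0.0, 1.0], snippets, weight=1.0)
```

In this toy example the description-only ranking puts `s1` first, while the combined ranking promotes `s2` because its code embedding also matches the query.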