A RAG Method for Source Code Inquiry Tailored to Long-Context LLMs

Toshihiro Kamiya

TL;DR

The paper addresses inquiries about source code that is too large to fit comfortably within an LLM's context window. It proposes a Retrieval-Augmented Generation (RAG) approach that records an execution trace of the software, builds a call tree from it, and extracts the source code of the functions involved. The call tree and function sources are added to the prompt as documents, so the LLM can answer questions about a software product without loading its entire codebase. Small-scale experiments on the rich-cli OSS project with long-context LLMs show a consistent trend: including the call tree and the ordered source code improves answer quality, although extremely large prompts can strain context-length limits. The work shows a practical path for applying LLMs to complex software tasks and highlights design choices in prompt construction; future directions include automated prompt generation and broader task coverage.
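The pipeline summarized above (execution trace, call tree, function source extraction, prompt assembly) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the use of `sys.setprofile`, the helper names (`trace_calls`, `render_call_tree`, `collect_sources`, `build_prompt`), the prompt section labels, the call-order listing of sources, and the toy functions `main`, `load`, and `render` are all assumptions made for illustration.

```python
import inspect
import sys


def trace_calls(entry, *args, **kwargs):
    """Run entry(*args, **kwargs); record (depth, code object) per Python-level call."""
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":        # a Python function was entered
            calls.append((depth, frame.f_code))
            depth += 1
        elif event == "return":    # a Python function exited
            depth -= 1
        # "c_call" / "c_return" events (built-ins) are ignored

    sys.setprofile(profiler)
    try:
        entry(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return calls


def render_call_tree(calls):
    """Render the recorded calls as an indented text tree."""
    return "\n".join("  " * depth + code.co_name for depth, code in calls)


def collect_sources(calls):
    """Return each called function's source once, in order of first call."""
    seen, sources = set(), []
    for _, code in calls:
        if code in seen:
            continue
        seen.add(code)
        try:
            sources.append(inspect.getsource(code))
        except (OSError, TypeError):
            # Source text is not always available (e.g. interactively defined code).
            sources.append(f"# source unavailable for {code.co_name}")
    return sources


def build_prompt(question, calls):
    """Assemble a prompt; the section layout here is hypothetical."""
    parts = ["Question:", question, "Call tree:", render_call_tree(calls),
             "Source code (in call order):"]
    parts.extend(collect_sources(calls))
    return "\n\n".join(parts)


# Toy stand-ins for the product under inquiry (hypothetical).
def load(path):
    return path.upper()


def render(text):
    return f"[{text}]"


def main(path):
    return render(load(path))
```

Calling `build_prompt("How is the input rendered?", trace_calls(main, "a.md"))` yields a prompt containing only the functions that ran for that input, which is what keeps the prompt smaller than the full codebase.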

Abstract

Although the context length limitation of large language models (LLMs) has been mitigated, it still hinders their application to software development tasks. This study proposes a method incorporating execution traces into RAG for inquiries about source code. Small-scale experiments confirm a tendency for the method to contribute to improving LLM response quality.

Paper Structure

This paper contains 16 sections, 2 figures, 9 tables.

Figures (2)

  • Figure 1: Example prompt
  • Figure 2: Trend of evaluation scores