Challenge on Optimization of Context Collection for Code Completion

Dmitry Ustalov, Egor Bogomolov, Alexander Bezzubov, Yaroslav Golubev, Evgeniy Glukhov, Georgii Levtsov, Vladimir Kovalenko

TL;DR

The paper examines how context drawn from an entire codebase affects fill-in-the-middle code completion and introduces a public competition with Python and Kotlin tracks, evaluated across three LLMs using the chrF metric. It shows that effective context combines symbol-definition extraction with retrieval-augmented context, built with techniques such as BM25, FAISS, Tree-sitter, and trigram indexing. The results identify the top-performing strategies, and the publicly released dataset is intended to guide future work on scalable, context-aware code completion in IDEs.
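To make the retrieval side of this summary concrete, the following is a minimal sketch of BM25 ranking over repository files, assuming a plain identifier tokenizer and the commonly used defaults `k1 = 1.5` and `b = 0.75`. The names `tokenize` and `bm25_rank` are hypothetical and are not taken from any submission; the participants' actual pipelines are not described in this summary.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase identifier-like tokens (an assumption)."""
    return re.findall(r"[a-z_]\w*", text.lower())


def bm25_rank(query: str, documents: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Rank documents (path -> text) against the query with Okapi BM25."""
    tokenized = {path: tokenize(text) for path, text in documents.items()}
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: in how many files does each term occur?
    doc_freq: Counter = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = []
    for path, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            tf = term_freq[term]
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append((path, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In a context collector, the query would typically be the code around the caret, and the top-ranked files (or chunks of them) would be added to the prompt.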

Abstract

The rapid advancement of workflows and methods for software engineering using AI emphasizes the need for a systematic evaluation and analysis of their ability to leverage information from entire projects, particularly in large code bases. In this challenge on optimization of context collection for code completion, organized by JetBrains in collaboration with Mistral AI as part of the ASE 2025 conference, participants developed efficient mechanisms for collecting context from source code repositories to improve fill-in-the-middle code completions for Python and Kotlin. We constructed a large dataset of real-world code in these two programming languages using permissively licensed open-source projects. The submissions were evaluated based on their ability to maximize completion quality for multiple state-of-the-art neural models using the chrF metric. During the public phase of the competition, nineteen teams submitted solutions to the Python track and eight teams submitted solutions to the Kotlin track. In the private phase, six teams competed, of which five submitted papers to the workshop.
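As an illustration of the evaluation metric, here is a simplified sketch of chrF, a character n-gram F-score. It assumes n-gram orders 1 to 6, $\beta = 2$, and whitespace removed before counting; the competition's exact implementation and settings are not stated here, so treat the function name `chrf` and its defaults as assumptions.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (a simplifying assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str,
         max_n: int = 6, beta: float = 2.0) -> float:
    """Return a chrF-style score in [0, 1] for one completion."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this n-gram order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
```

Because the score is computed on characters rather than tokens, a completion that differs from the reference by a single character still receives substantial partial credit.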

Paper Structure

This paper contains 15 sections, 1 equation, 3 figures, 5 tables.

Figures (3)

  • Figure 1: When a completion is requested, the IDE gathers the context at the caret position and generates the corresponding prompt for the neural completion model. The model's output is subsequently post-processed and presented as the suggested completion. In our competition, we aim to identify the most effective method for context collection (green block) while assuming that all other components remain unchanged.
  • Figure 2: The simplest context collection: no context clues, only some of the lines before and after the CARET (editing cursor) position in the given file.
  • Figure 3: The same completion point as in Figure 2, but with context clues obtained from resolving the symbols mentioned in the code: the Action type and the current class, Pipeline.
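The two context strategies described in Figures 2 and 3 can be sketched as follows. This is a minimal illustration, not the competition baseline: `split_at_caret`, `collect_definitions`, and `context_clues` are hypothetical names, symbol resolution is approximated by matching identifiers against top-level definitions, and the repository files are assumed to be syntactically valid Python.

```python
import ast
import re


def split_at_caret(source: str, caret: int,
                   max_lines: int = 5) -> tuple[str, str]:
    """Figure 2 style: keep a few lines before and after the caret offset."""
    before = source[:caret].splitlines(keepends=True)
    after = source[caret:].splitlines(keepends=True)
    return "".join(before[-max_lines:]), "".join(after[:max_lines])


def collect_definitions(files: dict[str, str]) -> dict[str, str]:
    """Map each top-level class or function name to its source text."""
    definitions: dict[str, str] = {}
    for source in files.values():
        tree = ast.parse(source)  # raises SyntaxError on invalid files
        for node in tree.body:
            if isinstance(node, (ast.ClassDef, ast.FunctionDef,
                                 ast.AsyncFunctionDef)):
                segment = ast.get_source_segment(source, node)
                if segment is not None:
                    definitions[node.name] = segment
    return definitions


def context_clues(local_code: str, definitions: dict[str, str]) -> list[str]:
    """Figure 3 style: return definitions of symbols mentioned near the caret."""
    names = set(re.findall(r"[A-Za-z_]\w*", local_code))
    return [definitions[name] for name in sorted(names) if name in definitions]
```

A collector built this way would pass the prefix and suffix from `split_at_caret` together with the snippets from `context_clues` to the prompt builder. A real implementation would likely resolve symbols through imports and scopes rather than by name alone.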