A Transformer-Based Approach for Smart Invocation of Automatic Code Completion

Aral de Moor, Arie van Deursen, Maliheh Izadi

TL;DR

The paper tackles the problem of intrusive and costly transformer-based code completion by proposing JonBERTa, a lightweight, multimodal filter that decides when to invoke a code completion model using code context and in-IDE telemetry. Through offline and online evaluations on a Code4Me-derived dataset (~200k interactions, with ~10k training samples and ~20k test samples), the authors demonstrate that code context substantially improves invocation filtering over telemetry-only baselines, and that incorporating telemetry into a transformer can yield further gains. An online deployment with 34 developers and 74k invocations confirms practical viability, showing low latency (sub-10 ms) and favorable harmonic-mean performance, with certain JonBERTa variants achieving the best results. The work highlights the value of multimodal, latency-aware invocation control for transformer-based coding assistants and outlines future directions including larger datasets, personalization, and long-term impact studies.

Abstract

Transformer-based language models are highly effective for code completion, with much research dedicated to enhancing the content of these completions. Despite their effectiveness, these models come with high operational costs and can be intrusive, especially when they suggest too often and interrupt developers who are concentrating on their work. Current research largely overlooks how these models interact with developers in practice and neglects to address when a developer should receive completion suggestions. To tackle this issue, we developed a machine learning model that can accurately predict when to invoke a code completion tool given the code context and available telemetry data. To do so, we collect a dataset of 200k developer interactions with our cross-IDE code completion plugin and train several invocation filtering models. Our results indicate that our small-scale transformer model significantly outperforms the baseline while maintaining low enough latency. We further explore the search space for integrating additional telemetry data into a pre-trained transformer directly and obtain promising results. To further demonstrate our approach's practical potential, we deployed the model in an online environment with 34 developers and provided real-world insights based on 74k actual invocations.

Paper Structure

This paper contains 37 sections, 6 figures, and 13 tables.

Figures (6)

  • Figure 1: Reasons for Rejected Completions.
  • Figure 2: Telemetry Feature Data in Classification Head.
  • Figure 3: Self-Attention Extended to Telemetry Feature Data.
  • Figure 4: Copilot's Language Map. Higher-scoring languages are more likely to get a completion.
  • Figure 5: Copilot's Prefix Character Map. Higher-scoring characters (directly before the cursor) are more likely to get a completion.
  • ...and 1 more figure
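
The captions for Figures 4 and 5 describe score maps in which higher-scoring languages and prefix characters are more likely to get a completion. As a rough illustration of that idea only, the sketch below sums a per-language score and a per-character score and passes the total through a logistic function. Every name and number here (`LANGUAGE_SCORES`, `PREFIX_CHAR_SCORES`, `BIAS`, `THRESHOLD`) is a hypothetical placeholder chosen for the example; none of them are values from the paper or from Copilot, and this is not the JonBERTa model.

```python
import math

# Hypothetical weights, invented for illustration only.
# They are NOT taken from the paper or from Copilot.
LANGUAGE_SCORES = {"python": 0.6, "javascript": 0.4, "plaintext": -1.0}
PREFIX_CHAR_SCORES = {"(": 0.8, ".": 0.7, " ": 0.1, ";": -0.9}
BIAS = -0.2        # assumed intercept
THRESHOLD = 0.5    # assumed decision threshold


def invocation_probability(language: str, prefix: str) -> float:
    """Map a language and the character before the cursor to a probability."""
    last_char = prefix[-1] if prefix else ""
    # Unknown languages or characters contribute a neutral score of 0.
    score = (
        BIAS
        + LANGUAGE_SCORES.get(language, 0.0)
        + PREFIX_CHAR_SCORES.get(last_char, 0.0)
    )
    # Logistic function squashes the summed score into (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


def should_invoke(language: str, prefix: str, threshold: float = THRESHOLD) -> bool:
    """Request a completion only when the probability clears the threshold."""
    return invocation_probability(language, prefix) >= threshold


if __name__ == "__main__":
    print(should_invoke("python", "foo("))        # high-scoring language and character
    print(should_invoke("plaintext", "done;"))    # low-scoring language and character
```

A filter of this shape sees only hand-picked telemetry features. The abstract's claim is that a small transformer that also reads the code context outperforms such a baseline.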