Pre-Filtering Code Suggestions using Developer Behavioral Telemetry to Optimize LLM-Assisted Programming
Mohammad Nour Al Awad, Sergey Ivanov, Olga Tikhonova
TL;DR
This work tackles wasted computation and interruptions in LLM-assisted programming by introducing a lightweight pre-filter that gates LLM invocation using real-time developer telemetry. The behavioral-only gate operates before any prompt or code content is analyzed, preserving privacy while remaining language-agnostic. Trained on 2,318 suggestion events and deployed in a production VS Code plugin, the approach increased suggestion acceptance from $18.4\%$ to $34.2\%$ and suppressed about $35\%$ of low-value LLM calls, without modifying the LLM itself. These results demonstrate that timing-aware, privacy-preserving adaptation based on behavioral signals can substantially improve both user experience and system efficiency in AI-assisted programming, and point to future work on personalization and richer developer-state modeling.
Abstract
Large Language Models (LLMs) are increasingly integrated into code editors to provide AI-powered code suggestions. Yet many of these suggestions are ignored, resulting in wasted computation, increased latency, and unnecessary interruptions. We introduce a lightweight pre-filtering model that predicts the likelihood of suggestion acceptance before invoking the LLM, using only real-time developer telemetry such as typing speed, file navigation, and editing activity. The filter operates solely on pre-invocation editor telemetry and never inspects code or prompts. Deployed in a production-grade Visual Studio Code plugin over four months of naturalistic use, our approach nearly doubled acceptance rates (from 18.4% to 34.2%) while suppressing 35% of low-value LLM calls. These findings demonstrate that behavioral signals alone can meaningfully improve both user experience and system efficiency in LLM-assisted programming, highlighting the value of timing-aware, privacy-preserving adaptation mechanisms.
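The gating idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the weights, the bias, and the 0.5 threshold are all hypothetical, and a hand-set logistic score stands in for whatever classifier was actually trained. The sketch shows only the control flow: score the telemetry first, and call the LLM only if the predicted acceptance probability clears a threshold.

```python
import math
from dataclasses import dataclass


@dataclass
class Telemetry:
    """Hypothetical pre-invocation signals; no code or prompt content."""
    typing_speed_cps: float       # characters per second over a recent window
    file_switches_per_min: float  # file navigation activity
    edits_last_30s: int           # recent editing activity


# Illustrative weights only (assumption); not values learned in the paper.
WEIGHTS = {
    "typing_speed_cps": -0.4,
    "file_switches_per_min": -0.3,
    "edits_last_30s": 0.05,
}
BIAS = 0.5
THRESHOLD = 0.5  # assumed decision cutoff


def acceptance_probability(t: Telemetry) -> float:
    """Logistic score estimating how likely a suggestion is to be accepted."""
    z = BIAS + sum(w * getattr(t, name) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def should_invoke_llm(t: Telemetry, threshold: float = THRESHOLD) -> bool:
    """Gate: request a suggestion only when acceptance looks likely enough."""
    return acceptance_probability(t) >= threshold


if __name__ == "__main__":
    paused = Telemetry(typing_speed_cps=0.5, file_switches_per_min=0.0, edits_last_30s=4)
    busy = Telemetry(typing_speed_cps=6.0, file_switches_per_min=3.0, edits_last_30s=10)
    for label, t in (("paused", paused), ("busy", busy)):
        print(label, round(acceptance_probability(t), 3), should_invoke_llm(t))
```

With these made-up weights, a developer who has paused typing passes the gate, while one who is typing quickly and switching files does not. Because the gate reads only behavioral counters, it runs before any prompt is built, which is what keeps it language-agnostic and content-free.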
