CodeWatcher: IDE Telemetry Data Extraction Tool for Understanding Coding Interactions with LLMs
Manaal Basha, Aimeê M. Ribeiro, Jeena Javahar, Cleidson R. B. de Souza, Gema Rodríguez-Pérez
TL;DR
This work introduces CodeWatcher, a lightweight client-server system that unobtrusively captures fine-grained IDE interactions between developers and code generation tools (CGTs) within VS Code. By logging a semantically rich event stream (Start, End, Insertion, Deletion, Focus, Unfocus, Copy, Paste) and storing it in a MongoDB backend via a Python FastAPI server, the approach enables post-hoc session reconstruction and real-time feedback for analyzing CGT usage. The paper details the event taxonomy, the RESTful architecture, and validation across diverse use cases, including an empirical method to classify final code as AI-generated, AI-modified, or user-written, whose precision and recall are promising but leave room for improvement. Overall, CodeWatcher provides a scalable, adaptable foundation for studying the impact of CGTs on productivity, learning, and responsible AI in software development, with potential uses in education, research, and industry.
Abstract
Understanding how developers interact with code generation tools (CGTs) requires detailed, real-time data on programming behavior, which is often difficult to collect without disrupting workflow. We present CodeWatcher, a lightweight, unobtrusive client-server system designed to capture fine-grained interaction events from within the Visual Studio Code (VS Code) editor. CodeWatcher logs semantically meaningful events such as insertions made by CGTs, deletions, copy-paste actions, and focus shifts, enabling continuous monitoring of developer activity without modifying user workflows. The system comprises a VS Code plugin, a Python-based RESTful API, and a MongoDB backend, all containerized for scalability and ease of deployment. By structuring and timestamping each event, CodeWatcher enables post-hoc reconstruction of coding sessions and facilitates rich behavioral analyses, including how and when CGTs are used during development. This infrastructure is crucial for supporting research on responsible AI, developer productivity, and the human-centered evaluation of CGTs. Please find the demo, diagrams, and tool here: https://osf.io/j2kru/overview.
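
To make the idea of post-hoc session reconstruction from a timestamped event stream concrete, the following is a minimal sketch in Python, not the tool's actual schema or code. The eight event names come from the event taxonomy listed in the TL;DR; everything else is an assumption made for illustration: the `Event` fields (`timestamp`, `offset`, `text`, `length`), the `reconstruct` function, and the choice to treat a Paste event as carrying its own inserted text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class EventType(str, Enum):
    # Event names taken from the taxonomy listed in the TL;DR.
    START = "Start"
    END = "End"
    INSERTION = "Insertion"
    DELETION = "Deletion"
    FOCUS = "Focus"
    UNFOCUS = "Unfocus"
    COPY = "Copy"
    PASTE = "Paste"


@dataclass(frozen=True)
class Event:
    # All fields below are hypothetical; the real payload may differ.
    timestamp: float      # assumed: seconds since the session started
    type: EventType
    offset: int = 0       # assumed: character offset where the edit applies
    text: str = ""        # assumed: text added by an Insertion or Paste
    length: int = 0       # assumed: number of characters removed by a Deletion


def reconstruct(events: Iterable[Event]) -> str:
    """Replay events in timestamp order and return the final buffer text."""
    buffer = ""
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.type in (EventType.INSERTION, EventType.PASTE):
            # Splice the new text into the buffer at the given offset.
            buffer = buffer[:ev.offset] + ev.text + buffer[ev.offset:]
        elif ev.type is EventType.DELETION:
            # Drop `length` characters starting at the given offset.
            buffer = buffer[:ev.offset] + buffer[ev.offset + ev.length:]
        # Start, End, Focus, Unfocus, and Copy do not change the buffer.
    return buffer


# A made-up session: insert a stub, delete "pass", paste a return statement.
session = [
    Event(0.0, EventType.START),
    Event(1.0, EventType.INSERTION, offset=0, text="def f():\n    pass"),
    Event(2.0, EventType.DELETION, offset=13, length=4),
    Event(3.0, EventType.PASTE, offset=13, text="return 1"),
    Event(4.0, EventType.END),
]
final_text = reconstruct(session)
```

Because the replay sorts by timestamp, events can arrive at the server out of order and still yield the same final buffer, which is one reason timestamping each event matters for this kind of analysis.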
