Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing

Jiayi Wei, Greg Durrett, Isil Dillig

TL;DR

This work tackles multi-round code editing by modeling $P(\Delta u \mid \Delta_k \ldots \Delta_1, U)$, that is, the edit $\Delta u$ to a target code region conditioned on the recent changes $\Delta_1, \ldots, \Delta_k$ in the codebase $U$. It introduces Coeditor, a CodeT5-based model that combines line-diff encoding, static-analysis-driven context, and block-sparse attention to handle long edit histories. Trained on PyCommits, a dataset built from 1,650 open-source Python projects, Coeditor outperforms GPT-3.5 and open-source code infilling models on single-round tasks (exact-match accuracy rises from 34.7 to 60.4) and delivers substantial gains in multi-round editing (a 46.7% lines gain and 28.6% of keystrokes saved). The authors release code, data, model weights, and a VSCode extension to support reproducible research and interactive editing workflows.

Abstract

Developers often dedicate significant time to maintaining and refactoring existing code. However, most prior work on generative models for code focuses solely on creating new code, overlooking the distinctive needs of editing existing code. In this work, we explore a multi-round code auto-editing setting, aiming to predict edits to a code region based on recent changes within the same codebase. Our model, Coeditor, is a fine-tuned language model specifically designed for code editing tasks. We represent code changes using a line diff format and employ static analysis to form large customized model contexts, ensuring the availability of appropriate information for prediction. We collect a code editing dataset from the commit histories of 1650 open-source Python projects for training and evaluation. In a simplified single-round, single-edit task, Coeditor significantly outperforms GPT-3.5 and SOTA open-source code completion models (bringing exact-match accuracy from 34.7 up to 60.4), demonstrating the benefits of incorporating editing history for code completion. In a multi-round, multi-edit setting, we observe substantial gains by iteratively conditioning on additional user edits. We have open-sourced our code, data, and model weights to encourage future research and have released a VSCode extension powered by our model for interactive IDE usage.
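To make the "line diff format" mentioned in the abstract more concrete, here is a minimal sketch of a line-level diff built with Python's standard `difflib`. The function name `line_diff` and the `+`/`-` prefixes are illustrative assumptions; Coeditor's actual encoding uses its own special tokens and is not reproduced here.

```python
import difflib


def line_diff(before: str, after: str) -> list[str]:
    """Return a line-level diff of two code snippets.

    Unchanged lines are prefixed with two spaces, deleted lines with "- ",
    and added lines with "+ ". This is an illustrative format only, not the
    token-level encoding used by Coeditor.
    """
    a = before.splitlines()
    b = after.splitlines()
    out: list[str] = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend("  " + line for line in a[i1:i2])
        else:
            # "replace", "delete", and "insert" all reduce to
            # "emit the removed lines, then the added lines".
            out.extend("- " + line for line in a[i1:i2])
            out.extend("+ " + line for line in b[j1:j2])
    return out


# Hypothetical snippets, loosely modeled on the "cost" key example in Figure 2.
BEFORE = 'def pack(row):\n    x = row["x"]\n    return {"x": x}\n'
AFTER = (
    'def pack(row):\n    x = row["x"]\n    cost = row["cost"]\n'
    '    return {"x": x, "cost": cost}\n'
)

if __name__ == "__main__":
    print("\n".join(line_diff(BEFORE, AFTER)))
```

A line-level representation like this keeps deletions visible to the model, which plain infilling on the surrounding code cannot do.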

Paper Structure

This paper contains 18 sections, 3 equations, 15 figures, 5 tables.

Figures (15)

  • Figure 1: The multi-round auto-editing task. The user inspects the model output in each editing round and can optionally perform manual editing.
  • Figure 2: An example usage of Coeditor. (a) The user first edits the pack_batch function to read an additional dictionary key, "cost", from each row in the input. (b) The user then removes 3 lines at the top of the group_to_batches function. (c) The user now invokes Coeditor at the bottom half of the same function. Coeditor correctly suggests adding a "cost" key to the dictionary variable row, but it fails to address the now undefined variables underlined in red. (d) However, if the user accepts the suggested change and manually introduces two new variables at line 209, Coeditor can then suggest the correct changes accordingly.
  • Figure 3: Coeditor encoding format. (Left) The input sequence adds placeholder tokens to mark the code region to edit. (Top right) The output sequence specifies further changes at each placeholder token. (Bottom right) Relevant signatures are retrieved from the codebase and added to the context. (In this example, the Python module is called motivating.)
  • Figure 4: Coeditor encoder sparse attention pattern. All attention between different reference blocks is skipped to avoid the quadratic cost of dense attention.
  • Figure 5: Code completion example 1. Coeditor sees from the relevant contextual changes (shown in a separate figure) that some get_asynclib() calls should be replaced with get_async_backend(), so it correctly suggests the change based on the deletion before the infilling point. InCoder cannot see the deletion and infills the original code given only the surrounding code.
  • ...and 10 more figures
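
The sparse attention pattern described in the Figure 4 caption can be illustrated with a small boolean mask. The sketch below rests on one assumption taken from the caption alone: only attention between different reference blocks is skipped, while the query block may attend to, and be attended by, every position. The names `block_sparse_mask` and `allowed_pairs` are hypothetical, and the code shows which pairs are kept rather than how to compute them efficiently.

```python
def block_sparse_mask(query_len: int, ref_lens: list[int]) -> list[list[bool]]:
    """Build an attention mask where mask[i][j] is True if i may attend to j.

    Positions are laid out as: the query block first, then each reference
    block in order. Attention between two different reference blocks is
    disallowed; everything else is allowed (an assumption based on the
    Figure 4 caption, not a confirmed implementation detail).
    """
    # Block id per position: 0 for the query, 1..n for the reference blocks.
    block_ids = [0] * query_len
    for idx, length in enumerate(ref_lens, start=1):
        block_ids.extend([idx] * length)

    total = len(block_ids)
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            same_block = block_ids[i] == block_ids[j]
            involves_query = block_ids[i] == 0 or block_ids[j] == 0
            mask[i][j] = same_block or involves_query
    return mask


def allowed_pairs(mask: list[list[bool]]) -> int:
    """Count the (i, j) pairs that the mask keeps."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    mask = block_sparse_mask(query_len=2, ref_lens=[3, 3])
    dense = len(mask) ** 2
    print(f"kept {allowed_pairs(mask)} of {dense} attention pairs")
```

With many reference blocks of bounded size, the number of kept pairs grows roughly linearly in the number of blocks instead of quadratically, which is the saving the caption refers to. A real implementation would avoid computing the skipped blocks at all rather than masking them after the fact.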