MELT: Mining Effective Lightweight Transformations from Pull Requests

Daniel Ramos, Hailie Mitchell, Inês Lynce, Vasco Manquinho, Ruben Martins, Claire Le Goues

TL;DR

MELT tackles API migration, the problem of updating client code when a library's API changes, without relying on client data: it mines merged pull requests from library repositories. It combines code diffs with large-language-model-generated transition examples and expresses transformations as concise, interpretable Comby rules, followed by a generalization step that broadens their applicability. The approach integrates into library workflows via CI. In the evaluation, MELT mined 461 rules from code examples and 114 from generated examples, generalization increased the number of matches by 9x, and applying the rules to client projects reduced test warnings. Because the rules come from the library side, they can be produced and applied before any client has migrated, which makes updates safer and less laborious.

Abstract

Software developers often struggle to update APIs, leading to manual, time-consuming, and error-prone processes. We introduce MELT, a new approach that generates lightweight API migration rules directly from pull requests in popular library repositories. Our key insight is that pull requests merged into open-source libraries are a rich source of information sufficient to mine API migration rules. By leveraging code examples mined from the library source and automatically generated code examples based on the pull requests, we infer transformation rules in Comby, a language for structural code search and replace. Since rules inferred from single code examples may be too specific, we propose a generalization procedure to make the rules more applicable to client projects. MELT rules are syntax-driven, interpretable, and easily adaptable. Moreover, unlike previous work, our approach enables rule inference to seamlessly integrate into the library workflow, removing the need to wait for client code migrations. We evaluated MELT on pull requests from four popular libraries, successfully mining 461 migration rules from code examples in pull requests and 114 rules from auto-generated code examples. Our generalization procedure increases the number of matches for mined rules by 9x. We applied these rules to client projects and ran their tests, which led to an overall decrease in the number of warnings and fixed some test cases, demonstrating MELT's effectiveness in real-world scenarios.
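
To make the idea of a rule and its generalization concrete, the following is a minimal sketch, not MELT's implementation and not Comby itself. It emulates two Comby hole forms, `:[name]` and `:[[name]]`, with Python regular expressions, and compares a rule taken verbatim from one example against a version whose arguments are abstracted into a hole. The names `old_lib.load` and `new_lib.read` are hypothetical, and the toy matcher does not handle balanced delimiters the way Comby does.

```python
import re

# Comby-style holes: ":[[name]]" (identifier-like) and ":[name]" (anything).
HOLE = re.compile(r":\[\[(\w+)\]\]|:\[(\w+)\]")


def template_to_regex(template):
    """Translate a Comby-style match template into a regex (toy version).

    Simplification: ":[name]" becomes a lazy ".+?" on a single line, so
    nested or multi-line arguments are not matched structurally.
    """
    parts, pos, seen = [], 0, set()
    for hole in HOLE.finditer(template):
        parts.append(re.escape(template[pos:hole.start()]))
        name = hole.group(1) or hole.group(2)
        if name in seen:
            parts.append(f"(?P={name})")        # repeated hole: same text
        elif hole.group(1):
            parts.append(f"(?P<{name}>\\w+)")   # :[[name]]
        else:
            parts.append(f"(?P<{name}>.+?)")    # :[name]
        seen.add(name)
        pos = hole.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


def apply_rule(match_template, rewrite_template, source):
    """Rewrite every match in source; return (new_source, match_count)."""
    pattern = template_to_regex(match_template)

    def substitute(m):
        # Fill each hole in the rewrite template with the captured text.
        return HOLE.sub(lambda h: m.group(h.group(1) or h.group(2)),
                        rewrite_template)

    return pattern.subn(substitute, source)


# Hypothetical client code using a deprecated call in three places.
client = """
a = old_lib.load(path, verbose=True)
b = old_lib.load(other_path)
c = old_lib.load(path, verbose=False)
"""

# Rule copied verbatim from a single example: too specific.
specific = ("old_lib.load(path, verbose=True)",
            "new_lib.read(path, verbose=True)")
# Same rule with the arguments abstracted into a hole.
general = ("old_lib.load(:[args])", "new_lib.read(:[args])")

_, n_specific = apply_rule(*specific, client)
migrated, n_general = apply_rule(*general, client)
print(n_specific, n_general)
print(migrated)
```

The specific rule only rewrites the call site it was copied from, while the generalized rule also covers the other two. This is the kind of effect the abstract's match-count comparison measures, although the actual generalization procedure is defined in the paper and is not reproduced here.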

Paper Structure

This paper contains 7 sections, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Code change in pull request #44539 from the pandas-dev/pandas repository.
  • Figure 2: MELT overview. MELT takes as input a pull request (PR) and outputs a set of rules. The PR is processed in two ways: (1) the Code change analyzer identifies relevant code changes; (2) the Code generation model generates additional code examples. Rules are inferred from the code changes and examples using the rule inference algorithm, then filtered and generalized.
  • Figure 3: Pull request #14419 from scipy/scipy. This pull request was part of SciPy 1.8, released in February 2022.
  • Figure 4: Code generated by GPT-4 showing how to transition from the deprecated namespace, along with a test case.
  • Figure 5: Prompt template for the generateExample function in Algorithm 1, featuring four placeholders: (1) library_name, (2) additional_requirements for format consistency and correctness, (3) a concrete_example with summary and examples from pandas, and (4) pr_data, the PR information including title, description, changed files, and corresponding diffs, as JSON.
  • ...and 1 more figure