Quantifying the benefits of code hints for refactoring deprecated Java APIs

Cristina David, Pascal Kesseli, Daniel Kroening, Hanliang Zhang

TL;DR

The paper tackles the challenge of automatically refactoring deprecated Java APIs by introducing two complementary engines: a symbolic, type-directed, component-based synthesis engine (RefSym) and a neural, LLM-based engine (RefNeural), both of which leverage Javadoc code hints. An equivalence-checking framework ensures that generated refactorings preserve behavior, with a counterexample-guided loop driving synthesis. Empirical evaluation on 236 deprecated methods from Oracle JDK 15 shows that code hints substantially boost automation: on tasks with code hints, the best engine solves up to 82% of benchmarks and even the weakest engine achieves 71%, whereas on tasks without hints the weakest engine's success rate drops to at best 14%. The work demonstrates that including code hints in the Javadoc of deprecated APIs can markedly improve automated migration, and it provides practical tooling (RefSym and RefNeural) and insights into the complementary strengths of symbolic pruning and LLM-based generation.

Abstract

When done manually, refactoring legacy code in order to eliminate uses of deprecated APIs is an error-prone and time-consuming process. In this paper, we investigate to which degree refactorings for deprecated Java APIs can be automated, and quantify the benefit of Javadoc code hints for this task. To this end, we build a symbolic and a neural engine for the automatic refactoring of deprecated APIs. The former is based on type-directed and component-based program synthesis, whereas the latter uses LLMs. We applied our engines to refactor the deprecated methods in the Oracle JDK 15. Our experiments show that code hints are enabling for the automation of this task: even the worst engine correctly refactors 71% of the tasks with code hints, which drops to at best 14% on tasks without. Adding more code hints to Javadoc can hence boost the refactoring of code that uses deprecated APIs.
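
To make the notion of a Javadoc code hint concrete, the following is a minimal sketch using `java.util.Date.getHours()`, whose Javadoc deprecation note says it is replaced by `Calendar.get(Calendar.HOUR_OF_DAY)`. This method is chosen here for illustration and is not necessarily one of the paper's examples. The class and helper names (`DeprecatedHoursExample`, `hoursDeprecated`, `hoursRefactored`) are hypothetical, and the check in `main` is only a spot test on a few sample inputs, not the equivalence checking that the paper describes.

```java
import java.util.Calendar;
import java.util.Date;

final class DeprecatedHoursExample {

    // Original client code: calls the deprecated Date.getHours().
    @SuppressWarnings("deprecation")
    public static int hoursDeprecated(Date date) {
        return date.getHours();
    }

    // Candidate refactoring that follows the Javadoc hint
    // "replaced by Calendar.get(Calendar.HOUR_OF_DAY)".
    // The hint names the target method, but the glue code
    // (creating the Calendar and setting its time) still has to be found.
    public static int hoursRefactored(Date date) {
        Calendar calendar = Calendar.getInstance(); // default time zone, like Date.getHours()
        calendar.setTime(date);
        return calendar.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        // Spot check: compare both versions on a few concrete inputs.
        long[] samples = {0L, 1_000_000_000_000L, System.currentTimeMillis()};
        for (long millis : samples) {
            Date d = new Date(millis);
            if (hoursDeprecated(d) != hoursRefactored(d)) {
                throw new AssertionError("mismatch for input " + millis);
            }
        }
        System.out.println("both versions agree on all sampled inputs");
    }
}
```

The hint supplies the replacement method but not the surrounding code needed to call it, which is the gap a refactoring engine has to close. Without any hint, the engine would also have to discover `Calendar` itself among all available library classes.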

Paper Structure

This paper contains 36 sections, 5 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: Deprecated method example.
  • Figure 2: Code hints for the running example.
  • Figure 3: LLM Prompt Template.
  • Figure 4: Javadoc hint example.
  • Figure 5: Seeding algorithm for the CodeHints-library.

Theorems & Definitions (7)

  • Example 1
  • Definition 1: Program equivalence with respect to a concrete input $\vec{i}$ [partial]
  • Example 2
  • Example 3
  • Example 4
  • Definition 2: Addition to Definition 1
  • Definition 3: Realisable method