LLM-Based Repair of Static Nullability Errors

Nima Karimipour, Pascal Joos, Michael Pradel, Martin Kellogg, Manu Sridharan

TL;DR

NullRepair targets the gap between annotation inference and full adoption of a nullability checker in Java by combining static analysis with LLM-generated patches. It introduces a flowchart-based decision process and grounds prompts in safe and unsafe usage regions, so that edits align with the surrounding code's semantics rather than merely silencing the checker. On 12 real-world projects, it resolves 63% of the 1,119 residual errors while largely preserving runtime behavior, and it preserves test outcomes better than two baselines (a single-shot prompt and mini-SWE-agent). The work offers a practical path toward automated migration to type-based nullability checkers, with a lightweight, auditable artifact and explicit treatment of false positives and code semantics.

Abstract

Modern Java projects increasingly adopt static analysis tools that prevent null-pointer exceptions by treating nullness as a type property. However, integrating such tools into large, existing codebases remains a significant challenge. While annotation inference can eliminate many errors automatically, a subset of residual errors, typically a mix of real bugs and false positives, often persists and can only be resolved via code changes. Manually addressing these errors is tedious and error-prone. Large language models (LLMs) offer a promising path toward automating these repairs, but naively prompted LLMs often generate incorrect, contextually inappropriate edits. We present NullRepair, a system that integrates LLMs into a structured workflow for resolving the errors reported by a nullability checker. NullRepair's decision process follows a flowchart derived from manual analysis of 200 real-world errors. It leverages static analysis to identify safe and unsafe usage regions of symbols, using error-free usage examples to contextualize model prompts. Patches are generated through an iterative interaction with the LLM that incorporates project-wide context and decision logic. Our evaluation on 12 real-world Java projects shows that NullRepair resolves 63% of the 1,119 nullability errors that remain after applying a state-of-the-art annotation inference technique. Unlike two baselines (single-shot prompt and mini-SWE-agent), NullRepair also largely preserves program semantics, with all unit tests passing in 10/12 projects after applying every edit proposed by NullRepair, and 98% or more tests passing in the remaining two projects.
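To make the notion of a residual nullability error concrete, the following is a minimal sketch rather than NullRepair's actual output. It assumes a hypothetical `User` class whose `getMapView()` accessor has been inferred as `@Nullable`, a locally declared `@Nullable` annotation standing in for the checker's own, and an illustrative `DEFAULT_ZOOM` fallback. The "before" method contains the kind of dereference a nullability checker would report; the "after" method guards it with a null check, the sort of safe usage pattern that could be found elsewhere in a codebase.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Hypothetical illustration; all names below are assumptions, not from the paper's artifact.
class NullabilityRepairSketch {

    // Local stand-in for a checker's @Nullable annotation (assumption: a real
    // project would import the annotation shipped with its nullability checker).
    @Target({ElementType.METHOD, ElementType.FIELD, ElementType.PARAMETER})
    @interface Nullable {}

    static final class MapView {
        private final int zoom;
        MapView(int zoom) { this.zoom = zoom; }
        int getZoom() { return zoom; }
    }

    static final class User {
        @Nullable private final MapView mapView;
        User(@Nullable MapView mapView) { this.mapView = mapView; }
        // Marked @Nullable, as an inference tool might do, because the field may hold null.
        @Nullable MapView getMapView() { return mapView; }
    }

    // Illustrative fallback value; choosing it is a design assumption of this sketch.
    static final int DEFAULT_ZOOM = 1;

    // Before: dereferences a @Nullable return value directly, which a
    // nullability checker would flag and which throws if the view is absent.
    static int zoomBefore(User user) {
        return user.getMapView().getZoom();
    }

    // After: the value is read once into a local and checked before use.
    static int zoomAfter(User user) {
        MapView view = user.getMapView();
        if (view == null) {
            return DEFAULT_ZOOM;
        }
        return view.getZoom();
    }

    public static void main(String[] args) {
        System.out.println(zoomAfter(new User(new MapView(7))));
        System.out.println(zoomAfter(new User(null)));
    }
}
```

Whether a guard with a default value is the right fix depends on the surrounding code: in some cases the error is a false positive and a different resolution is appropriate, which is why the abstract describes a decision process rather than a single patch template.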

Paper Structure

This paper contains 31 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Example of a nullability error and its resolution.
  • Figure 2: Code example to be enrolled into a nullability checker with a residual nullability error (line 12). Annotations in blue are added by an annotation inference tool like Annotator, and are part of NullRepair's input. NullRepair proposes the change that removes the red line and replaces it with the green lines (lines 13-16). Lines 23-26 show a safe usage pattern of user.getMapView().
  • Figure 3: Decision flowchart guiding NullRepair's patch synthesis. Blue nodes denote decision points, orange nodes indicate action steps, and green nodes are terminals where patches are generated.
  • Figure 4: Example of a false positive suppression.
  • Figure 5: Example of a prompt for patch generation, with the model response. Blue text marks program elements extracted from the codebase; the remaining text is part of the static prompt template.
  • ...and 3 more figures