Exploring Large Language Models in Resolving Environment-Related Crash Bugs: Localizing and Repairing

Xueying Du; Mingwei Liu; Hanlin Wang; Juntao Li; Xin Peng; Yiling Lou

TL;DR

This work systematically compares how well LLMs localize and repair code-related and environment-related crash bugs, using a benchmark derived from Stack Overflow. Localization is the main bottleneck for code-related crashes, whereas repair is the harder step for environment-related ones. Advanced prompt strategies, multi-round interactions, and active inquiry driven by LLM self-planning all substantially improve resolution, and the study combines them into IntDiagSolver, an interactive methodology. Across multiple LLMs and programming languages, IntDiagSolver consistently improves both localization (by 9.1% to 43.3%) and repair (by 9.1% to 53.3%), with notable gains on environment-related crashes. The study contributes a benchmark, an interactive methodology, and evidence that carefully designed prompts and proactive inquiry help developers diagnose and fix crashes.

Abstract

Software crash bugs cause unexpected program behavior or even abrupt termination, and therefore demand immediate resolution. Resolving them can be challenging, however, because their root causes are complex and can originate in the source code or in external factors such as third-party library dependencies. Large language models (LLMs) have shown promise in software engineering tasks, yet existing research predominantly focuses on their ability to localize and repair code-related crash bugs, leaving their effectiveness on environment-related crash bugs in real-world software unexplored. To fill this gap, we conduct the first comprehensive study of the capability of LLMs to resolve real-world environment-related crash bugs. We first systematically compare LLMs' performance in resolving code-related and environment-related crash bugs under varying levels of crash contextual information. Our findings reveal that localization is the primary challenge for code-related crashes, while repair poses the greater challenge for environment-related crashes. We then investigate how different prompt strategies, including different prompt templates and multi-round interactions, affect the resolution of environment-related crash bugs. Building on this, we explore an advanced active inquiry prompting strategy that leverages the self-planning capabilities of LLMs. Based on these explorations, we propose IntDiagSolver, an interactive methodology designed to enable precise crash bug resolution through ongoing engagement with LLMs. Extensive evaluations of IntDiagSolver across multiple LLMs (including GPT-3.5, GPT-4, Claude, CodeLlama, DeepSeek-R1, and Qwen-3-Coder) demonstrate consistent improvements in resolution accuracy, ranging from 9.1% to 43.3% in localization and from 9.1% to 53.3% in repair.
