Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies
Yilun Liu, Shimin Tao, Weibin Meng, Jingyu Wang, Wenbing Ma, Yanqing Zhao, Yuhang Chen, Hao Yang, Yanfei Jiang, Xun Chen
TL;DR
This paper tackles the dual challenges of online log analysis and interpretability by introducing LogPrompt, a prompt-strategy framework that leverages large language models without requiring in-domain training. By standardizing input/output formats and employing self-prompting, chain-of-thought prompts, and in-context demonstrations, LogPrompt achieves strong zero-shot performance on log parsing and anomaly detection across nine datasets, and provides useful, readable explanations validated by practitioners. The approach also demonstrates robustness with open-source and smaller-scale LLMs, suggesting practical deployment viability. Collectively, LogPrompt offers a scalable, interpretable solution for maintenance and operations tasks in diverse software systems, with public code to foster further adoption and research.
Abstract
Automated log analysis is crucial in modern software-intensive systems for facilitating program comprehension throughout software maintenance and engineering life cycles. Existing methods perform tasks such as log parsing and log anomaly detection by providing a single prediction value without interpretation. However, given the increasing volume of system events, the limited interpretability of analysis results hinders analysts' comprehension of program status and their ability to take appropriate actions. Moreover, these methods require substantial in-domain training data, and their performance declines sharply (by up to 62.5%) in online scenarios involving unseen logs from new domains, a common occurrence due to rapid software updates. In this paper, we propose LogPrompt, a novel interpretable log analysis approach for online scenarios. LogPrompt employs large language models (LLMs) to perform online log analysis tasks via a suite of advanced prompt strategies tailored for log tasks, which improves LLMs' performance by up to 380.7% compared with simple prompts. Experiments on nine publicly available evaluation datasets across two tasks demonstrate that LogPrompt, despite requiring no in-domain training, outperforms existing approaches trained on thousands of logs by up to 55.9%. We also conduct a human evaluation of LogPrompt's interpretability with six practitioners with over 10 years of experience, who rated the generated content highly in terms of usefulness and readability (4.42/5 on average). LogPrompt also exhibits remarkable compatibility with open-source and smaller-scale LLMs, making it flexible for practical deployment. The code for LogPrompt is available at https://github.com/lunyiliu/LogPrompt.
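
To make the idea of a standardized input/output format with a chain-of-thought instruction and in-context demonstrations more concrete, the following is a minimal sketch for the anomaly detection task. It is not the prompt text or code used by LogPrompt: the wording of the instructions, the `(index) label | explanation` answer format, and the helper names `build_prompt` and `parse_response` are all assumptions made for illustration, and the call to an actual LLM is deliberately left out.

```python
import re
from typing import Dict, List, Sequence, Tuple


def build_prompt(logs: List[str],
                 demos: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble a hypothetical anomaly-detection prompt.

    The instruction wording and answer format are illustrative
    assumptions, not the prompts used in the paper.
    """
    lines = [
        "Classify each log below as normal or abnormal.",
        # Chain-of-thought style instruction: ask for reasoning first.
        "Reason step by step, then give a one-sentence explanation.",
        # Standardized output format so the answer can be parsed.
        "Answer with one line per log, formatted exactly as:",
        "(index) label | explanation",
    ]
    if demos:
        # Optional in-context demonstrations: (log, label) pairs.
        lines.append("Examples of labeled logs:")
        lines += [f"- {log} => {label}" for log, label in demos]
    # Standardized input format: every log gets an explicit index.
    lines.append("Logs:")
    lines += [f"({i}) {log}" for i, log in enumerate(logs, start=1)]
    return "\n".join(lines)


# Matches lines such as "(1) abnormal | disk failure reported".
ANSWER_RE = re.compile(
    r"^\((\d+)\)\s*(normal|abnormal)\s*\|\s*(.+)$", re.IGNORECASE
)


def parse_response(text: str) -> Dict[int, Tuple[str, str]]:
    """Map each log index to (label, explanation).

    Lines that do not follow the expected format are skipped.
    """
    results: Dict[int, Tuple[str, str]] = {}
    for line in text.splitlines():
        match = ANSWER_RE.match(line.strip())
        if match:
            index = int(match.group(1))
            results[index] = (match.group(2).lower(), match.group(3).strip())
    return results
```

In such a setup, the string returned by `build_prompt` would be sent to whichever LLM is available, and `parse_response` would turn the reply into per-log labels with their explanations. Indexing the logs and fixing the answer format is what allows several logs to be handled in one request while keeping the explanation attached to the right log.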
