LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models

Aoxiao Zhong, Dengyao Mo, Guiyang Liu, Jinbu Liu, Qingda Lu, Qi Zhou, Jiesheng Wu, Quanzheng Li, Qingsong Wen

TL;DR

LogParser-LLM tackles scalable, accurate log parsing by combining a prefix parse tree with an LLM-based template extractor, giving it semantic awareness without labeled data or hyperparameter tuning. It introduces a Granularity Distance metric to quantify differences in parsing granularity and supports human-in-the-loop calibration so that users can tailor granularity to their needs. Evaluations on the Loghub-2k and large-scale LogPub benchmarks show gains over state-of-the-art parsers in both grouping and parsing accuracy (90.6% F1 grouping accuracy and 81.1% parsing accuracy on LogPub) while using few LLM calls (272.5 per dataset on average on LogPub). By pairing semantic understanding with syntactic efficiency, the approach suits online, scalable AIOps workloads.

Abstract

Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose significant challenges to existing automatic parsing techniques. The emergence of Large Language Models (LLMs) offers new horizons. With their expansive knowledge and contextual prowess, LLMs have been transformative across diverse applications. Building on this, we introduce LogParser-LLM, a novel log parser integrated with LLM capabilities. This union seamlessly blends semantic insights with statistical nuances, obviating the need for hyper-parameter tuning and labeled training data, while ensuring rapid adaptability through online parsing. Further deepening our exploration, we address the intricate challenge of parsing granularity, proposing a new metric and integrating human interactions to allow users to calibrate granularity to their specific needs. Our method's efficacy is empirically demonstrated through evaluations on the Loghub-2k and the large-scale LogPub benchmarks. In evaluations on the LogPub benchmark, involving an average of 3.6 million logs per dataset across 14 datasets, LogParser-LLM requires only 272.5 LLM invocations on average, achieving a 90.6% F1 score for grouping accuracy and 81.1% parsing accuracy. These results demonstrate the method's high efficiency and accuracy, outperforming current state-of-the-art log parsers, including pattern-based, neural network-based, and existing LLM-enhanced approaches.
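
To make the "F1 score for grouping accuracy" figure concrete, the sketch below computes a template-level grouping F1. It assumes the definition commonly used in the log parsing literature: a predicted group counts as correct only if it contains exactly the same log lines as some ground-truth group, precision is the share of predicted groups that are correct, and recall is the share of ground-truth groups that are recovered. The paper's exact definition may differ, and the function names here are hypothetical.

```python
from collections import defaultdict
from typing import FrozenSet, Sequence, Set


def _groups(labels: Sequence[str]) -> Set[FrozenSet[int]]:
    """Partition log indices by template label."""
    by_label = defaultdict(set)
    for idx, label in enumerate(labels):
        by_label[label].add(idx)
    return {frozenset(members) for members in by_label.values()}


def grouping_f1(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Template-level grouping F1 (assumed definition, see lead-in).

    predicted[i] and truth[i] are the template labels of the i-th log line.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    pred_groups = _groups(predicted)
    true_groups = _groups(truth)
    # A predicted group is correct when it matches a true group exactly.
    correct = len(pred_groups & true_groups)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_groups)
    recall = correct / len(true_groups)
    return 2 * precision * recall / (precision + recall)


# Toy example: the parser merged templates "B" and "C" into one group.
truth = ["A", "A", "B", "C"]
predicted = ["x", "x", "y", "y"]
score = grouping_f1(predicted, truth)
```

In the toy example one of two predicted groups is correct and one of three ground-truth groups is recovered, so over-merging is penalized on both precision and recall.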

Paper Structure (36 sections, 7 figures, 4 tables, 1 algorithm)

Figures (7)

  • Figure 1: An example of log parsing.
  • Figure 2: A demonstration of granularity variations in log parsing. Colors denote groups of templates. Applicability is represented on the vertical axis, while Specificity is represented on the horizontal axis.
  • Figure 3: An example demonstrating the data structures in our method: (a) A prefix parse tree with nodes linking to log clusters, (b) a Template Pool mapping log templates to log clusters, and (c) Log Clusters containing collections of logs with the same log template.
  • Figure 4: Variable-aware Prompt for Log Parsing
  • Figure 5: Prompt for fine-tuning
  • ...and 2 more figures
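
The three data structures named in the Figure 3 caption (prefix parse tree, Template Pool, Log Clusters) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: logs are split on whitespace, the wildcard `<*>` matches exactly one token, and `toy_extractor` is a hypothetical stand-in for the LLM-based template extractor. All class and function names are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

WILDCARD = "<*>"  # assumed placeholder for a variable token


@dataclass
class LogCluster:
    """Logs that share one template."""
    template: str
    logs: List[str] = field(default_factory=list)


@dataclass
class TreeNode:
    """Prefix-tree node; `cluster` is set where a template ends."""
    children: Dict[str, "TreeNode"] = field(default_factory=dict)
    cluster: Optional[LogCluster] = None


class PrefixTreeParser:
    def __init__(self, extract_template: Callable[[str], str]):
        self.root = TreeNode()
        self.template_pool: Dict[str, LogCluster] = {}  # template -> cluster
        self.extract_template = extract_template  # stand-in for the LLM
        self.extractor_calls = 0

    def _match(self, node: TreeNode, tokens: List[str], i: int) -> Optional[LogCluster]:
        if i == len(tokens):
            return node.cluster
        # Prefer an exact token match, then fall back to the wildcard branch.
        for key in (tokens[i], WILDCARD):
            child = node.children.get(key)
            if child is not None:
                found = self._match(child, tokens, i + 1)
                if found is not None:
                    return found
        return None

    def _insert(self, template: str) -> LogCluster:
        cluster = self.template_pool.setdefault(template, LogCluster(template))
        node = self.root
        for token in template.split():
            node = node.children.setdefault(token, TreeNode())
        node.cluster = cluster
        return cluster

    def parse(self, log: str) -> str:
        cluster = self._match(self.root, log.split(), 0)
        if cluster is None:
            # Tree miss: only now is the (expensive) extractor consulted.
            self.extractor_calls += 1
            cluster = self._insert(self.extract_template(log))
        cluster.logs.append(log)
        return cluster.template


def toy_extractor(log: str) -> str:
    """Hypothetical heuristic: any token containing a digit is a variable."""
    return " ".join(
        WILDCARD if any(ch.isdigit() for ch in tok) else tok for tok in log.split()
    )


parser = PrefixTreeParser(toy_extractor)
logs = [
    "Connected to 10.0.0.1",
    "Connected to 10.0.0.2",
    "Disk full on sda1",
    "Connected to 10.0.0.3",
]
templates = [parser.parse(line) for line in logs]
```

The point of the design is that the extractor is consulted only when the tree has no matching path, so the number of extractor calls tracks the number of distinct templates rather than the number of log lines. That is consistent with the low LLM invocation counts reported in the abstract, although the real system's matching and update rules are more involved than this sketch.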