LogLM: From Task-based to Instruction-based Automated Log Analysis
Yilun Liu, Yuhe Ji, Shimin Tao, Minggui He, Weibin Meng, Shenglin Zhang, Yongqian Sun, Yuming Xie, Boxing Chen, Hao Yang
TL;DR
LogLM reframes automated log analysis as instruction following rather than task-specific modeling. It unifies log parsing, anomaly detection, log interpretation, root cause analysis, and solution recommendation in a single instruction-tuned model trained on cross-task and cross-domain data. The model is built on an open-source foundation (LLaMA-2-7B) and outperforms numerous baselines across these five capabilities, with strong generalization to unseen and complex instructions. The methodology combines a two-tier capability design with an instruction dataset built from diverse sources, and the training objective maximizes the likelihood of the target responses given their instructions. Deployment in Huawei’s O&M platform and open data releases support real-world applicability and future extension to industrial log analysis.
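The training objective mentioned above can be written out as a short sketch. The notation is an assumption made here for illustration: \mathcal{D} is the instruction dataset, x an instruction (with its log input), y the target response, and \theta the model parameters. The token-level factorization is the usual autoregressive form for instruction tuning and is not confirmed by this summary.

```latex
\theta^{*} = \arg\max_{\theta} \sum_{(x, y) \in \mathcal{D}} \sum_{t=1}^{|y|} \log p_{\theta}\left(y_t \mid x, y_{<t}\right)
```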
Abstract
Automatic log analysis is essential for the efficient Operation and Maintenance (O&M) of software systems, providing critical insights into system behaviors. However, existing approaches mostly treat log analysis as training a model to perform an isolated task (e.g., anomaly detection or log parsing) using task-specific log-label pairs. These task-based approaches generalize poorly to complex scenarios, depend on task-specific training data, and incur significant cost when multiple models must be deployed. In this paper, we propose an instruction-based training approach that transforms log-label pairs from multiple tasks and domains into a unified format of instruction-response pairs. Our trained model, LogLM, can follow complex user instructions and generalizes better across different tasks, thereby increasing flexibility and reducing the dependence on task-specific training data. By integrating major log analysis tasks into a single model, our approach also reduces the model deployment burden. Experimentally, LogLM outperforms existing approaches across five log analysis capabilities and exhibits strong generalization on complex instructions and unseen tasks.
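The conversion from log-label pairs to instruction-response pairs could look roughly like the following minimal sketch. The template wording, the task names, the `InstructionPair` type, and the sample logs are all hypothetical illustrations, not the prompts or data used by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionPair:
    """One training example in the unified instruction-response format."""
    instruction: str
    response: str


# Hypothetical instruction templates, one per task (not taken from the paper).
TEMPLATES = {
    "anomaly_detection": (
        "Decide whether the following log is normal or abnormal.\nLog: {log}"
    ),
    "log_parsing": (
        "Extract the template of the following log, "
        "replacing variables with <*>.\nLog: {log}"
    ),
}


def to_instruction_pair(task: str, log: str, label: str) -> InstructionPair:
    """Wrap a task-specific (log, label) pair in that task's instruction template."""
    template = TEMPLATES[task]
    return InstructionPair(instruction=template.format(log=log), response=label)


def build_dataset(samples):
    """Merge (task, log, label) triples from several tasks into one dataset."""
    return [to_instruction_pair(task, log, label) for task, log, label in samples]


# Illustrative samples only; the logs and labels are made up.
samples = [
    ("anomaly_detection", "Connection refused from 10.0.0.5", "abnormal"),
    ("log_parsing", "Received block blk_123 of size 67108864",
     "Received block <*> of size <*>"),
]
dataset = build_dataset(samples)
```

Because every task ends up in the same (instruction, response) shape, a single model can be trained on the merged list, which is the property the abstract relies on to replace several task-specific models.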
