Semi-supervised learning via DQN for log anomaly detection

Yingying He; Xiaobing Pei

TL;DR

The paper tackles log anomaly detection under severe label scarcity and class imbalance by framing it as a semi-supervised reinforcement learning problem. It introduces DQNLog, which combines semantic log embedding (Drain parsing, session/window grouping, and RoBERTa-based embeddings) with a deep Q-network that uses an attention-based Bi-LSTM agent, a state transition function biased toward anomalies via cosine similarity, and a joint external-internal reward that draws on both labeled and unlabeled data. A regularized loss combines the TD-based MSE with a supervised term to retain prior knowledge, and the model is trained with a target network and experience replay. On three real-world datasets, DQNLog achieves higher F1 scores than traditional and semi-supervised baselines by exploiting labeled anomalies while exploring unlabeled ones. The approach offers a practical, scalable path for robust log anomaly detection in large-scale, imbalanced log streams.
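As a rough illustration of the regularized loss mentioned above, the sketch below adds a supervised penalty to the usual TD mean-squared error. The summary only says the loss blends a TD-based MSE with a supervised term, so everything else here is an assumption for illustration: the function name `regularized_dqn_loss`, the choice of cross-entropy over softmaxed Q-values as the supervised term, the weight `lam`, the discount `gamma`, and the convention that `-1` marks an unlabeled sample.

```python
import numpy as np


def regularized_dqn_loss(q_pred, q_next_target, actions, rewards, labels,
                         gamma=0.99, lam=1.0):
    """TD mean-squared error plus a supervised penalty on labeled samples.

    q_pred:        (batch, 2) Q-values of the online network for current states
    q_next_target: (batch, 2) Q-values of the target network for next states
    actions:       (batch,) actions taken, 0 = normal, 1 = anomaly
    rewards:       (batch,) combined external + internal reward
    labels:        (batch,) 1 = anomaly, 0 = normal, -1 = unlabeled (assumed)
    """
    idx = np.arange(len(actions))

    # Standard DQN target: r + gamma * max_a' Q_target(s', a').
    td_target = rewards + gamma * q_next_target.max(axis=1)
    td_loss = np.mean((td_target - q_pred[idx, actions]) ** 2)

    # Supervised term (assumed form): cross-entropy of softmax(Q) against
    # the known labels, computed only on the labeled part of the batch.
    labeled = labels >= 0
    if not labeled.any():
        return td_loss
    q_lab = q_pred[labeled]
    shifted = q_lab - q_lab.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(q_lab)), labels[labeled]])

    return td_loss + lam * ce
```

With `lam=0` this reduces to the plain TD loss; larger values pull the Q-values of labeled samples toward their known class, which is one way to read "retaining prior knowledge".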

Abstract

Log anomaly detection is a critical component of modern software system security and maintenance, underpinning system monitoring, operation, and troubleshooting. It helps operations personnel identify and resolve issues promptly. However, current log anomaly detection methods still face challenges such as underutilization of unlabeled data, imbalance between normal and anomalous data, and high rates of false positives and false negatives, which limit their effectiveness in recognizing anomalies. In this study, we propose a semi-supervised log anomaly detection method named DQNLog, which integrates deep reinforcement learning to enhance detection performance by leveraging a small amount of labeled data together with large-scale unlabeled data. To address imbalanced data and insufficient labeling, we design a state transition function that is biased towards anomalies based on cosine similarity, aiming to capture semantically similar anomalies rather than favoring the majority class. To strengthen the model's ability to learn anomalies, we devise a joint reward function that encourages the model to utilize labeled anomalies and explore unlabeled anomalies, thereby reducing false positives and false negatives. Additionally, to prevent the model from deviating from normal trajectories due to misestimation, we introduce a regularization term in the loss function so that the model retains prior knowledge during updates. We evaluate DQNLog on three widely used datasets, demonstrating that it effectively utilizes large-scale unlabeled data and achieves promising results across all experimental datasets.
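To make the anomaly-biased state transition more concrete, here is a minimal sketch of one way cosine similarity could steer the choice of the next sample. The abstract does not give the exact rule, so the rule below is an assumption: when the agent flags the current log sequence as anomalous, it moves to the most similar unlabeled sequence, otherwise to the least similar one, with an occasional random draw. The names `cosine_similarity`, `next_state_index`, and `explore_prob` are hypothetical.

```python
import numpy as np


def cosine_similarity(vec, matrix):
    """Cosine similarity between one vector and every row of a matrix."""
    vec_unit = vec / (np.linalg.norm(vec) + 1e-12)
    rows_unit = matrix / (np.linalg.norm(matrix, axis=1, keepdims=True) + 1e-12)
    return rows_unit @ vec_unit


def next_state_index(current, action, unlabeled, rng, explore_prob=0.5):
    """Pick the index of the next unlabeled embedding to visit.

    current:      (dim,) embedding of the current log sequence
    action:       1 if the agent flagged it as an anomaly, else 0
    unlabeled:    (n, dim) embeddings of the unlabeled pool
    rng:          numpy random Generator
    explore_prob: chance of a uniformly random jump (assumed parameter)
    """
    if rng.random() < explore_prob:
        return int(rng.integers(len(unlabeled)))
    sims = cosine_similarity(current, unlabeled)
    # Assumed bias: follow suspected anomalies to their nearest neighbour,
    # and move far away from samples judged normal.
    return int(np.argmax(sims)) if action == 1 else int(np.argmin(sims))


pool = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
rng = np.random.default_rng(0)
idx = next_state_index(np.array([1.0, 0.0]), action=1, unlabeled=pool,
                       rng=rng, explore_prob=0.0)
```

In this toy pool, flagging the current sequence as anomalous leads to the unlabeled embedding pointing in the same direction, so suspected anomalies keep the agent near semantically similar candidates instead of drifting back to the majority class.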
