TreeMIL: A Multi-instance Learning Framework for Time Series Anomaly Detection with Inexact Supervision
Chen Liu, Shibo He, Haoyu Liu, Shizhong Li
TL;DR
The paper tackles time series anomaly detection with inexact supervision by introducing TreeMIL, a multi-instance learning framework that represents the entire series as an N-ary tree to capture both point and collective anomalies. It combines a multi-resolution temporal embedding with attention-based feature extraction across tree nodes and a global anomaly discriminator to produce point-level predictions from bag-level labels. Empirical results on seven public datasets show TreeMIL significantly outperforms unsupervised and weakly supervised baselines, with notable gains in F1-D and IoU, highlighting its effectiveness in identifying collective anomaly patterns. The approach offers a practical, robust solution for TSAD where fine-grained labels are costly or noisy, and provides insights into multi-scale anomaly patterns via visualization and ablations.
Abstract
Time series anomaly detection (TSAD) plays a vital role in various domains such as healthcare, networks, and industry. Because labels are crucial for detection but difficult to obtain, we turn to TSAD with inexact supervision: only series-level labels are provided during the training phase, while point-level anomalies are predicted during the testing phase. Previous works follow a traditional multi-instance learning (MIL) approach, which focuses on encouraging high anomaly scores at individual time steps. However, time series anomalies are not limited to individual point anomalies; they can also be collective anomalies, which typically exhibit abnormal patterns over subsequences. To address the challenge of collective anomalies, we propose a tree-based MIL framework (TreeMIL). We first adopt an N-ary tree structure to divide the entire series into multiple nodes, where nodes at different levels represent subsequences of different lengths. Then, the subsequence features are extracted to determine the presence of collective anomalies. Finally, we calculate point-level anomaly scores by aggregating features from nodes at different levels. Experiments on seven public datasets against eight baselines demonstrate that TreeMIL achieves an average 32.3% improvement in F1-score compared to previous state-of-the-art methods. The code is available at https://github.com/fly-orange/TreeMIL.
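To make the tree-based division concrete, the following is a minimal sketch of how a series of a given length could be split into N-ary tree nodes and how per-node scores could be folded back into point-level scores. It is illustrative only and is not the TreeMIL implementation: the function names `build_nary_segments` and `point_scores`, the even split of each level, and the plain averaging of node scores are all assumptions made for this example, whereas the paper aggregates learned node features through attention and a discriminator.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open interval [start, end) of time steps


def build_nary_segments(length: int, n: int, depth: int) -> List[List[Span]]:
    """Return, for each tree level, the spans covered by that level's nodes.

    Level 0 holds the root (the whole series); level k holds n**k nodes.
    Assumption: every level splits the series into near-equal parts.
    """
    levels = []
    for level in range(depth + 1):
        num_nodes = n ** level
        # Integer boundaries keep the spans contiguous and non-overlapping.
        bounds = [length * i // num_nodes for i in range(num_nodes + 1)]
        levels.append([(bounds[i], bounds[i + 1]) for i in range(num_nodes)])
    return levels


def point_scores(
    levels: List[List[Span]], node_scores: List[List[float]], length: int
) -> List[float]:
    """Average, for each time step, the scores of all nodes that cover it.

    Assumption: simple averaging stands in for the learned aggregation.
    """
    totals = [0.0] * length
    counts = [0] * length
    for spans, scores in zip(levels, node_scores):
        for (start, end), score in zip(spans, scores):
            for t in range(start, end):
                totals[t] += score
                counts[t] += 1
    return [tot / cnt if cnt else 0.0 for tot, cnt in zip(totals, counts)]


# Hypothetical example: a binary tree (n=2) of depth 2 over 8 time steps.
levels = build_nary_segments(length=8, n=2, depth=2)
# Made-up node scores: the second half of the series looks anomalous.
node_scores = [[0.2], [0.1, 0.9], [0.0, 0.0, 1.0, 0.8]]
scores = point_scores(levels, node_scores, length=8)
```

In this sketch a time step receives a high score only when the subsequences containing it score high at several resolutions, which is the intuition behind detecting collective anomalies rather than isolated points.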
