Weakly Supervised Anomaly Detection: A Survey

Minqi Jiang, Chaochuan Hou, Ao Zheng, Xiyang Hu, Songqiao Han, Hailiang Huang, Xiangnan He, Philip S. Yu, Yue Zhao

TL;DR

This survey defines weakly supervised anomaly detection (WSAD) through three settings (incomplete, inexact, and inaccurate supervision) and reviews methods across tabular, graph, time-series, and image/video data. It organizes approaches into key algorithmic families (representation learning, anomaly scoring, multiple instance learning (MIL), label propagation, and ensembles) and highlights open questions for each setting, including label acquisition, few-shot learning, and cross-modality transfer. The work provides formal definitions, comparative analyses, and practical resources, including released code and datasets. Overall, WSAD is most developed for incomplete supervision, leaving substantial room to extend methods to inexact and inaccurate supervision and to broader data modalities. The paper emphasizes the potential of active learning, meta-learning, and SSL-based denoising to address label scarcity and noise in real-world AD tasks.

Abstract

Anomaly detection (AD) is a crucial task in machine learning with various applications, such as detecting emerging diseases, identifying financial fraud, and detecting fake news. However, complete, accurate, and precise labels for AD tasks are often hard to obtain because data annotation is expensive and difficult. To address this issue, researchers have developed AD methods that can work with incomplete, inexact, and inaccurate supervision, collectively referred to as weakly supervised anomaly detection (WSAD) methods. In this study, we present the first comprehensive survey of WSAD methods by categorizing them into the above three weak supervision settings across four data modalities (i.e., tabular, graph, time-series, and image/video data). For each setting, we provide formal definitions, key algorithms, and potential future directions. To support future research, we conduct experiments on a selected setting and release the source code, along with a collection of WSAD methods and data.
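
To make the three weak supervision settings concrete, the following minimal sketch derives each kind of weak label from a fully labeled toy sequence. It is an illustration only, not the survey's experimental code. It follows the common usage of the terms: incomplete means only a few instances are labeled, inexact means labels exist only for coarse groups, and inaccurate means some labels are wrong. The function name make_weak_labels and its parameters (labeled_frac, bag_size, flip_prob) are hypothetical, chosen for this example.

```python
import random


def make_weak_labels(y_true, labeled_frac=0.1, bag_size=4, flip_prob=0.2, seed=0):
    """Derive three kinds of weak labels from ground truth (0 = normal, 1 = anomaly).

    Illustrative assumption: this simulates weak supervision from clean labels;
    real WSAD data arrives already weakly labeled.
    """
    rng = random.Random(seed)
    n = len(y_true)

    # Incomplete supervision: only a small subset of instances keeps its label;
    # all other instances are unlabeled (None).
    n_labeled = max(1, int(labeled_frac * n))
    labeled_idx = set(rng.sample(range(n), n_labeled))
    incomplete = [y if i in labeled_idx else None for i, y in enumerate(y_true)]

    # Inexact supervision: labels exist only for coarse groups (bags). Under the
    # usual MIL assumption, a bag is anomalous if it contains at least one anomaly.
    inexact = [int(any(y_true[s:s + bag_size])) for s in range(0, n, bag_size)]

    # Inaccurate supervision: every instance is labeled, but each label is
    # flipped independently with probability flip_prob.
    inaccurate = [1 - y if rng.random() < flip_prob else y for y in y_true]

    return incomplete, inexact, inaccurate


y_true = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
incomplete, inexact, inaccurate = make_weak_labels(y_true)
print(inexact)  # [1, 0, 1]: the first and third bags each contain an anomaly
```

In this toy setup a detector would see only one of the three outputs: a mostly unlabeled list, three bag-level labels for twelve instances, or a fully labeled but partly corrupted list.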

Paper Structure (12 sections, 1 equation, 3 figures, 2 tables)


Figures (3)

  • Figure 1: Category of incomplete supervision.
  • Figure 2: Category of inexact supervision.
  • Figure 3: Category of inaccurate supervision.