KKA: Improving Vision Anomaly Detection through Anomaly-related Knowledge from Large Language Models
Dong Chen, Zhengqing Hu, Peiguang Fan, Yueting Zhuang, Yafei Li, Qidong Liu, Xiaoheng Jiang, Mingliang Xu
TL;DR
This work addresses the challenge of unsupervised vision anomaly detection by introducing Key Knowledge Augmentation (KKA), which extracts anomaly-related knowledge from large language models to generate plausible, boundary-shaping anomalies tailored to normal samples. By classifying generated anomalies into easy and hard via a confusion evaluator and iteratively enriching hard anomalies through LLM fine-tuning with Direct Preference Optimization, KKA guides detectors to learn a clearer boundary between normal and anomalous data with modest sample generation overhead. Across CIFAR-100, Oxford-102, and UCM-Caption, KKA consistently enhances AUC for multiple detectors, notably elevating SimpleNet AUC on CIFAR-100 from 74.62% to 84.04%. The method demonstrates strong generality, yielding improvements even when integrated with different anomaly detectors and settings, and offers practical benefits through reduced reliance on purely random anomaly generation.
Abstract
Vision anomaly detection, particularly in unsupervised settings, often struggles to distinguish between normal samples and anomalies due to the wide variability in anomalies. Recently, an increasing number of studies have focused on generating anomalies to help detectors learn more effective boundaries between normal samples and anomalies. However, as the generated anomalies are often derived from random factors, they frequently lack realism. Additionally, randomly generated anomalies typically offer limited support in constructing effective boundaries, as most differ substantially from normal samples and lie far from the boundary. To address these challenges, we propose Key Knowledge Augmentation (KKA), a method that extracts anomaly-related knowledge from large language models (LLMs). More specifically, KKA leverages the extensive prior knowledge of LLMs to generate meaningful anomalies based on normal samples. Then, KKA classifies the generated anomalies as easy anomalies or hard anomalies according to their similarity to normal samples. Easy anomalies exhibit significant differences from normal samples, whereas hard anomalies closely resemble normal samples. KKA iteratively updates the generated anomalies, gradually increasing the proportion of hard anomalies so that the detector learns a more effective boundary. Experimental results show that the proposed method significantly improves the performance of various vision anomaly detectors while maintaining low generation costs. The code for KKA can be found at https://github.com/Anfeather/KKA.
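
The easy/hard split described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes anomalies and normal samples are already represented as feature vectors, and it substitutes a plain cosine-similarity threshold for the confusion evaluator. The function names (`split_easy_hard`, `hard_fraction`), the threshold of 0.8, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_easy_hard(normal_feats, anomaly_feats, threshold=0.8):
    """Split generated anomalies by their similarity to normal samples.

    Assumption: an anomaly counts as "hard" when its highest cosine
    similarity to any normal sample reaches `threshold`; otherwise it is
    "easy". This stands in for the confusion evaluator, which may work
    differently. Returns two lists of indices into `anomaly_feats`.
    """
    easy, hard = [], []
    for idx, feat in enumerate(anomaly_feats):
        best = max(cosine(feat, n) for n in normal_feats)
        (hard if best >= threshold else easy).append(idx)
    return easy, hard


def hard_fraction(easy, hard):
    """Share of hard anomalies in the current generated set."""
    total = len(easy) + len(hard)
    return len(hard) / total if total else 0.0


# Toy 2-D features: the first anomaly sits close to the normal samples,
# the other two point in clearly different directions.
normal = [[1.0, 0.0], [0.9, 0.1]]
anomalies = [[0.95, 0.2], [0.0, 1.0], [-1.0, 0.0]]

easy, hard = split_easy_hard(normal, anomalies)
print("easy:", easy, "hard:", hard, "hard fraction:", round(hard_fraction(easy, hard), 3))
```

In an iterative scheme of the kind the abstract outlines, the hard fraction would be tracked across rounds of generation, with each round intended to raise it so that more training anomalies lie near the boundary.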
