COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection
Jingyi Liao, Xun Xu, Manh Cuong Nguyen, Adam Goodge, Chuan Sheng Foo
TL;DR
COFT-AD tackles few-shot anomaly detection by initializing from a pretrained backbone and adapting it to the target domain through contrastive fine-tuning, which mitigates the covariate shift between source and target data. It introduces a cross-instance positive pair loss that encourages a tight cluster of normal samples and an optional negative pair loss that separates normal samples from synthesized anomalies when prior knowledge of the anomalies is available. The two are combined with the contrastive loss into a single objective, $L_{all} = L_{Con} + \lambda_{PP} L_{PP} + \lambda_{NP} L_{NP}$. After fine-tuning, a density-based anomaly score is obtained by fitting a Gaussian to $N_A$ augmented, $L_2$-normalized embeddings and measuring the Mahalanobis distance $d_{AS}$, which allows anomaly detection from only a few normal examples. Experiments on 3 controlled and 4 real-world industrial datasets show competitive or state-of-the-art performance. Ablations confirm the benefit of contrastive adaptation and cross-instance positive pairs, while the benefit of negative pairs depends on how well the anomalies can be simulated. The approach offers a practical, data-efficient way to deploy anomaly detectors when clean data is limited and anomaly types vary.
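The density-based scoring step described above can be illustrated with a minimal NumPy sketch. It assumes the embeddings of the augmented normal samples have already been extracted by the fine-tuned backbone; the function names and the `ridge` regularizer (added so the covariance stays invertible with few samples) are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean norm."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def fit_gaussian(embeddings, ridge=1e-3):
    """Fit a Gaussian to L2-normalized embeddings of shape (N_A, D).

    Returns the mean and the inverse covariance. `ridge` is an assumed
    regularizer, not a value from the paper.
    """
    z = l2_normalize(embeddings)
    mu = z.mean(axis=0)
    centered = z - mu
    cov = centered.T @ centered / max(len(z) - 1, 1)
    cov += ridge * np.eye(z.shape[1])
    return mu, np.linalg.inv(cov)


def mahalanobis_score(embedding, mu, cov_inv):
    """Mahalanobis distance of one L2-normalized embedding to the Gaussian."""
    z = l2_normalize(embedding[None, :])[0]
    diff = z - mu
    return float(np.sqrt(diff @ cov_inv @ diff))


# Toy usage: synthetic "normal" embeddings cluster around one direction,
# while the "anomaly" points in an orthogonal direction.
rng = np.random.default_rng(0)
base = np.zeros(8)
base[0] = 1.0
train = base + 0.05 * rng.standard_normal((64, 8))
mu, cov_inv = fit_gaussian(train)

normal_test = base + 0.05 * rng.standard_normal(8)
anomaly = np.zeros(8)
anomaly[1] = 1.0

normal_score = mahalanobis_score(normal_test, mu, cov_inv)
anomaly_score = mahalanobis_score(anomaly, mu, cov_inv)
```

In this toy setup a higher score indicates a sample further from the normal cluster; a detection threshold would have to be chosen separately.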
Abstract
Existing approaches to anomaly detection (AD) often rely on a substantial amount of anomaly-free data to train representation and density models. However, large anomaly-free datasets may not always be available before the inference stage, in which case an anomaly detection model must be trained with only a handful of normal samples, a setting known as few-shot anomaly detection (FSAD). In this paper, we propose a novel methodology to address the challenge of FSAD which incorporates two important techniques. First, we employ a model pre-trained on a large source dataset to initialize the model weights. Second, to ameliorate the covariate shift between the source and target domains, we adopt contrastive training to fine-tune on the few-shot target domain data. To learn suitable representations for the downstream AD task, we additionally incorporate cross-instance positive pairs to encourage a tight cluster of the normal samples, and negative pairs for better separation between normal and synthesized negative samples. We evaluate few-shot anomaly detection on 3 controlled AD tasks and 4 real-world AD tasks to demonstrate the effectiveness of the proposed method.
