From Zero to Hero: Cold-Start Anomaly Detection

Tal Reiss, George Kour, Naama Zwerdling, Ateret Anaby-Tavor, Yedid Hoshen

TL;DR

The paper defines a cold-start anomaly detection setting in which a detector must operate with only zero-shot guidance and a short stream of potentially contaminated observations. It introduces ColdFusion, a domain-adaptive method that assigns each observation to its nearest zero-shot class descriptor and updates each class embedding to the median of the descriptor and its assigned observations, producing an adapted anomaly score $S_{adapt}(x)=\min_k d(\phi(x), z_k)$. The paper also proposes an evaluation suite built on the $\text{AUC}^2_{\tilde{t}}$ metric and the Banking77-OOS and CLINC-OOS datasets. On this suite, ColdFusion consistently outperforms pure zero-shot and DN2 baselines across contamination levels and encoders, including when only a small fraction of the data is available. The work highlights the practical impact for early deployment of OOS detection in chatbots and similar systems, and provides baselines and protocols to foster future research in this setting.
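The assign-then-adapt procedure and the score $S_{adapt}(x)=\min_k d(\phi(x), z_k)$ can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions not stated in this summary: embeddings $\phi(x)$ are precomputed, $d$ is taken to be Euclidean distance, the median is coordinate-wise, and the function names (`assign_to_nearest`, `adapt_class_embeddings`, `adapted_score`) are hypothetical.

```python
import numpy as np


def assign_to_nearest(obs, descriptors):
    """Return, for each observation, the index of its nearest class descriptor.

    obs: (n, d) array of precomputed observation embeddings phi(x).
    descriptors: (K, d) array of zero-shot class descriptor embeddings.
    Assumption: d(., .) is Euclidean distance.
    """
    obs = np.asarray(obs, dtype=float)
    descriptors = np.asarray(descriptors, dtype=float)
    # Pairwise distances, shape (n, K).
    dists = np.linalg.norm(obs[:, None, :] - descriptors[None, :, :], axis=-1)
    return dists.argmin(axis=1)


def adapt_class_embeddings(obs, descriptors):
    """Compute adapted class embeddings z_k.

    Each z_k is the coordinate-wise median (an assumption) of the class
    descriptor together with the observations assigned to that class.
    A class with no assigned observations keeps its original descriptor.
    """
    obs = np.asarray(obs, dtype=float)
    descriptors = np.asarray(descriptors, dtype=float)
    assignments = assign_to_nearest(obs, descriptors)
    adapted = np.empty_like(descriptors)
    for k in range(len(descriptors)):
        members = obs[assignments == k]
        stacked = np.vstack([descriptors[k][None, :], members])
        adapted[k] = np.median(stacked, axis=0)
    return adapted


def adapted_score(x, adapted):
    """Anomaly score S_adapt(x) = min_k d(phi(x), z_k); higher means more anomalous."""
    x = np.asarray(x, dtype=float)
    return float(np.linalg.norm(adapted - x, axis=1).min())
```

The median, rather than the mean, limits how far a few contaminating anomalies can pull a class embedding, which matches the contaminated-observation setting described above.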

Abstract

When first deploying an anomaly detection system, e.g., to detect out-of-scope queries in chatbots, there are no observed data, making data-driven approaches ineffective. Zero-shot anomaly detection methods offer a solution to such "cold-start" cases, but unfortunately they are often not accurate enough. This paper studies the realistic but underexplored cold-start setting where an anomaly detection model is initialized using zero-shot guidance, but subsequently receives a small number of contaminated observations (namely, that may include anomalies). The goal is to make efficient use of both the zero-shot guidance and the observations. We propose ColdFusion, a method that effectively adapts the zero-shot anomaly detector to contaminated observations. To support future development of this new setting, we propose an evaluation suite consisting of evaluation protocols and metrics.

Paper Structure (16 sections, 5 equations, 6 figures, 6 tables, 2 algorithms)

This paper contains 16 sections, 5 equations, 6 figures, 6 tables, 2 algorithms.

Figures (6)

  • Figure 1: ColdFusion assigns each of the $t$ observations to its nearest class, then adapts the embedding of each class towards its assigned observations.
  • Figure 2: Performance trends with contamination $r=5\%$ using the GTE model over time demonstrate the superiority of our ColdFusion method over other baseline approaches.
  • Figure 3: Performance trends with contamination $r=2.5\%$ using the MPNET model over time.
  • Figure 4: Performance trends with contamination $r=5\%$ using the MPNET model over time.
  • Figure 5: Performance trends with contamination $r=7.5\%$ using the MPNET model over time.
  • ...and 1 more figures