ICLAD: In-Context Learning for Unified Tabular Anomaly Detection Across Supervision Regimes

Jack Yi Wei, Narges Armanfard

Abstract

Anomaly detection on tabular data is commonly studied under three supervision regimes: one-class settings that assume access to anomaly-free training samples, fully unsupervised settings with unlabeled and potentially contaminated training data, and semi-supervised settings with limited anomaly labels. Existing deep learning approaches typically train a separate model for each dataset and assume a single supervision regime, which limits their ability to exploit structure shared across anomaly detection tasks and to adapt to different supervision levels. We propose ICLAD, an in-context learning foundation model for tabular anomaly detection that generalizes across both datasets and supervision regimes. ICLAD is trained via meta-learning on synthetic tabular anomaly detection tasks; at inference time, it assigns anomaly scores by conditioning on the training set, without updating model weights. Comprehensive experiments on 57 tabular datasets from ADBench show that our method achieves state-of-the-art performance across all three supervision regimes, establishing a unified framework for tabular anomaly detection.
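
To make the three supervision regimes concrete, the following is a minimal sketch of how a training set for each regime could be built from one labeled pool of samples. It is an illustration only, not ICLAD's task-construction procedure: the function name `make_regime_splits`, the label encoding (`-1` for unlabeled, `1` for a labeled anomaly), and the parameter `n_labeled_anomalies` are all assumptions introduced here.

```python
import numpy as np


def make_regime_splits(X, y, n_labeled_anomalies=5, seed=0):
    """Build one training set per supervision regime (hypothetical helper).

    X: (n_samples, n_features) array.
    y: (n_samples,) array with 0 = normal, 1 = anomaly (ground truth).
    Returns a dict mapping regime name -> (X_train, labels), where labels is
    None when the regime provides no labels.
    """
    rng = np.random.default_rng(seed)
    normal_idx = np.flatnonzero(y == 0)
    anomaly_idx = np.flatnonzero(y == 1)

    # One-class: the training set is assumed to be anomaly-free.
    one_class = (X[normal_idx], np.zeros(len(normal_idx), dtype=int))

    # Unsupervised: every sample is kept, anomalies included, with no labels.
    unsupervised = (X, None)

    # Semi-supervised: a few anomalies are labeled (1); all other samples
    # stay unlabeled (-1). This encoding is an assumption of the sketch.
    k = min(n_labeled_anomalies, len(anomaly_idx))
    labeled = rng.choice(anomaly_idx, size=k, replace=False)
    labels = np.full(len(y), -1, dtype=int)
    labels[labeled] = 1
    semi_supervised = (X, labels)

    return {
        "one_class": one_class,
        "unsupervised": unsupervised,
        "semi_supervised": semi_supervised,
    }


if __name__ == "__main__":
    # Toy pool: 20 samples, the first 4 of which are anomalies.
    X_demo = np.arange(40, dtype=float).reshape(20, 2)
    y_demo = np.zeros(20, dtype=int)
    y_demo[:4] = 1
    for name, (X_train, labels) in make_regime_splits(X_demo, y_demo, 2).items():
        n_labeled = 0 if labels is None else int((labels == 1).sum())
        print(name, X_train.shape, "labeled anomalies:", n_labeled)
```

In this sketch the regimes differ only in which samples and labels are exposed at training time, which is the property that allows one model to be conditioned on any of them.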

Paper Structure

This paper contains 57 sections, 9 equations, 17 figures, 6 tables, 1 algorithm.

Figures (17)

  • Figure 1: Each column corresponds to a training regime. The top row shows the training dataset, and the bottom row shows the test set and the anomaly-score landscape produced by ICLAD after conditioning on the training dataset. In the test set, orange diamonds denote ground-truth anomalies and dark circles denote ground-truth normal samples. Darker regions indicate higher anomaly scores.
  • Figure 2: Overview of the ICLAD framework and synthetic task construction. The top region depicts anomaly generation and the construction of supervision tasks. The bottom region shows the two-stage procedure of ICLAD. Left: prior-fitting on synthetic tasks. Right: inference on real datasets.
  • Figure 3: Feature interaction and t-SNE (van der Maaten and Hinton, 2008) plots of normal and anomalous samples. The orange rectangles are anomalies; the blue circles are normal samples.
  • Figure 4: Boxplots of AUC-ROC across 57 datasets. Boxes show the interquartile range (IQR), with medians indicated by the center line and whiskers extending to 1.5 times the IQR. Models are ordered by average AUC-ROC and color-coded by family: classical (green), deep learning (blue), and ICLAD (red).
  • Figure 5: Critical difference diagrams of average AUC-ROC ranks. Models are color-coded by family: classical (green), deep learning (blue), and ICLAD (red).
  • ...and 12 more figures