Towards One-for-All Anomaly Detection for Tabular Data

Shiyuan Li; Yixin Liu; Yu Zheng; Xiaofeng Cao; Shirui Pan; Heng Tao Shen

Abstract

Tabular anomaly detection (TAD) aims to identify samples that deviate from the majority in tabular data and is critical in many real-world applications. However, existing methods follow a "one model for one dataset" (OFO) paradigm, which relies on dataset-specific training and thus incurs high computational cost and generalizes poorly to unseen domains. To address these limitations, we propose OFA-TAD, a generalist one-for-all (OFA) TAD framework that requires only one-time training on multiple source datasets and generalizes on the fly to unseen datasets from diverse domains. To this end, OFA-TAD extracts neighbor-distance patterns as transferable cues and introduces multi-view neighbor-distance representations, computed in multiple transformation-induced metric spaces, to mitigate the transformation sensitivity of distance profiles. To adaptively combine the multi-view distance evidence, a Mixture-of-Experts (MoE) scoring network performs view-specific anomaly scoring and entropy-regularized gated fusion, and a multi-strategy anomaly synthesis mechanism supports training under the one-class constraint. Extensive experiments on 34 datasets from 14 domains demonstrate that OFA-TAD achieves superior anomaly detection performance and strong cross-domain generalizability under the strict OFA setting.
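To make the idea of multi-view neighbor-distance representations concrete, the following is a minimal sketch rather than the authors' implementation. It uses the three transformations named in the Figure 2 caption (raw, standardized, quantile) and, for each one, computes the sorted distances from a query sample to its k nearest neighbors in a context set of normal samples. The Euclidean metric, the choice of k, the normalization by the mean k-NN distance within the context, and all function names are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def raw_view(context):
    """Identity transformation (hypothetical helper)."""
    return lambda row: list(row)


def standardized_view(context):
    """Per-feature z-score using context statistics."""
    cols = list(zip(*context))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return lambda row: [(v - m) / s for v, m, s in zip(row, means, stds)]


def quantile_view(context):
    """Per-feature empirical CDF value in (0, 1], estimated from the context."""
    sorted_cols = [sorted(c) for c in zip(*context)]
    n = len(context)
    return lambda row: [bisect_right(c, v) / n
                        for v, c in zip(row, sorted_cols)]


def knn_distances(point, others, k):
    """Sorted Euclidean distances from point to its k nearest rows in others."""
    return sorted(math.dist(point, o) for o in others)[:k]


def distance_profile(query, context, k, make_view):
    """Normalized k-NN distance profile of query in one transformed space."""
    view = make_view(context)
    ctx = [view(row) for row in context]
    q = view(query)
    # Assumed normalization: mean leave-one-out k-NN distance in the context,
    # so that profiles are comparable across datasets with different scales.
    total = 0.0
    for i, c in enumerate(ctx):
        rest = ctx[:i] + ctx[i + 1:]
        total += sum(knn_distances(c, rest, k)) / k
    scale = total / len(ctx) or 1.0
    return [d / scale for d in knn_distances(q, ctx, k)]


def multi_view_profile(query, context, k=3):
    """One distance profile per view, keyed by view name."""
    views = {
        "raw": raw_view,
        "standardized": standardized_view,
        "quantile": quantile_view,
    }
    return {name: distance_profile(query, context, k, make_view)
            for name, make_view in views.items()}
```

Because each profile depends only on distances, not on feature names or dimensionality, the same downstream scorer can in principle consume profiles from any table. Note that in this sketch the quantile view maps any value above the context maximum to 1.0, so an extreme outlier can look unremarkable in that view alone, which is one illustration of why several views are combined.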

Paper Structure (27 sections, 10 equations, 10 figures, 9 tables, 2 algorithms)

Introduction
Preliminary
Methodology
Multi-View Distance Encoding
Mixture-of-Experts Scoring Network
Training with Multi-Strategy Anomaly Synthesis
Experiments
Experiments Setup
Performance Comparison
Ablation Studies
Robustness to Varying Context Size
Visualization
Scaling with #Source Datasets
Conclusions
Related Work
...and 12 more sections

Figures (10)

Figure 1: The difference between the OFO and OFA paradigms.
Figure 2: Distances vary across transformations, where R: Raw, S: Standardized, and Q: Quantile.
Figure 3: The pipeline of OFA-TAD. First, the multi-view distance encoding module extracts normalized neighbor-distance profiles from multiple transformation-induced metric spaces to obtain transferable representations. Then, the Mixture-of-Experts (MoE) scoring network leverages view-specific experts and an entropy-regularized gating mechanism to adaptively fuse distance-based anomaly evidence. A multi-strategy anomaly synthesis module is employed to generate diverse pseudo-anomalies for model training.
Figure 4: Average rank of different methods across 34 datasets.
Figure 5: Performance with varying context nodes.
...and 5 more figures
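
The gated fusion step described in the Figure 3 caption can be sketched as follows. This is an illustrative assumption, not the paper's network: each view-specific expert is taken to output an anomaly score in [0, 1], the gate is a softmax over per-view logits, and the entropy term is subtracted from a binary cross-entropy loss so that higher gate entropy is rewarded. The source does not state the sign or weight of the regularizer, so the direction and the coefficient `lam` are hypothetical, as are all function names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_entropy(weights):
    """Shannon entropy (in nats) of the gate distribution."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def fuse_scores(expert_scores, gate_logits):
    """Gate-weighted sum of per-view anomaly scores."""
    weights = softmax(gate_logits)
    fused = sum(w * s for w, s in zip(weights, expert_scores))
    return fused, weights


def bce(score, label, eps=1e-12):
    """Binary cross-entropy; label 1 marks a (pseudo-)anomaly, 0 a normal sample."""
    score = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))


def regularized_loss(expert_scores, gate_logits, label, lam=0.01):
    """Assumed objective: BCE on the fused score minus lam times gate entropy."""
    fused, weights = fuse_scores(expert_scores, gate_logits)
    return bce(fused, label) - lam * gate_entropy(weights)
```

With equal gate logits the fused score is the plain average of the expert scores and the entropy is at its maximum, log of the number of views. A gate that concentrates on one view lowers the entropy and, under the assumed sign, pays a small penalty for doing so.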
