Contextual Learning for Anomaly Detection in Tabular Data
Spencer King, Zhilu Zhang, Ruofan Yu, Baris Coskun, Wei Ding, Qian Cui
TL;DR
This work tackles unsupervised anomaly detection in heterogeneous tabular data by introducing contextual learning, which models conditional distributions $P(\mathbf{Y} \mid \mathbf{C})$ instead of a global joint distribution $P(\mathbf{X})$. It presents a probabilistic formulation grounded in variance decomposition and discriminative learning principles, a bilevel optimization strategy that selects context features automatically using early validation loss, and a practical conditional Wasserstein autoencoder (CWAE) that models context-conditioned content. Empirically, contextual modeling yields consistent gains across eight diverse datasets, often surpassing state-of-the-art unconditional methods and approaching an optimal-context upper bound. The framework offers scalable, context-aware anomaly detection with per-context thresholds, which supports robust performance in real-world, heterogeneous environments and lays groundwork for future multi-context and multimodal extensions.
Abstract
Anomaly detection is critical in domains such as cybersecurity and finance, especially when working with large-scale tabular data. Yet unsupervised anomaly detection, where no labeled anomalies are available, remains challenging because traditional deep learning methods model a single global distribution and thus assume that all samples follow the same behavior. In contrast, real-world data often contain heterogeneous contexts (e.g., different users, accounts, or devices), where globally rare events may be normal within a specific context. We introduce a contextual learning framework that explicitly models how normal behavior varies across contexts by learning conditional data distributions $P(\mathbf{Y} \mid \mathbf{C})$ rather than a global joint distribution $P(\mathbf{X})$. The framework encompasses (1) a probabilistic formulation for context-conditioned learning, (2) a principled bilevel optimization strategy for automatically selecting informative context features using early validation loss, and (3) theoretical grounding through variance decomposition and discriminative learning principles. We instantiate this framework using a novel conditional Wasserstein autoencoder as a simple yet effective model for tabular anomaly detection. Extensive experiments across eight benchmark datasets demonstrate that contextual learning consistently outperforms global approaches, even when the optimal context is not intuitively obvious, establishing a new foundation for anomaly detection in heterogeneous tabular data.
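The contrast between modeling $P(\mathbf{X})$ globally and modeling $P(\mathbf{Y} \mid \mathbf{C})$ per context can be illustrated with a small toy sketch. This is not the conditional Wasserstein autoencoder described above: it swaps in plain z-scores as a stand-in scoring model, and the data, the two contexts, and the function names `global_scores` and `contextual_scores` are all hypothetical. It also checks the standard law of total variance numerically, as one plausible reading of the variance-decomposition argument; the paper's own derivation may differ.

```python
import numpy as np


def global_scores(y):
    """Score each row by its absolute z-score under one global distribution."""
    return np.abs(y - y.mean()) / y.std()


def contextual_scores(y, c):
    """Score each row by its absolute z-score within its own context."""
    scores = np.empty_like(y, dtype=float)
    for ctx in np.unique(c):
        mask = c == ctx
        scores[mask] = np.abs(y[mask] - y[mask].mean()) / y[mask].std()
    return scores


rng = np.random.default_rng(0)

# Hypothetical data: context 0 is a common low-value group (950 rows),
# context 1 is a rare high-value group (50 rows).
c = np.array([0] * 950 + [1] * 50)
y = np.concatenate([rng.normal(10.0, 1.0, 950), rng.normal(100.0, 5.0, 50)])

# Inject one true anomaly: a context-0 row with a value far from its group,
# but still well below the normal range of context 1.
y[0] = 40.0

g = global_scores(y)
k = contextual_scores(y, c)

# Law of total variance with population variances:
# Var(Y) = E[Var(Y | C)] + Var(E[Y | C])
weights = {ctx: (c == ctx).mean() for ctx in np.unique(c)}
within = sum(w * y[c == ctx].var() for ctx, w in weights.items())
between = sum(w * (y[c == ctx].mean() - y.mean()) ** 2 for ctx, w in weights.items())

print("top global score at row:", int(np.argmax(g)))
print("top contextual score at row:", int(np.argmax(k)))
print("total variance matches within + between:", bool(np.isclose(y.var(), within + between)))
```

In this toy setup the global scorer ranks every normal row of the rare context above the injected anomaly, because most of the total variance is between-context variance. Conditioning on the context removes that term, so the injected row becomes the top-scoring sample.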
