ADNet: A Large-Scale and Extensible Multi-Domain Benchmark for Anomaly Detection Across 380 Real-World Categories

Hai Ling, Jia Guo, Zhulin Tao, Yunkang Cao, Donglin Di, Hongyan Xu, Xiu Su, Yang Song, Lei Fan

TL;DR

ADNet provides a large-scale, cross-domain anomaly-detection benchmark with 380 categories across five real-world domains and 196,294 RGB images, standardizing data into MVTec-style annotations and adding structured textual defect descriptions. The work reveals a substantial scalability gap for existing AD methods when moving from single-category to multi-category training across domains, with I-AUROC dropping from 90.6% to 78.5%, and introduces Dinomaly-m, a context-guided mixture-of-experts baseline that reaches 83.2% I-AUROC and 93.1% P-AUROC on ADNet. Zero-shot cross-domain transfer remains challenging, underscoring the need for more transferable and context-aware anomaly representations, while few-shot experiments indicate that multi-class models can be data-efficient. Overall, ADNet establishes a standardized, extensible platform for work toward scalable anomaly-detection foundation models across diverse domains.

Abstract

Anomaly detection (AD) aims to identify defects using normal-only training data. Existing anomaly detection benchmarks (e.g., MVTec-AD with 15 categories) cover only a narrow range of categories, limiting the evaluation of cross-context generalization and scalability. We introduce ADNet, a large-scale, multi-domain benchmark comprising 380 categories aggregated from 49 publicly available datasets across Electronics, Industry, Agrifood, Infrastructure, and Medical domains. The benchmark includes a total of 196,294 RGB images, consisting of 116,192 normal samples for training and 80,102 test images, of which 60,311 are anomalous. All images are standardized with MVTec-style pixel-level annotations and structured text descriptions spanning both spatial and visual attributes, enabling multimodal anomaly detection tasks. Extensive experiments reveal a clear scalability challenge: existing state-of-the-art methods achieve 90.6% I-AUROC in one-for-one settings but drop to 78.5% when scaling to all 380 categories in a multi-class setting. To address this, we propose Dinomaly-m, a context-guided Mixture-of-Experts extension of Dinomaly that expands decoder capacity without increasing inference cost. It achieves 83.2% I-AUROC and 93.1% P-AUROC, demonstrating superior performance over existing approaches. ADNet is designed as a standardized and extensible benchmark, supporting the community in expanding anomaly detection datasets across diverse domains and providing a scalable foundation for future anomaly detection foundation models. Dataset: https://grainnet.github.io/ADNet
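The split sizes and the image-level metric quoted in the abstract can be checked with a few lines of Python. This is a minimal sketch, not the authors' evaluation code: the constants are copied from the abstract, while the helper `image_auroc` and its rank-based (Mann-Whitney) formulation are illustrative assumptions about how an I-AUROC score is typically computed from per-image anomaly scores.

```python
# Counts quoted in the abstract.
TOTAL_IMAGES = 196_294
TRAIN_NORMAL = 116_192
TEST_IMAGES = 80_102
TEST_ANOMALOUS = 60_311

# The training and test splits should add up to the reported total.
assert TRAIN_NORMAL + TEST_IMAGES == TOTAL_IMAGES

# Normal test images are whatever remains after removing the anomalous ones.
test_normal = TEST_IMAGES - TEST_ANOMALOUS


def image_auroc(scores, labels):
    """Rank-based AUROC (hypothetical helper, not from the paper).

    Returns the probability that a randomly chosen anomalous image
    (label 1) receives a higher score than a randomly chosen normal
    image (label 0); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print("normal test images:", test_normal)
    # Toy scores: one anomalous image is ranked below one normal image.
    print("toy I-AUROC:", image_auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

The quadratic pairwise loop is chosen for readability; at the scale of the full test split a sort-based implementation would be the practical choice.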

Paper Structure

This paper contains 25 sections, 3 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Overview of ADNet, covering 380 categories across five application domains.
  • Figure 2: ADNet is constructed by consolidating 49 publicly available AD datasets. We validate their composition and quality, and hierarchically integrate all sources into five target domains: Electronics, Industry, Agrifood, Infrastructure, and Medical.
  • Figure 3: ADNet construction pipeline. We collect datasets from diverse sources, standardize formats, balance category-wise samples, and refine annotations with fine-grained defect type labels.
  • Figure 4: ADNet domain summary and text annotations. (a) Total image counts of the five domains with normal vs. anomaly proportions. (b) Top-5 defect types and a text-annotation example (image, mask, and JSON fields).
  • Figure 5: Illustration of the key challenges introduced by ADNet. (a) Category Catastrophe: existing methods exhibit significant performance degradation on ADNet. (b) Context-Dependent Anomalies: "spots" are normal on mint tablets but indicate contamination on a pill. (c) Can Anomalies Transfer? ADNet spans multiple domains, requiring models to learn transferable feature representations.
  • ...and 3 more figures
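
The "MVTec-style" standardization mentioned in the abstract and in the construction-pipeline caption can be pictured with a short path-building sketch. It follows the folder convention of the original MVTec AD release (`<category>/train/good`, `<category>/test/<defect_type>`, `<category>/ground_truth/<defect_type>` with `_mask` suffixes). Whether ADNet uses exactly these names is an assumption, and `mvtec_paths`, the `ADNet` root, and the example category are hypothetical.

```python
from pathlib import PurePosixPath


def mvtec_paths(root, category, split, defect_type, stem, ext=".png"):
    """Return (image_path, mask_path) under the MVTec AD folder convention.

    Hypothetical helper: normal images live in a 'good' folder and have
    no mask, so mask_path is None for them.
    """
    if split not in ("train", "test"):
        raise ValueError("split must be 'train' or 'test'")
    if split == "train" and defect_type != "good":
        raise ValueError("training data is normal-only")

    base = PurePosixPath(root) / category
    image_path = base / split / defect_type / f"{stem}{ext}"

    mask_path = None
    if defect_type != "good":
        # Pixel-level masks mirror the test folder under ground_truth/.
        mask_path = base / "ground_truth" / defect_type / f"{stem}_mask.png"
    return image_path, mask_path


if __name__ == "__main__":
    image, mask = mvtec_paths("ADNet", "bottle", "test", "broken_large", "000")
    print(image)
    print(mask)
```

Keeping every source dataset in one such layout is what would let a single loader iterate over all categories without per-dataset parsing code.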