Automated Neural Architecture Design for Industrial Defect Detection

Yuxi Liu, Yunfeng Ma, Yi Tang, Min Liu, Shuai Jiang, Yaonan Wang

TL;DR

AutoNAD advances industrial surface defect detection by automating neural architecture design through a unified hybrid search space that jointly optimizes convolution, transformer, and MLP operators. It introduces cross weight sharing to efficiently train heterogeneous subnets, a searchable multi-level feature aggregation module for robust multi-scale fusion, and a latency-aware prior to balance accuracy with deployment efficiency. The approach yields state-of-the-art mIoU and mF1 on three defect datasets and demonstrates practical viability via integration into an aero-engine blade inspection platform with edge-friendly latency. This work enables automated, deployment-ready NAS tailored to the challenging domain of industrial inspection, reducing manual design effort while delivering high-precision defect detection.

Abstract

Industrial surface defect detection (SDD) is critical for ensuring product quality and manufacturing reliability. Due to the diverse shapes and sizes of surface defects, SDD faces two main challenges: intraclass difference and interclass similarity. Existing methods primarily utilize manually designed models, which require extensive trial and error and often struggle to address both challenges effectively. To overcome this, we propose AutoNAD, an automated neural architecture design framework for SDD that jointly searches over convolutions, transformers, and multi-layer perceptrons. This hybrid design enables the model to capture both fine-grained local variations and long-range semantic context, addressing the two key challenges while reducing the cost of manual network design. To support efficient training of such a diverse search space, AutoNAD introduces a cross weight sharing strategy, which accelerates supernet convergence and improves subnet performance. Additionally, a searchable multi-level feature aggregation module (MFAM) is integrated to enhance multi-scale feature learning. Beyond detection accuracy, runtime efficiency is essential for industrial deployment. To this end, AutoNAD incorporates a latency-aware prior to guide the selection of efficient architectures. The effectiveness of AutoNAD is validated on three industrial defect datasets and further applied within a defect imaging and detection platform. Code is available at https://github.com/Yuxi104/AutoNAD.
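As a rough illustration of how a latency-aware prior could steer architecture sampling, the sketch below turns per-operator average latencies into sampling probabilities, so that faster operators are drawn more often at each block. The softmax-over-negative-latency mapping, the `temperature` parameter, the function names, and all latency numbers are assumptions made for this example; the abstract does not specify how AutoNAD builds or applies its prior.

```python
import math
import random


def latency_prior(avg_latency_ms, temperature=1.0):
    """Map average latencies (ms) to sampling probabilities.

    Assumption for illustration: a softmax over negative latency, so a
    lower latency yields a higher probability.
    """
    scores = {op: -lat / temperature for op, lat in avg_latency_ms.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {op: math.exp(s - top) for op, s in scores.items()}
    total = sum(weights.values())
    return {op: w / total for op, w in weights.items()}


def sample_architecture(latency_table, rng, temperature=1.0):
    """Draw one operator per block location according to the prior."""
    arch = []
    for block_latencies in latency_table:
        probs = latency_prior(block_latencies, temperature)
        ops = list(probs)
        choice = rng.choices(ops, weights=[probs[o] for o in ops], k=1)[0]
        arch.append(choice)
    return arch


# Hypothetical runtime statistics: one dict per block location.
latency_table = [
    {"conv": 1.2, "transformer": 3.5, "mlp": 2.0},
    {"conv": 1.5, "transformer": 2.8, "mlp": 1.9},
]

prior = [latency_prior(block) for block in latency_table]
arch = sample_architecture(latency_table, random.Random(0))
print(prior)
print(arch)
```

A larger `temperature` flattens the distribution toward uniform sampling, while a smaller one concentrates sampling on the fastest operator at each block.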

Paper Structure

This paper contains 31 sections, 22 equations, 9 figures, 8 tables, 1 algorithm.

Figures (9)

  • Figure 1: Challenges of surface defect detection. (a) Intraclass difference. The shapes of crack defects vary significantly. (b) Interclass similarity. Blowhole and break defects exhibit considerable similarities in pattern.
  • Figure 2: The automated defect imaging and detection platform. It consists of two main parts: imaging and detection.
  • Figure 3: The supernet architecture of AutoNAD. The architecture is divided into two main parts: the backbone and the multi-level feature aggregation module (MFAM). Each block's channel dimension is determined dynamically as part of the search process. In the visualization, each block is depicted as a horizontal bar whose total length represents the maximum channel width. The solid portion corresponds to the selected active channels, while the dashed portion indicates the unselected ones. The depth of each block is also dynamic. Refer to Table \ref{search space} and Sec. \ref{mlfg search space} for more details about the search space.
  • Figure 4: (a) Classical weight sharing. (b) Cross weight sharing for different types of operators. Each block is depicted as a horizontal bar, with its total length representing the maximum channel width. The solid portion corresponds to the selected active channels, while the dashed portion indicates the unselected ones.
  • Figure 5: Latency-aware prior constructed from runtime statistics during supernet training. Each cell corresponds to a specific operator at a specific block location, with warmer colors indicating lower average latency and higher sampling probability.
  • ...and 4 more figures
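
The captions of Figures 3 and 4 describe each block as a bar of maximum channel width from which only the solid portion is active. A minimal sketch of that idea, under the assumption of classical weight sharing (Figure 4a) on a single linear layer, is to store one full-width weight matrix and let a narrower subnet use its leading slice. The helper names and toy values here are hypothetical, and this does not implement the cross weight sharing between operator types shown in Figure 4b, whose details are not given in this section.

```python
def slice_weight(super_weight, out_channels, in_channels):
    """Return the leading sub-matrix of the supernet weight.

    The subnet reuses the first `out_channels` rows and the first
    `in_channels` columns, so every width shares the same parameters.
    """
    if out_channels > len(super_weight) or in_channels > len(super_weight[0]):
        raise ValueError("requested width exceeds the supernet maximum")
    return [row[:in_channels] for row in super_weight[:out_channels]]


def linear(weight, x):
    """Apply a bias-free linear layer: y = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


# Toy 4x4 supernet weight (maximum channel width 4 in and 4 out).
super_weight = [[float(4 * r + c) for c in range(4)] for r in range(4)]

# A sampled subnet with 3 input channels and 2 output channels.
sub_weight = slice_weight(super_weight, out_channels=2, in_channels=3)
y = linear(sub_weight, [1.0, 1.0, 1.0])
print(sub_weight)  # [[0.0, 1.0, 2.0], [4.0, 5.0, 6.0]]
print(y)           # [3.0, 15.0]
```

Because every sampled width reads from the same stored matrix, training one subnet also updates the parameters that other subnets will reuse.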