ContextMix: A context-aware data augmentation method for industrial visual inspection systems

Hyungmin Kim, Donghun Kim, Pyunghwan Ahn, Sungho Suh, Hansang Cho, Junmo Kim

TL;DR

ContextMix introduces a context-aware data augmentation that pastes a resized full image into another image within a batch, enabling simultaneous learning of object and contextual information with minimal computational cost. It outperforms standard regional dropout methods on benchmark datasets and yields notable gains on an imbalanced industrial MLCC dataset, including improvements in macro F1 and competitive Top-1 accuracy. The method demonstrates strong transferability to detection and segmentation tasks and improves robustness and calibration under adversarial and challenging data, highlighting practical value for industrial inspection systems. Overall, ContextMix offers a simple yet effective augmentation strategy that enhances performance and robustness in real-world manufacturing environments while maintaining compatibility with existing regularization techniques.

Abstract

While deep neural networks have achieved remarkable performance, data augmentation has emerged as a crucial strategy to mitigate overfitting and enhance network performance. These techniques hold particular significance in industrial manufacturing contexts. Recently, image mixing-based methods have been introduced, exhibiting improved performance on public benchmark datasets. However, their application to industrial tasks remains challenging. The manufacturing environment generates massive amounts of unlabeled data on a daily basis, with only a few instances of abnormal data occurrences. This leads to severe data imbalance. Thus, creating well-balanced datasets is not straightforward due to the high costs associated with labeling. Nonetheless, this is a crucial step for enhancing productivity. For this reason, we introduce ContextMix, a method tailored for industrial applications and benchmark datasets. ContextMix generates novel data by resizing entire images and integrating them into other images within the batch. This approach enables our method to learn discriminative features based on varying sizes from resized images and train informative secondary features for object recognition using occluded images. With the minimal additional computation cost of image resizing, ContextMix enhances performance compared to existing augmentation techniques. We evaluate its effectiveness across classification, detection, and segmentation tasks using various network architectures on public benchmark datasets. Our proposed method demonstrates improved results across a range of robustness tasks. Its efficacy in real industrial environments is particularly noteworthy, as demonstrated using the passive component dataset.
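The mixing step described in the abstract (resize a whole image, then paste it into another image from the same batch) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `resize_nearest` and `contextmix_batch` are hypothetical, the paste box is sampled the way CutMix samples its box, and the labels are mixed by pasted-area ratio; neither of the last two choices is spelled out in the text above. Nearest-neighbour resizing stands in for whatever resize operation a real pipeline would use. The value `eps` corresponds to the resize rate $\epsilon$ defined in the Figure 3 caption.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array (stand-in for any resize op)."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols[None, :]]


def contextmix_batch(images, labels, rng, alpha=1.0):
    """Paste a resized full image from a shuffled batch partner into each image.

    images: (B, H, W, C) float array; labels: (B, K) one-hot array.
    Assumptions (not from the source text): CutMix-style box sampling with
    lam ~ Beta(alpha, alpha), and label mixing weighted by the pasted area.
    """
    b, H, W = images.shape[:3]
    lam = rng.beta(alpha, alpha)
    cut = np.sqrt(1.0 - lam)
    h = max(1, int(H * cut))  # paste-box height (also the resized image height)
    w = max(1, int(W * cut))  # paste-box width (also the resized image width)
    top = rng.integers(0, H - h + 1)
    left = rng.integers(0, W - w + 1)

    perm = rng.permutation(b)  # partner index for every image in the batch
    mixed = images.copy()
    for i in range(b):
        # The whole partner image is shrunk to the box size, so its full
        # context survives, unlike a crop taken from the partner.
        mixed[i, top:top + h, left:left + w] = resize_nearest(images[perm[i]], h, w)

    eps = (h * w) / (H * W)  # resize rate: pasted area over original area
    mixed_labels = (1.0 - eps) * labels + eps * labels[perm]
    return mixed, mixed_labels, eps


rng = np.random.default_rng(0)
images = rng.random((4, 32, 32, 3))
labels = np.eye(4)
mixed, mixed_labels, eps = contextmix_batch(images, labels, rng)
```

Because one box is shared across the batch, the only extra work beyond a CutMix-style paste is the resize, which is consistent with the abstract's remark that the added cost is that of image resizing.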

Paper Structure (24 sections, 4 equations, 11 figures, 10 tables, 1 algorithm)


Figures (11)

  • Figure 1: Machines and images used by Samsung Electro-Mechanics for vision inspection, as reported in suh2017automatic.
  • Figure 2: Overview of CutMix yun2019cutmix, PuzzleMix kim2020puzzle, and ContextMix.
  • Figure 3: Resize ratio: $W$ and $H$ are the original image size, and $w$ and $h$ are the cropped size. The image $x'$ is resized to $W' \times H'$. The resize rate $\epsilon$ is defined as $(W' \times H')/(W \times H)$. For ContextMix, $W'$ is set to $w$ and $H'$ is set to $h$.
  • Figure 4: Overview of the component visual inspection system. The image dataset intrinsically follows a long-tailed distribution. Given this, ContextMix generates new image data from existing labeled data inexpensively and efficiently. A model trained on the pre-processed and augmented data then infers defective objects on each surface from the acquired images in real time.
  • Figure 5: Industrial dataset from Samsung Electro-Mechanics (SEMCO): (a) Normal MLCC images. (b) The imbalanced class distribution of the dataset. Class 1 denotes normal MLCC products.
  • ...and 6 more figures