A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection

Alejandro Garnung Menéndez

TL;DR

This work tackles robust defect detection in highly noisy industrial imagery: it distinguishes surface defects from noise with a hybrid framework that combines a broad set of handcrafted features with statistical feature selection. ROI scores are generated in two stages: (i) extensive feature extraction across spatial, texture, distributional, and spectral domains, and (ii) statistical feature selection (Fisher criterion, KS test, t-test, Bhattacharyya distance) that retains only the discriminative features, which may then be aggregated with simple scoring or a random forest. A diverse toolbox (GMMs, patch-based tests, IQR-based outlier detection, CC analysis, Gabor/LBP/HOG/Homomorphic/GLE textures, and median-histogram modeling) supports the separation of true positives (TP) from false positives (FP), with the aim of minimizing false positives in challenging, noisy environments. The framework is designed to function either as a black-box module on top of existing classifiers or as a standalone assessment unit, and it targets real-time use and flexible integration with industrial inspection pipelines.
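As a rough illustration of the selection stage, the sketch below ranks features by the textbook two-class Fisher criterion, (mean gap)^2 / (sum of class variances). The paper's exact formulation may differ; the function names, feature names, and toy values here are hypothetical and chosen only for the example.

```python
from statistics import mean, pvariance


def fisher_score(defect_values, noise_values):
    """Textbook Fisher criterion for one feature:
    squared gap between class means over the sum of class variances."""
    gap = mean(defect_values) - mean(noise_values)
    spread = pvariance(defect_values) + pvariance(noise_values)
    if spread == 0:
        # Degenerate case: both classes are constant for this feature.
        return float("inf") if gap != 0 else 0.0
    return gap * gap / spread


def rank_features(defect_rows, noise_rows, names):
    """Return (name, score) pairs sorted from most to least discriminative."""
    scores = []
    for j, name in enumerate(names):
        defect_col = [row[j] for row in defect_rows]
        noise_col = [row[j] for row in noise_rows]
        scores.append((name, fisher_score(defect_col, noise_col)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one row per ROI, one column per feature.
names = ["contrast", "entropy"]
defect_rows = [[0.9, 0.5], [1.1, 0.4], [1.0, 0.6]]
noise_rows = [[0.1, 0.5], [0.2, 0.6], [0.0, 0.4]]

ranking = rank_features(defect_rows, noise_rows, names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data the "contrast" column separates the two classes well, while "entropy" has identical class means and therefore scores zero; a selection step would keep the former and drop the latter.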

Abstract

In industrial imaging, accurately detecting surface defects and distinguishing them from noise is both critical and challenging, particularly in complex environments with noisy data. This paper presents a hybrid framework that integrates statistical feature selection and classification techniques to improve defect detection accuracy while minimizing false positives. The system is built around scalar scores that represent the likelihood that a region of interest (ROI) is a defect or noise. We present around 55 distinct features extracted from industrial images, which are then analyzed using statistical methods such as Fisher separation, chi-squared test, and variance analysis. These techniques identify the most discriminative features, focusing on maximizing the separation between true defects and noise. Fisher's criterion ensures robust, real-time performance for automated systems. This statistical framework opens up multiple avenues for application: it can function as a standalone assessment module or as an a posteriori enhancement to machine learning classifiers. It can be implemented as a black-box module applied to existing classifiers, providing an adaptable layer of quality control and optimizing predictions through intuitive feature extraction strategies, with emphasis on the rationale behind feature significance and the statistical rigor of feature selection. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications, especially in complex, noisy environments.
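To make the idea of a scalar ROI score concrete, here is one very simple aggregation, given as a sketch and not as the paper's scoring rule. Each selected feature votes "defect" when the ROI's value lies closer to the defect-class mean than to the noise-class mean, and votes are weighted by a per-feature separability weight. The function name `roi_score`, the class means, and the weights are all assumptions made up for this illustration.

```python
def roi_score(features, defect_means, noise_means, weights):
    """Weighted fraction of features whose value lies closer to the
    defect-class mean than to the noise-class mean (range 0.0 to 1.0)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    votes = 0.0
    for name, weight in weights.items():
        value = features[name]
        to_defect = abs(value - defect_means[name])
        to_noise = abs(value - noise_means[name])
        if to_defect < to_noise:
            votes += weight  # this feature votes "defect"
    return votes / total


# Hypothetical class statistics and weights for two selected features.
defect_means = {"contrast": 1.0, "entropy": 0.7}
noise_means = {"contrast": 0.1, "entropy": 0.3}
weights = {"contrast": 3.0, "entropy": 1.0}

# One candidate ROI: contrast looks defect-like, entropy looks noise-like.
roi = {"contrast": 0.8, "entropy": 0.35}
score = roi_score(roi, defect_means, noise_means, weights)
print(score)  # 0.75
```

A score near 1 suggests a defect and a score near 0 suggests noise; a downstream system could threshold this value or feed it to an existing classifier as an extra input. A trained model such as a random forest would replace this hand-built vote when labeled data is available.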

Paper Structure

This paper contains 34 sections, 106 equations, 17 figures.

Figures (17)

  • Figure 1: a, c) Luminance images of scanned parts. b, d) Corresponding distance images. These images are highly noisy, making it impossible to visually discern from this data whether they are defective or not.
  • Figure 2: a-f) All images are defective but were classified as noisy by a classification model.
  • Figure 3: a-f) All images are noisy but were classified as defective by a classification model.
  • Figure 4: Coarse-to-fine patch-based analysis of IQR neighborhoods in candidate. a-d) Heatmaps highlighting most probable noisy regions. e-h) Metric value computation per patch.
  • Figure 5: Two GMMs fitted to FP and TP median histograms.
  • ...and 12 more figures