Vortex Feature Positioning: Bridging Tabular IIoT Data and Image-Based Deep Learning

Jong-Ik Park, Sihoon Seong, JunKyu Lee, Cheol-Ho Hong

TL;DR

VFP is a correlation-driven, vortex-based method for converting high-dimensional IIoT tabular data into images suited to CNNs, addressing the overfitting and computational inefficiency of fixed-size image representations. It embeds features with the convolution operation in mind and arranges them in a center-out vortex ordered by the Pearson correlation coefficient (PCC), so the image size scales with the attribute count and generalization improves. A theoretical analysis links this structure to favorable optimization properties, including a convergence rate for SGD on VFP-generated data. In an empirical evaluation on seven datasets, VFP outperforms traditional tree-based methods and existing image-conversion approaches, which supports its practical use for scalable IIoT analytics.

Abstract

Tabular data from Industrial Internet of Things (IIoT) devices are typically analyzed using decision tree-based machine learning techniques, which struggle with high-dimensional and numeric data. To overcome these limitations, techniques that convert tabular data into images have been developed, leveraging the strengths of image-based deep learning approaches such as Convolutional Neural Networks (CNNs). These methods cluster similar features into distinct image areas and use a fixed image size regardless of the number of features, so that the result resembles an actual photograph. However, clustering similar features increases the risk of overfitting; careful feature selection on tabular data often discards such similar features precisely to prevent it. Additionally, a fixed image size wastes pixels when there are few features, resulting in computational inefficiency. We introduce Vortex Feature Positioning (VFP) to address these issues. VFP arranges features based on their correlation, spacing similar ones in a vortex pattern from the image center, with the image size determined by the attribute count. VFP outperforms traditional machine learning methods and existing conversion techniques in tests across seven datasets with varying numbers of real-valued attributes.
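
The placement idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation, and several details are assumptions made only for illustration: features are ranked by their mean absolute Pearson correlation with all other features, the most correlated feature is placed at the center, the image side is the smallest square that holds all attributes, and zero padding and distancing between features are omitted. The function names `pearson`, `spiral_cells`, and `vfp_sketch` are hypothetical. The three-channel copy follows the caption of Figure 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def spiral_cells(side):
    """Yield (row, col) cells of a side x side grid, spiralling out from the center."""
    r = c = (side - 1) // 2
    yield r, c
    dr, dc = 0, 1  # start by moving right
    step, emitted = 1, 1
    while emitted < side * side:
        for _ in range(2):  # two legs share each step length
            for _ in range(step):
                r, c = r + dr, c + dc
                if 0 <= r < side and 0 <= c < side:
                    yield r, c
                    emitted += 1
            dr, dc = dc, -dr  # turn 90 degrees (right, down, left, up)
        step += 1


def vfp_sketch(rows):
    """Map each sample (a list of numeric features) to a 3-channel square image."""
    n_feat = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_feat)]
    # ASSUMPTION: rank features by mean absolute PCC against all other features.
    score = [
        sum(abs(pearson(cols[i], cols[j])) for j in range(n_feat) if j != i)
        / max(n_feat - 1, 1)
        for i in range(n_feat)
    ]
    order = sorted(range(n_feat), key=lambda i: -score[i])
    # ASSUMPTION: smallest square that holds every attribute, no padding.
    side = math.ceil(math.sqrt(n_feat))
    cells = list(spiral_cells(side))[:n_feat]
    images = []
    for row in rows:
        img = [[0.0] * side for _ in range(side)]
        for feat, (r, c) in zip(order, cells):
            img[r][c] = row[feat]
        # Copy the 2-D matrix into three channels, as in the Figure 2 caption.
        images.append([[line[:] for line in img] for _ in range(3)])
    return images, order


if __name__ == "__main__":
    data = [[1.0, 2.0, 0.5, 9.0], [2.0, 4.1, 0.4, 7.0], [3.0, 6.2, 0.9, 8.0]]
    imgs, order = vfp_sketch(data)
    print(order, imgs[0][0])
```

Because the side length is derived from the number of attributes, a four-feature table yields a 2 x 2 image and a nine-feature table a 3 x 3 image, rather than a fixed-size canvas with unused pixels.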

Paper Structure (21 sections, 19 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: Three cases of feature positioning, considering the number of features per convolution operation. From the top: zero padding of size 1, zero padding of size 2, and distancing.
  • Figure 2: Vortex Feature Positioning and forming a 3-channel image by copying a 2-D matrix.
  • Figure 3: Converted images of IGTD and VFP (with distancing) using Iris and SECOM datasets.
  • Figure 4: The lines in part (a) correspond to the test accuracies presented in Table \ref{table:test_result_vfp_only}, while the lines in part (b) correspond to the test accuracies in Table \ref{table:test_result_vfp_ml_others}. VFP consistently shows better test performance than the other methods.