D-CNN and VQ-VAE Autoencoders for Compression and Denoising of Industrial X-ray Computed Tomography Images

Bardia Hejazi, Keerthana Chand, Tobias Fritsch, Giovanni Bruno

TL;DR

This work addresses the data-storage challenge in industrial XCT by comparing two deep autoencoder approaches, a D-CNN and a VQ-VAE, across multiple compression rates for sandstone XCT data. It introduces an edge-sensitive metric, MSLE, to better quantify fine-feature preservation beyond traditional MSE/PSNR. Findings show both models preserve overall structure and porosity at moderate compression, but the VQ-VAE handles higher compression more robustly, especially in preserving edges, while the D-CNN can fail at the largest compression. The results guide practitioners in selecting compression schemes based on analysis needs and highlight MSLE as a valuable tool for evaluating edge- and feature-preservation in 3D XCT data.

Abstract

The ever-growing volume of data in the imaging sciences, driven by advances in imaging technologies, necessitates efficient and reliable storage solutions for such large datasets. This study investigates the compression of industrial X-ray computed tomography (XCT) data using deep learning autoencoders and examines how these compression algorithms affect the quality of the recovered data. Two network architectures, each with different compression rates, were used: a deep convolutional neural network (D-CNN) and a vector quantized variational autoencoder (VQ-VAE). The XCT data came from a sandstone sample with a complex internal pore network. The quality of the decoded images obtained from the two architectures at different compression rates was quantified and compared to the original input data. In addition, to improve image decoding quality metrics, we introduced a metric sensitive to edge preservation, which is crucial for three-dimensional data analysis. We showed that different architectures and compression rates are required depending on the specific characteristics that need to be preserved for later analysis. The findings presented here can help scientists determine the requirements and strategies for their data storage and analysis needs.
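
The TL;DR names MSE and PSNR as the traditional metrics for comparing decoded images with the original input. As a minimal sketch of how these two are commonly computed, the following assumes 8-bit images flattened into plain sequences of pixel values with a peak value of 255. The function names `mse` and `psnr` are illustrative and are not taken from the paper, and the paper's edge-sensitive MSLE metric is not reproduced here because its definition does not appear in this summary.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit data."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / err)


# Toy example with four pixels (hypothetical values, not XCT data).
original = [0, 128, 255, 64]
decoded = [0, 130, 253, 64]
print("MSE:", mse(original, decoded))
print("PSNR (dB):", psnr(original, decoded))
```

Both metrics average over all pixels, so a small number of badly reconstructed edge pixels can have little effect on the score, which is consistent with the motivation given for an additional edge-sensitive metric.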

Paper Structure

This paper contains 3 sections, 3 equations, 7 figures, 4 tables.

Table of Contents

  1. Methods
  2. Results
  3. Conclusions

Figures (7)

  • Figure 1: Example slice from the XCT data of sandstone with original dimensions of $998 \times 998$ pixels and the cropped section used for the compression analysis with $512 \times 512$ pixels.
  • Figure 2: Schematic of the architectures used to compress $512 \times 512$ input images to a $128 \times 128 \times 8$ encoded array. (a) CNN autoencoder showing the encoder and decoder with details of their convolution, pooling, and upsampling layers. (b) VQ-VAE autoencoder showing the encoder, decoder, and quantization layer (visualization done using the visualkeras package Gavrikov2020).
  • Figure 3: (a) Original $512 \times 512$ 8-bit image. Output images obtained from the (b,c,d) CNN and (e,f,g) VQ-VAE autoencoders with different compression rates. Compressions to $128 \times 128 \times 8$, $64 \times 64 \times 4$, and $32 \times 32 \times 2$ encoded arrays were examined. For both architectures, the quality of the output images decreases as the compression rate increases.
  • Figure 4: (a) Lower Otsu threshold of the original 8-bit image. (b,c) Lower Otsu threshold of the decoded image obtained from the D-CNN model. (d,e) Lower Otsu threshold of the decoded image obtained from the VQ-VAE model. The denoising effect in the output images became stronger as the compression rate increased.
  • Figure 5: (a) Original binarized image. Output images obtained from the (b,c) D-CNN and (d,e) VQ-VAE models with the different compression rates.
  • ...and 2 more figures
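
The encoded array shapes quoted in the captions of Figures 2 and 3 can be turned into rough size reductions with simple arithmetic. The sketch below counts array elements only. It assumes every element is stored at the same bit depth and ignores any VQ-VAE codebook overhead, so the values are element-count ratios and not necessarily the compression rates reported in the paper. The name `element_ratio` is hypothetical.

```python
from math import prod

# Shapes taken from the figure captions.
INPUT_SHAPE = (512, 512)
ENCODED_SHAPES = [(128, 128, 8), (64, 64, 4), (32, 32, 2)]


def element_ratio(input_shape, encoded_shape):
    """Number of input elements divided by number of encoded elements."""
    return prod(input_shape) / prod(encoded_shape)


ratios = {shape: element_ratio(INPUT_SHAPE, shape) for shape in ENCODED_SHAPES}
for shape, ratio in ratios.items():
    print(f"{INPUT_SHAPE} -> {shape}: {ratio:g}x fewer elements")
```

Under these assumptions the three encodings hold 2, 16, and 128 times fewer elements than the $512 \times 512$ input, since each step halves both spatial dimensions and the channel count.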