The Impact of Data Compression in Real-Time and Historical Data Acquisition Systems on the Accuracy of Analytical Solutions

Reham Faqehi, Haya Alhuraib, Hamad Saiari, Zyad Bamigdad

TL;DR

The paper addresses the tension between data compression and analytical accuracy in industrial real-time and historical data systems. It employs a mixed-method approach combining a literature review, simulated swinging-door compression experiments, and univariate model evaluation on raw versus compressed data, reporting error metrics such as mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). Results show that aggressive, one-size-fits-all compression degrades key analytics, especially for high-frequency signals, while signal-aware, conservative thresholds and shape-preserving methods can preserve analytical fidelity. The work provides domain-specific guidelines for selecting compression strategies that balance storage efficiency with actionable analytics in Industry 4.0 contexts.
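To make the raw-versus-compressed comparison concrete, the following is a minimal sketch of a slope-envelope variant of swinging-door compression, followed by the three error metrics computed against a linear reconstruction. It is not the authors' implementation: the function names (`swinging_door`, `reconstruct`, `error_metrics`), the synthetic temperature-like signal, and the deviation threshold `dev=0.2` are all illustrative assumptions, and the summary does not state which swinging-door variant the paper uses.

```python
import math


def swinging_door(times, values, dev):
    """Return indices of samples kept by a slope-envelope swinging-door pass.

    Assumes strictly increasing `times` and dev >= 0 (hypothetical helper).
    """
    n = len(values)
    if n <= 2:
        return list(range(n))
    kept = [0]
    anchor = 0
    lo, hi = -math.inf, math.inf  # admissible slope range seen from the anchor
    for i in range(1, n):
        dt = times[i] - times[anchor]
        slope = (values[i] - values[anchor]) / dt
        if slope < lo or slope > hi:
            # A straight line anchor -> i would miss an earlier sample by more
            # than dev, so archive the previous sample and restart from it.
            anchor = i - 1
            kept.append(anchor)
            dt = times[i] - times[anchor]
            lo, hi = -math.inf, math.inf
        # Narrow the envelope so later lines stay within dev of sample i.
        lo = max(lo, (values[i] - dev - values[anchor]) / dt)
        hi = min(hi, (values[i] + dev - values[anchor]) / dt)
    if kept[-1] != n - 1:
        kept.append(n - 1)  # always keep the final sample
    return kept


def reconstruct(times, values, kept):
    """Linearly interpolate between kept samples at every original timestamp."""
    out = [float(v) for v in values]
    for a, b in zip(kept, kept[1:]):
        for i in range(a, b + 1):
            w = (times[i] - times[a]) / (times[b] - times[a])
            out[i] = values[a] + w * (values[b] - values[a])
    return out


def error_metrics(raw, approx):
    """MAE, MSE and RMSE between the raw signal and its reconstruction."""
    errs = [r - a for r, a in zip(raw, approx)]
    mae = sum(abs(e) for e in errs) / len(errs)
    mse = sum(e * e for e in errs) / len(errs)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Synthetic, slowly varying signal with a small fast ripple (illustrative only).
times = [float(k) for k in range(200)]
values = [20.0 + 2.0 * math.sin(k / 30.0) + 0.05 * math.sin(1.7 * k)
          for k in range(200)]

kept = swinging_door(times, values, dev=0.2)
approx = reconstruct(times, values, kept)
metrics = error_metrics(values, approx)
print(f"kept {len(kept)} of {len(values)} samples", metrics)
```

In this variant, every raw sample lies within `dev` of the interpolated line, so the threshold directly bounds the pointwise error. Raising `dev` keeps fewer samples and lets MAE, MSE, and RMSE grow, which is the storage-versus-accuracy trade-off the paper examines.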

Abstract

In industrial and IoT environments, massive amounts of real-time and historical process data are continuously generated and archived. With sensors and devices capturing every operational detail, the volume of time-series data has become a critical challenge for storage and processing systems. Efficient data management is essential to ensure scalability, cost-effectiveness, and timely analytics. To minimize storage costs and optimize performance, data compression algorithms are frequently used in data historians and acquisition systems. However, compression comes with trade-offs that may compromise the accuracy and reliability of engineering analytics that depend on the compressed data. Understanding these trade-offs is essential for developing data strategies that support both operational efficiency and accurate, reliable analytics. This paper assesses the relationship between common compression mechanisms used in real-time and historical data systems and the accuracy of analytical solutions, including statistical analysis, anomaly detection, and machine learning models. Through theoretical analysis, simulated signal compression, and empirical assessment, we illustrate that excessive compression can erase critical patterns, skew statistical measures, and diminish predictive accuracy. The study recommends methods and best practices for balancing analytical integrity against compression efficiency.

Paper Structure

This paper contains 15 sections, 1 equation, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Compression Applied to Temperature Signal
  • Figure 2: Compression Applied to Vibration Signal
  • Figure 3: Compression Threshold vs RMSE
  • Figure 4: Compression of thresholds 0.713, 0.444, and 1.783
  • Figure 5: Anomaly Detection Recall vs Compression