Table of Contents
Fetching ...

Variable Rate Neural Compression for Sparse Detector Data

Yi Huang, Yeonju Go, Jin Huang, Shuhang Li, Xihaier Luo, Thomas Marshall, Joseph Osborn, Christopher Pinkenburg, Yihui Ren, Evgeny Shulga, Shinjae Yoo, Byung-Jun Yoon

TL;DR

A novel approach for TPC data compression via key-point identification facilitated by sparse convolution via key-point identification facilitated by sparse convolution is proposed, which achieves a 75% improvement in reconstruction accuracy and an increase in compression ratio over the previous state-of-the-art model.

Abstract

High-energy large-scale particle colliders generate data at extraordinary rates. Developing real-time high-throughput data compression algorithms to reduce data volume and meet the bandwidth requirement for storage has become increasingly critical. Deep learning is a promising technology that can address this challenging topic. At the newly constructed sPHENIX experiment at the Relativistic Heavy Ion Collider, a Time Projection Chamber (TPC) serves as the main tracking detector, which records three-dimensional particle trajectories in a volume of a gas-filled cylinder. In terms of occupancy, the resulting data flow can be very sparse reaching $10^{-3}$ for proton-proton collisions. Such sparsity presents a challenge to conventional learning-free lossy compression algorithms, such as SZ, ZFP, and MGARD. In contrast, emerging deep learning-based models, particularly those utilizing convolutional neural networks for compression, have outperformed these conventional methods in terms of compression ratios and reconstruction accuracy. However, research on the efficacy of these deep learning models in handling sparse datasets, like those produced in particle colliders, remains limited. Furthermore, most deep learning models do not adapt their processing speeds to data sparsity, which affects efficiency. To address this issue, we propose a novel approach for TPC data compression via key-point identification facilitated by sparse convolution. Our proposed algorithm, BCAE-VS, achieves a $75\%$ improvement in reconstruction accuracy with a $10\%$ increase in compression ratio over the previous state-of-the-art model. Additionally, BCAE-VS manages to achieve these results with a model size over two orders of magnitude smaller. Lastly, we have experimentally verified that as sparsity increases, so does the model's throughput.

Variable Rate Neural Compression for Sparse Detector Data

TL;DR

A novel approach for TPC data compression via key-point identification facilitated by sparse convolution via key-point identification facilitated by sparse convolution is proposed, which achieves a 75% improvement in reconstruction accuracy and an increase in compression ratio over the previous state-of-the-art model.

Abstract

High-energy large-scale particle colliders generate data at extraordinary rates. Developing real-time high-throughput data compression algorithms to reduce data volume and meet the bandwidth requirement for storage has become increasingly critical. Deep learning is a promising technology that can address this challenging topic. At the newly constructed sPHENIX experiment at the Relativistic Heavy Ion Collider, a Time Projection Chamber (TPC) serves as the main tracking detector, which records three-dimensional particle trajectories in a volume of a gas-filled cylinder. In terms of occupancy, the resulting data flow can be very sparse reaching for proton-proton collisions. Such sparsity presents a challenge to conventional learning-free lossy compression algorithms, such as SZ, ZFP, and MGARD. In contrast, emerging deep learning-based models, particularly those utilizing convolutional neural networks for compression, have outperformed these conventional methods in terms of compression ratios and reconstruction accuracy. However, research on the efficacy of these deep learning models in handling sparse datasets, like those produced in particle colliders, remains limited. Furthermore, most deep learning models do not adapt their processing speeds to data sparsity, which affects efficiency. To address this issue, we propose a novel approach for TPC data compression via key-point identification facilitated by sparse convolution. Our proposed algorithm, BCAE-VS, achieves a improvement in reconstruction accuracy with a increase in compression ratio over the previous state-of-the-art model. Additionally, BCAE-VS manages to achieve these results with a model size over two orders of magnitude smaller. Lastly, we have experimentally verified that as sparsity increases, so does the model's throughput.

Paper Structure

This paper contains 21 sections, 5 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: sPHENIX detector assembly. Time projection chamber (TPC) contributes the majority of data.
  • Figure 2: A schematic view of sPHENIX TPC.
  • Figure 3: Visualization of the outer layer group TPC data of an Au$+$Au collision event with $0$-$10$% centrality and 170kHz pileup.
  • Figure 4: Bicephalous convolutional autoencoder.
  • Figure 5: Bicephalous autoencoder with sparse encoder for key-points identification.
  • ...and 7 more figures