Table of Contents
Fetching ...

Parallel Writing of Nested Data in Columnar Formats

Jonas Hahnfeld, Jakob Blomer, Thorsten Kollegger

TL;DR

This paper proposes a scalable approach to efficient multithreaded writing of nested data in columnar format into a single file that removes the bottleneck of a single writer while staying fully compatible with the compressed, columnar, variably row-sized data representation.

Abstract

High Energy Physics (HEP) experiments, for example at the Large Hadron Collider (LHC) at CERN, store data at exabyte scale in sets of files. They use a binary columnar data format by the ROOT framework, that also transparently compresses the data. In this format, cells are not necessarily atomic but they may contain nested collections of variable size. The fact that row and block sizes are not known upfront makes it challenging to implement efficient parallel writing. In particular, the data cannot be organized in a regular grid where it is possible to precompute indices and offsets for independent writing. In this paper, we propose a scalable approach to efficient multithreaded writing of nested data in columnar format into a single file. Our approach removes the bottleneck of a single writer while staying fully compatible with the compressed, columnar, variably row-sized data representation. We discuss our design choices and the implementation of scalable parallel writing for ROOT's RNTuple format. An evaluation of our approach shows perfect scalability only limited by storage bandwidth for a synthetic benchmark. Finally we evaluate the benefits for a real-world application of dataset skimming.

Parallel Writing of Nested Data in Columnar Formats

TL;DR

This paper proposes a scalable approach to efficient multithreaded writing of nested data in columnar format into a single file that removes the bottleneck of a single writer while staying fully compatible with the compressed, columnar, variably row-sized data representation.

Abstract

High Energy Physics (HEP) experiments, for example at the Large Hadron Collider (LHC) at CERN, store data at exabyte scale in sets of files. They use a binary columnar data format by the ROOT framework, that also transparently compresses the data. In this format, cells are not necessarily atomic but they may contain nested collections of variable size. The fact that row and block sizes are not known upfront makes it challenging to implement efficient parallel writing. In particular, the data cannot be organized in a regular grid where it is possible to precompute indices and offsets for independent writing. In this paper, we propose a scalable approach to efficient multithreaded writing of nested data in columnar format into a single file. Our approach removes the bottleneck of a single writer while staying fully compatible with the compressed, columnar, variably row-sized data representation. We discuss our design choices and the implementation of scalable parallel writing for ROOT's RNTuple format. An evaluation of our approach shows perfect scalability only limited by storage bandwidth for a synthetic benchmark. Finally we evaluate the benefits for a real-world application of dataset skimming.

Paper Structure

This paper contains 17 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Simplified example of nested data structures. Real-world HEP data models often have thousands of fields.
  • Figure 2: Bandwidth measured with the synthetic benchmark writing to /dev/null. Every thread writes 20 million entries.
  • Figure 3: Bandwidth measured with the synthetic benchmark writing on a server SSD. Every thread writes 20 million entries.
  • Figure 4: Bandwidth measured with the synthetic benchmark writing on a server HDD. Every thread writes 10 million entries.
  • Figure 5: Speedup of the AGC dataset skimming benchmark compared to a full sequential run of 2432 seconds.