Realizing Hardware-Optimized General Tree-Based Data Structures for Heterogeneous System Classes
Daniel Biebert, Christian Hakert, Jian-Jia Chen
TL;DR
This paper investigates memory-aware reordering of tree nodes to improve performance without changing tree logic. It introduces two memory models (heterogeneous and hierarchical) and multiple reordering strategies, including two in-place Merge Sort variants, Path Reorder, and a Cycle Sort–based Map Reorder, plus online and offline mechanisms for deciding when to reorder. Empirical results on MSP430-like heterogeneous memory and EPYC hardware show the strongest gains for Path Reorder in hierarchical memory with deep, thin trees, while the offline Merge Sort variants provide additional, though platform-dependent, improvements; online reordering can yield substantial speedups in certain BST/AVL configurations. The work demonstrates the potential of memory-aware tree optimization and offers practical strategies for both offline preparation and online adaptation across varied hardware.
Abstract
Tree-based data structures are ubiquitous across applications. Therefore, a multitude of different tree implementations exist. However, while these implementations are diverse, they share a tree structure as the underlying data structure. As such, the access patterns inside these trees are very similar, following a path from the root of the tree towards a leaf node. Similarly, many distinct types of memory exist. These types of memory all have different characteristics, some of which have an impact on the overall system performance. While the concrete types of memory are varied, their characteristics can often be abstracted to have a similar effect on the performance. We show how the characteristics of different types of memory can be used to improve the performance of tree-based data structures. By reordering the nodes of a tree inside memory, the characteristics of memory can be exploited to optimize the performance. To this end, this paper presents different strategies for reordering nodes inside memory as well as efficient algorithms for realizing these strategies. It additionally provides strategies to decide when such a reordering operation should be triggered during operation. Further, this paper conducts experiments showing the performance impact of the proposed strategies. The experiments show that the strategies can improve the performance of trees by up to 95% as an offline optimization and 75% as an online optimization.
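To make the idea of reordering nodes in memory without changing tree logic concrete, the following is a minimal sketch under stated assumptions, not the paper's algorithms. It assumes a binary tree stored in a flat array with index-based child links, picks a depth-first layout as an illustrative target order, and applies it in place with a generic cycle-following permutation. All names (`Node`, `NIL`, `preorder_positions`, `apply_layout`, `inorder`) are hypothetical.

```python
NIL = -1  # assumed marker for "no child"


class Node:
    """Array-resident tree node; children are array indices, not pointers."""

    def __init__(self, key, left=NIL, right=NIL):
        self.key = key
        self.left = left
        self.right = right


def preorder_positions(nodes, root):
    """Return new_pos, where new_pos[old_index] is the slot in a depth-first layout.

    The depth-first order is only an illustrative choice of target layout.
    """
    new_pos = [NIL] * len(nodes)
    next_slot = 0
    stack = [root]
    while stack:
        i = stack.pop()
        if i == NIL:
            continue
        new_pos[i] = next_slot
        next_slot += 1
        # Push right first so the left child is laid out directly after its parent.
        stack.append(nodes[i].right)
        stack.append(nodes[i].left)
    # Nodes not reachable from the root keep their relative order at the end.
    for i in range(len(nodes)):
        if new_pos[i] == NIL:
            new_pos[i] = next_slot
            next_slot += 1
    return new_pos


def apply_layout(nodes, new_pos, root):
    """Move nodes to their new slots in place and return the new root index."""
    # Rewrite child links first so they refer to the new slots.
    for n in nodes:
        if n.left != NIL:
            n.left = new_pos[n.left]
        if n.right != NIL:
            n.right = new_pos[n.right]
    # pos[i] is the destination of the node currently stored in slot i.
    pos = list(new_pos)
    for start in range(len(nodes)):
        while pos[start] != start:
            target = pos[start]
            # Each swap puts one node into its final slot.
            nodes[start], nodes[target] = nodes[target], nodes[start]
            pos[start], pos[target] = pos[target], pos[start]
    return new_pos[root]


def inorder(nodes, i):
    """Collect keys in sorted order; used to check that tree logic is unchanged."""
    if i == NIL:
        return []
    return inorder(nodes, nodes[i].left) + [nodes[i].key] + inorder(nodes, nodes[i].right)
```

In this sketch the logical tree is untouched: an in-order traversal yields the same keys before and after `apply_layout`, and only the physical placement of the nodes in the array changes. A real implementation would choose the target order from the memory model in use rather than a fixed depth-first order.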
