Linked Array Tree: A Constant-Time Search Structure for Big Data
Songpeng Liu
TL;DR
The paper addresses the challenge of fast search in big data by proposing the Linked Array Tree (LAT), a radix-based multi-level structure designed for constant-time access with low memory overhead. LAT stores values on a data level and reaches them through pointers on the index levels. Keys are not stored explicitly: the search path is determined by successive remainders of the key, which reduce to bitwise operations when the radix is a power of two. The author provides time-complexity analyses showing that, for a fixed radix and height, each operation takes a bounded number of steps, and discusses optimizations that leverage bitwise calculations. Empirical comparisons against red-black trees and B+ trees indicate strong performance and lower memory usage for data-intensive workloads, while highlighting trade-offs for sparse data. The paper also suggests parallelization potential and variants (e.g., asymmetric radixes) to balance time and space. The work includes implementation details and a public repository, positioning LAT as a scalable option for large-scale memory and disk management in dense data regimes.
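The path computation summarized above can be illustrated with a minimal Python sketch. It is not the paper's reference implementation: the class name `LinkedArrayTree`, the parameters `bits` and `height`, the least-significant-digit-first traversal order, and the use of `None` to mark an empty slot are all assumptions made for illustration. The radix is taken to be a power of two, so `key % radix` becomes `key & mask` and `key // radix` becomes `key >> bits`.

```python
from typing import Any, List, Optional


class LinkedArrayTree:
    """Illustrative fixed-height radix tree of arrays (hypothetical sketch)."""

    def __init__(self, bits: int = 4, height: int = 3) -> None:
        self.bits = bits                # assumed: radix = 2**bits
        self.radix = 1 << bits
        self.mask = self.radix - 1      # key & mask == key % radix
        self.height = height            # index levels plus one data level
        self.root: List[Any] = [None] * self.radix

    def _check(self, key: int) -> None:
        # Keys must fit in `height` digits of base `radix`.
        if not 0 <= key < self.radix ** self.height:
            raise KeyError(key)

    def insert(self, key: int, value: Any) -> None:
        self._check(key)
        node = self.root
        # Walk the index levels, allocating child arrays on demand.
        for _ in range(self.height - 1):
            slot = key & self.mask      # remainder selects the slot
            key >>= self.bits           # quotient carries to the next level
            if node[slot] is None:
                node[slot] = [None] * self.radix
            node = node[slot]
        # The last digit addresses the data level; the key itself is not stored.
        node[key & self.mask] = value

    def search(self, key: int) -> Optional[Any]:
        self._check(key)
        node = self.root
        for _ in range(self.height - 1):
            node = node[key & self.mask]
            key >>= self.bits
            if node is None:            # missing branch: key is absent
                return None
        return node[key & self.mask]
```

With `bits` and `height` fixed, both operations run exactly `height` array lookups regardless of how many values are stored, which is the sense in which access is constant-time. The cost is that a sparse key set still allocates a full array of `radix` slots for every branch it touches, matching the trade-off noted for sparse data.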
Abstract
As data volumes continue to grow rapidly, traditional search structures, such as the red-black tree and the B+ tree, face increasing performance challenges, especially in big data scenarios with intensive storage access. This paper presents the Linked Array Tree (LAT), a novel data structure designed to achieve constant-time complexity for search, insertion, and deletion operations. LAT leverages a sparse, non-moving hierarchical layout that enables direct access paths without requiring rebalancing or data movement. Its low memory overhead and avoidance of pointer-heavy structures make it well-suited for large-scale and intensive workloads. While not specifically tested under parallel or concurrent conditions, the structure's static layout and non-interfering operations suggest potential advantages in such environments. This paper first introduces the structure and algorithms of LAT, followed by a detailed analysis of its time complexity in search, insertion, and deletion operations. Finally, it presents experimental results across both data-intensive and sparse usage scenarios to evaluate LAT's practical performance.
