Indexing Join Inputs for Fast Queries and Maintenance
Wenhui Lyu, Goetz Graefe
TL;DR
The paper tackles the long-standing tension between fast query processing and low-maintenance intermediate storage for joins. It proposes the merged index, a proxy-store that interleaves two single-table indexes to support all join types while enabling non-blocking query processing and efficient incremental maintenance, and demonstrates it can be implemented on $B$-trees or log-structured merge-forests. Through extensive experiments on a binary join using TPC-C data, the merged index achieves query performance on par with materialized join views and maintenance efficiency comparable to traditional indexes, with only modest space overhead and natural compression of join results. The findings suggest the merged index is a practical, industrially viable storage primitive that combines the strengths of traditional indexes and materialized views for scalable, non-blocking database query processing.
Abstract
In database systems, joins are often expensive despite many years of research producing numerous join algorithms. Precomputed and materialized join views deliver the best query performance, whereas traditional indexes, used as pre-sorted inputs for merge joins, permit very efficient maintenance. Neither traditional indexes nor materialized join views require blocking phases, in contrast to query-time sorting and transient indexes, e.g., hash tables in hash joins, that impose high memory requirements and possibly spill to temporary storage. Here, we introduce a hybrid of traditional indexing and materialized join views. The *merged index* can be implemented with traditional b-trees, permits high-bandwidth maintenance using log-structured merge-forests, supports all join types (inner joins, all outer joins, all semi joins), and enables non-blocking query processing. Experiments across a wide range of scenarios confirm its query performance comparable to materialized join views and maintenance efficiency comparable to traditional indexes.
