Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries
Yifei Yang, Hangdong Zhao, Xiangyao Yu, Paraschos Koutris
TL;DR
This work introduces predicate transfer, a generalization of Bloom join that propagates table-local predicates across multi-table joins by passing Bloom filters along the edges of a DAG-based predicate transfer graph. Inspired by the Yannakakis algorithm, it aims for near-optimal pre-filtering without the heavy overhead of full semi-join phases, enabling efficient processing of complex join graphs. Empirical results on TPC-H show substantial speedups over baseline Bloom join and competitive robustness compared to Yannakakis, especially on queries with many joins. The approach is extended to general operators and cyclic join graphs, with potential for further theoretical guarantees and parallelization.
Abstract
This paper presents predicate transfer, a novel method that optimizes join performance by pre-filtering tables to reduce join input sizes. Predicate transfer generalizes Bloom join, which pre-filters within a single join operation, to multi-table joins, so that the filtering benefits increase significantly. Predicate transfer is inspired by the seminal theoretical result of Yannakakis, which uses semi-joins to pre-filter acyclic queries. Predicate transfer generalizes this result to arbitrary join graphs and replaces semi-joins with Bloom filters, leading to significant speedups. Evaluation shows that predicate transfer outperforms Bloom join by 3.1x on average on the TPC-H benchmark.
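To make the idea concrete, the following is a minimal sketch of Bloom-filter-based pre-filtering on a three-table chain join (R.a = S.a, S.b = T.b). It is an illustration under stated assumptions, not the paper's implementation: the names `BloomFilter`, `transfer`, and `predicate_transfer_chain` are hypothetical, the filter sizing is arbitrary, and the forward-then-backward pass order is an assumed transfer schedule for this chain.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical helper): k hash probes into m slots."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes probe positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False positives are possible; false negatives are not.
        return all(self.bits[pos] for pos in self._positions(key))


def transfer(source_rows, source_col, target_rows, target_col):
    """Build a filter on the source join keys and keep only the target
    rows whose join key may appear in the source."""
    bf = BloomFilter()
    for row in source_rows:
        bf.add(row[source_col])
    return [row for row in target_rows if bf.might_contain(row[target_col])]


def predicate_transfer_chain(r, s, t):
    """Pre-filter the chain R.a = S.a, S.b = T.b.

    Assumed schedule: a forward pass R -> S -> T followed by a
    backward pass T -> S -> R.
    """
    s = transfer(r, "a", s, "a")  # forward: R's surviving keys filter S
    t = transfer(s, "b", t, "b")  # forward: S's surviving keys filter T
    s = transfer(t, "b", s, "b")  # backward: T's surviving keys filter S
    r = transfer(s, "a", r, "a")  # backward: S's surviving keys filter R
    return r, s, t


if __name__ == "__main__":
    # R is assumed to have already had a selective local predicate applied.
    r = [{"a": 1}]
    s = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
    t = [{"b": 10}, {"b": 20}, {"b": 30}]
    for name, rows in zip("RST", predicate_transfer_chain(r, s, t)):
        print(name, rows)
```

Because a Bloom filter never produces false negatives, every row that participates in the final join result survives the pre-filtering; rows that do not participate are removed with high probability but may occasionally slip through, so the actual join must still be executed on the reduced inputs.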
