Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries

Yifei Yang, Hangdong Zhao, Xiangyao Yu, Paraschos Koutris

TL;DR

This work introduces predicate transfer, a generalization of Bloom join that propagates table-local predicates across multi-table joins using Bloom filters within a DAG-based predicate transfer graph. Inspired by the Yannakakis algorithm, it aims to achieve near-optimal pre-filtering without the heavy overhead of full semi-join phases, enabling efficient processing of complex join graphs. Empirical results on TPC-H show substantial speedups over baseline Bloom join and competitive robustness compared to Yannakakis, especially on queries with many joins. The approach is extended to general operators and cyclic join graphs, with potential for further theoretical guarantees and parallelization.

Abstract

This paper presents predicate transfer, a novel method that optimizes join performance by pre-filtering tables to reduce join input sizes. Predicate transfer generalizes Bloom join, which pre-filters within a single join operation, to multi-table joins, so that the filtering benefits can be significantly increased. Predicate transfer is inspired by the seminal theoretical results of Yannakakis, which use semi-joins to pre-filter acyclic queries. Predicate transfer generalizes these theoretical results to arbitrary join graphs and uses Bloom filters in place of semi-joins, leading to significant speedup. Evaluation shows that predicate transfer can outperform Bloom join by 3.1x on average on the TPC-H benchmark.
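The idea of passing Bloom filters between tables can be illustrated with a minimal sketch for a chain join such as $R \bowtie S \bowtie T$. This is not the authors' implementation: the `BloomFilter` class, the helper names, the join attributes `a` and `b`, and the sample rows are all hypothetical, and the single forward pass followed by a single backward pass is an assumption modeled on the semi-join phases of Yannakakis. Local predicates on `R` and `T` are assumed to have been applied already.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        return all(self.bits[pos] for pos in self._positions(value))


def build_filter(rows, attr):
    """Build a Bloom filter over one join attribute of the surviving rows."""
    bf = BloomFilter()
    for row in rows:
        bf.add(row[attr])
    return bf


def probe(rows, attr, bf):
    """Keep only rows whose join attribute may appear in the incoming filter."""
    return [row for row in rows if bf.might_contain(row[attr])]


def predicate_transfer_chain(r, s, t):
    """Pre-filter R(a), S(a, b), T(b) before the actual joins run."""
    # Forward pass (assumed order): R -> S -> T.
    s = probe(s, "a", build_filter(r, "a"))
    t = probe(t, "b", build_filter(s, "b"))
    # Backward pass: T -> S -> R.
    s = probe(s, "b", build_filter(t, "b"))
    r = probe(r, "a", build_filter(s, "a"))
    return r, s, t


# Hypothetical data; R and T are shown after their local predicates.
R = [{"a": 2}, {"a": 3}]
S = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}, {"a": 4, "b": 40}]
T = [{"b": 20}, {"b": 50}]

r_out, s_out, t_out = predicate_transfer_chain(R, S, T)
print(len(r_out), len(s_out), len(t_out))
```

Because a Bloom filter never produces false negatives, every row that participates in the final join result survives both passes; a false positive only means that a non-joining row slips through and is discarded later by the join itself. In this sketch, `S` acts as the transforming table: it is filtered on one join attribute and then emits a new filter on the other.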

Paper Structure (15 sections, 4 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Predicate Transfer for TPC-H Q5.
  • Figure 2: Filter Transformation: Table $R$ receives two incoming filters on join attributes $A$ and $B$, and generates a transformed outgoing filter on join attribute $C$.
  • Figure 3: Example of Predicate Transfer on a Join Query: $R \bowtie S \bowtie T$.
  • Figure 4: Performance Evaluation of Predicate Transfer on TPC-H (normalized to NoPredTrans).
  • Figure 5: Performance Breakdown on TPC-H Q5.
  • ...and 1 more figures