Cascaded Learned Bloom Filter for Optimal Model-Filter Size Balance and Fast Rejection

Atsuki Sato; Yusuke Matsui

TL;DR

The paper addresses two weaknesses of learned Bloom filters for approximate membership queries: the split of memory between the machine learning model and the Bloom filters is not optimal, and reject time is not effectively minimized. It introduces the Cascaded Learned Bloom Filter (CLBF), an architecture that alternates score-based branching from multiple ML stages with Bloom-filter filtering. The configuration is chosen by dynamic programming to minimize a weighted combination of memory and expected reject time under a target false-positive rate $F$. The authors formulate explicit memory and latency objectives, define a tractable DP routine with complexity $\mathcal{O}(\bar{D}P^2 + \bar{D}PK)$, and discretize key parameters to make the optimization practical. On the Malicious URLs and EMBER datasets, CLBF reduces memory usage by up to 24% and reject time by up to 14 times compared with PLBF, which makes it suitable for latency-sensitive, memory-constrained applications.
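To make the cascaded query path concrete, here is a minimal Python sketch. It is not the authors' implementation: the class names `BloomFilter` and `CascadedFilter`, the toy scoring functions, the thresholds, and the rule that a high score accepts a query early are all assumptions made for illustration, and nothing here is optimized by dynamic programming. It shows only how alternating Bloom-filter checks with staged scoring lets a non-member be rejected before later model stages run.

```python
import hashlib
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[Callable[[str], float], float]  # (score function, threshold)


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not tuned)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item: str) -> List[int]:
        # Derive several positions by salting one hash function.
        return [
            int.from_bytes(
                hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest(),
                "big",
            ) % self.num_bits
            for i in range(self.num_hashes)
        ]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p] for p in self._positions(item))


class CascadedFilter:
    """Assumed cascade: Bloom filter, then score, repeated, then a final filter."""

    def __init__(self, stages: Sequence[Stage], keys: Sequence[str],
                 bits_per_filter: int = 1024, num_hashes: int = 3) -> None:
        self.stages = list(stages)
        self.trunk = [BloomFilter(bits_per_filter, num_hashes) for _ in self.stages]
        self.final = BloomFilter(bits_per_filter, num_hashes)
        for key in keys:
            self._insert(key)

    def _insert(self, key: str) -> None:
        # Insert along the same path a query follows, so keys are never rejected.
        for (score_fn, threshold), bf in zip(self.stages, self.trunk):
            bf.add(key)
            if score_fn(key) >= threshold:
                return  # the query path accepts here, no deeper filter needed
        self.final.add(key)

    def query(self, item: str) -> Tuple[bool, int]:
        """Return (answer, number of model stages evaluated)."""
        evaluated = 0
        for (score_fn, threshold), bf in zip(self.stages, self.trunk):
            if item not in bf:
                return False, evaluated  # fast rejection before running the model
            evaluated += 1
            if score_fn(item) >= threshold:
                return True, evaluated  # assumed early accept on a high score
        return item in self.final, evaluated


# Hypothetical stand-ins for a cheap and a costlier model stage.
def cheap_score(url: str) -> float:
    return 1.0 if "login-verify" in url else 0.0


def costlier_score(url: str) -> float:
    return min(1.0, url.count("-") / 4)


stages = [(cheap_score, 0.5), (costlier_score, 0.75)]
keys = ["evil.example/login-verify", "bad-site-a-b-c.example", "plain.example/x"]
cascade = CascadedFilter(stages, keys)

for k in keys:
    print(k, cascade.query(k))
```

In this sketch the second return value counts the model stages evaluated, a rough stand-in for reject time: a query that misses the first Bloom filter returns with zero model evaluations.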

Abstract

Recent studies have demonstrated that learned Bloom filters, which combine machine learning with the classical Bloom filter, can achieve superior memory efficiency. However, existing learned Bloom filters face two critical unresolved challenges: the balance between the machine learning model size and the Bloom filter size is not optimal, and the reject time cannot be minimized effectively. We propose the Cascaded Learned Bloom Filter (CLBF) to address these issues. Our dynamic programming-based optimization automatically selects configurations that achieve an optimal balance between the model and filter sizes while minimizing reject time. Experiments on real-world datasets show that CLBF reduces memory usage by up to 24% and decreases reject time by up to 14 times compared to state-of-the-art learned Bloom filters.
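
The TL;DR's "weighted combination of memory and expected reject time under a target false-positive rate $F$" can be read schematically as a constrained minimization. The display below is only a sketch: the symbols $c$ (a candidate configuration), $M(c)$ (total memory), $R(c)$ (expected reject time), $\mathrm{FPR}(c)$, and the weight $\lambda$ are notation assumed here for illustration, and the paper's exact objective may be parameterized differently.

$$
\min_{c} \; M(c) + \lambda\, R(c) \quad \text{subject to} \quad \mathrm{FPR}(c) \le F, \qquad \lambda \ge 0
$$

In this form, $\lambda = 0$ would optimize memory alone, while larger $\lambda$ favors configurations that reject sooner.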
