Differentially Private Learned Indexes

Jianzhang Du, Tilak Mudgal, Rutvi Rahul Gadre, Yukui Luo, Chenghong Wang

TL;DR

This paper proposes leveraging learned indexes, a trending technique that repurposes machine learning models as indexing structures, to build more compact DP indexes that enable faster access to encrypted data while maintaining provable privacy guarantees.

Abstract

In this paper, we address the problem of efficiently answering predicate queries on encrypted databases, specifically those secured by Trusted Execution Environments (TEEs), which enable untrusted providers to process encrypted user data without revealing its contents. A common strategy in modern databases for accelerating predicate queries is to use indexes, which map attribute values (keys) to their corresponding positions in a sorted data array. This allows fast lookup and retrieval of the data subsets that satisfy specific predicates. Unfortunately, indexes cannot be directly applied to encrypted databases because they cause strong data-dependent leakage. Recent approaches apply differential privacy (DP) to construct noisy indexes that enable faster access to encrypted data while maintaining provable privacy guarantees. However, these methods often suffer from large storage costs, with index sizes typically scaling linearly with the key space. To address this challenge, we propose leveraging learned indexes, a trending technique that repurposes machine learning models as indexing structures, to build more compact DP indexes.
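To make the idea of a learned index concrete, the sketch below fits a single linear model that maps a key to its approximate position in a sorted array and then corrects the prediction with a bounded binary search. It is a minimal, non-private illustration and not the paper's construction: the names `fit_linear_index` and `lookup` are hypothetical, the key list is assumed to be non-empty and sorted, and no DP noise is added.

```python
import bisect


def fit_linear_index(keys):
    """Least-squares fit of position ~ slope * key + intercept over sorted keys.

    Returns (slope, intercept, err), where err is the largest distance between
    a predicted position and the true position of any stored key.
    Assumes `keys` is non-empty and sorted (illustrative assumption).
    """
    n = len(keys)
    mean_key = sum(keys) / n
    mean_pos = (n - 1) / 2
    var = sum((k - mean_key) ** 2 for k in keys)
    cov = sum((k - mean_key) * (p - mean_pos) for p, k in enumerate(keys))
    slope = cov / var if var else 0.0
    intercept = mean_pos - slope * mean_key
    err = max(abs(round(slope * k + intercept) - p) for p, k in enumerate(keys))
    return slope, intercept, err


def lookup(keys, model, key):
    """Return the position of `key` in `keys`, or None if it is absent."""
    slope, intercept, err = model
    n = len(keys)
    guess = round(slope * key + intercept)
    # Only the window [guess - err, guess + err] can contain a stored key.
    lo = min(max(0, guess - err), n)
    hi = max(lo, min(n, guess + err + 1))
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < n and keys[i] == key else None


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = fit_linear_index(keys)
positions = [lookup(keys, model, k) for k in keys]
```

The model stores only three numbers regardless of the size of the key space, which is the source of the compactness that learned indexes offer over structures that keep one entry per key.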

Paper Structure

This paper contains 10 sections, 2 theorems, 10 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Let $\{x_i, y_i\}_{i=1}^{N}$ be any CFC, and let $\{x_i, \tilde{y}_i\}_{i=1}^{N}$ be the corresponding noisy CFC released by the range tree mechanism. Then, $\forall y_i$, the error $|\tilde{y}_i - y_i|$ is bounded by $O(\epsilon^{-1}(\log N)^{\frac{3}{2}})$.
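The polylogarithmic error in a bound of this form is characteristic of tree-based DP mechanisms, in which every prefix sum is assembled from at most one noisy node per tree level. The sketch below illustrates the standard binary tree mechanism with Laplace noise. It is an assumption that the paper's range tree mechanism follows this pattern; the function names, the add/remove-one-record neighboring notion, and the even split of the budget across levels are illustrative choices, not details taken from the paper.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def build_noisy_tree(counts, epsilon, rng):
    """Build a binary tree of noisy partial sums over per-key counts.

    tree[h][j] is a noisy sum of the j-th block of 2**h consecutive leaves.
    One record changes exactly one node per level, so the L1 sensitivity is
    the number of levels and Laplace(levels / epsilon) noise gives epsilon-DP
    (assuming neighboring databases differ by one added or removed record).
    """
    n = 1
    while n < len(counts):
        n *= 2
    level = list(counts) + [0] * (n - len(counts))  # pad to a power of two
    exact = [level]
    while len(level) > 1:
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        exact.append(level)
    scale = len(exact) / epsilon
    return [[v + laplace(scale, rng) for v in lvl] for lvl in exact]


def noisy_prefix_sum(tree, i):
    """Noisy sum of counts[0:i], built from at most one node per level."""
    total, pos = 0.0, 0
    for h in range(len(tree) - 1, -1, -1):
        block = 1 << h
        if pos + block <= i:
            total += tree[h][pos // block]
            pos += block
    return total


rng = random.Random(0)
counts = [3, 1, 4, 1, 5]
# A very large epsilon makes the noise negligible, so the structure is easy to check.
tree = build_noisy_tree(counts, 1e9, rng)
prefix3 = noisy_prefix_sum(tree, 3)
```

Each noisy prefix sum adds at most $\log_2 N + 1$ independent Laplace variables of scale $O(\epsilon^{-1}\log N)$, so its typical magnitude grows as $\epsilon^{-1}(\log N)^{3/2}$, which matches the shape of the bound in Theorem 1.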

Figures (4)

  • Figure 1: EDB workflow
  • Figure 2: CF model indexes
  • Figure 3: RMI example
  • Figure 4: Range tree and DP range tree.

Theorems & Definitions (3)

  • Theorem 1
  • Lemma 2
  • proof