Towards Privacy-Preserving Range Queries with Secure Learned Spatial Index over Encrypted Data

Zuan Wang, Juntao Lu, Jiazhuang Wu, Youliang Tian, Wei Song, Qiuxian Li, Duo Zhang

TL;DR

This work tackles privacy-preserving range queries on encrypted data outsourced to the cloud by introducing SLS-INDEX, a secure learned spatial index that combines Paillier homomorphic encryption, a hierarchical predictor, and dummy/noise-based obfuscation to protect data, queries, results, and access patterns. The index uses Z-curve ranking and three levels of MLP-based predictors (head, intermediate, leaf) with encrypted parameters, plus a fuzzy label mechanism and dummy buckets to conceal traversal paths. A suite of secure sub-protocols (secure bucket prediction, secure point extraction, and the integrated SLRQ protocol) obfuscates query execution and reduces the amount of encrypted data scanned, with formal security guarantees stated in terms of leakage functions. Experiments on real and synthetic datasets show that the scheme answers queries significantly faster than prior privacy-preserving schemes while protecting dataset, query, result, and access pattern privacy. The approach supports dynamic updates and aims to make privacy-preserving spatial querying on encrypted cloud data practical.
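To make the Z-curve ranking step concrete, here is a minimal sketch of Z-order (Morton) encoding on plaintext 2-D grid coordinates. It only illustrates the general technique of mapping points to a one-dimensional rank; the function names `interleave_bits` and `z_rank` and the 16-bit coordinate width are assumptions for illustration, not the paper's implementation, and no encryption is involved here.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of non-negative grid coordinates (x, y).

    Hypothetical helper: the bit width is an assumption, not taken from the paper.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return code


def z_rank(points):
    """Sort 2-D integer points by Morton code; a point's list position is its rank."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1]))


points = [(3, 1), (0, 0), (1, 1), (2, 2), (0, 1)]
ranked = z_rank(points)
for rank, p in enumerate(ranked):
    print(rank, p, interleave_bits(*p))
```

A learned index can then be trained to predict a point's rank (or the bucket containing it) from its coordinates, which is the role the summary above attributes to the hierarchical predictors.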

Abstract

With the growing reliance on cloud services for large-scale data management, preserving the security and privacy of outsourced datasets has become increasingly critical. While encrypting data and queries can prevent direct content exposure, recent research reveals that adversaries can still infer sensitive information via access pattern and search path analysis. However, existing solutions that offer strong access pattern privacy often incur substantial performance overhead. In this paper, we propose a novel privacy-preserving range query scheme over encrypted datasets, offering strong security guarantees while maintaining high efficiency. To achieve this, we develop the secure learned spatial index (SLS-INDEX), a secure learned index that integrates the Paillier cryptosystem with a hierarchical prediction architecture and noise-injected buckets, enabling data-aware query acceleration in the encrypted domain. To further obfuscate query execution paths, SLS-INDEX-based Range Queries (SLRQ) employs a permutation-based secure bucket prediction protocol. Additionally, we introduce a secure point extraction protocol that generates candidate results to reduce the overhead of secure computation. We provide formal security analysis under realistic leakage functions and implement a prototype to evaluate its practical performance. Extensive experiments on both real-world and synthetic datasets demonstrate that SLRQ significantly outperforms existing solutions in query efficiency while ensuring dataset, query, result, and access pattern privacy.
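The abstract relies on the additive homomorphism of the Paillier cryptosystem to evaluate predictors in the encrypted domain. The following toy sketch shows that property: multiplying ciphertexts adds plaintexts, and raising a ciphertext to a plaintext power scales it, so a linear function with encrypted parameters can be evaluated without decrypting them. The tiny primes, the choice g = n + 1, and the helper names `keygen`, `encrypt`, `decrypt` are assumptions for illustration only; this is not the paper's protocol and is not secure at these key sizes. It assumes Python 3.8+ for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key generation from two distinct primes (illustrative sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                    # common simplification; then L(g^lam mod n^2) = lam mod n
    mu = pow(lam, -1, n)         # modular inverse of lam modulo n
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    n, g = pub
    n2 = n * n
    while True:                  # pick random r in [1, n-1] coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n   # L(u) * mu mod n, with L(u) = (u - 1) / n


pub, priv = keygen(1009, 1013)       # assumption: tiny demo primes
n2 = pub[0] ** 2

# Additive homomorphism: Enc(a) * Enc(b) decrypts to a + b.
c_sum = (encrypt(pub, 20) * encrypt(pub, 22)) % n2

# Linear function with encrypted parameters w, b and plaintext input x:
# Enc(w)^x * Enc(b) decrypts to w * x + b.
enc_w, enc_b, x = encrypt(pub, 7), encrypt(pub, 5), 12
c_lin = (pow(enc_w, x, n2) * enc_b) % n2

print(decrypt(pub, priv, c_sum), decrypt(pub, priv, c_lin))
```

Because Paillier supports only addition and plaintext scaling on ciphertexts, nonlinear steps (comparisons, activations) generally need an interactive protocol with a second party, which is consistent with the abstract's emphasis on dedicated secure sub-protocols.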

Paper Structure

This paper contains 23 sections, 1 theorem, 9 equations, 11 figures, 6 tables, 3 algorithms.

Key Result

Theorem 1

The proposed scheme is $(\mathcal{L}_{\text{Build}}, \mathcal{L}_{\text{Update}}, \mathcal{L}_{\text{Query}})$-secure under the assumptions that $\pi$ and $f$ are pseudo-random and that the Paillier cryptosystem is semantically secure.

Figures (11)

  • Figure 1: Example of range queries on the cloud platform.
  • Figure 2: System model.
  • Figure 3: Structure of SLS-INDEX. This secure index comprises a head predictor, intermediate predictors, and leaf predictors. The head and intermediate predictors employ $\widetilde{\mathcal{M}}^{e}$, while leaf predictors use ${\mathcal{M}}^{e}$ to preserve data confidentiality. Moreover, to protect search pattern privacy, SLS-INDEX also incorporates a fuzzy label mechanism along with carefully designed dummy predictors and buckets.
  • Figure 4: Impact of $b$ on index construction.
  • Figure 5: Impact of $K$ on index construction.
  • ...and 6 more figures

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Theorem 1
  • proof