RISK: Efficiently processing rich spatial-keyword queries on encrypted geo-textual data

Zhen Lv, Cong Cao, Hongwei Huo, Jiangtao Cui, Yanguo Peng, Hui Li, Yingfan Liu

TL;DR

RISK is a model for rich spatial-keyword queries on encrypted geo-textual data. It supports both secure range and k-nearest neighbor queries, is provably secure under the IND-CKA2 model, and is extensible to multi-party scenarios and dynamic updates.

Abstract

Symmetric searchable encryption (SSE) for geo-textual data has attracted significant attention. However, existing schemes rely on task-specific, incompatible indices, each serving a single type of secure query (e.g., range or k-nearest neighbor spatial-keyword queries), which limits practicality because maintaining multiple indices incurs prohibitive overhead. To address this, we propose RISK, a model for rich spatial-keyword queries on encrypted geo-textual data. In a textual-first-then-spatial manner, RISK is built on a novel k-nearest neighbor quadtree (kQ-tree) that embeds representative and regional nearest neighbors, with the kQ-tree further encrypted using standard cryptographic tools (e.g., keyed hash functions and symmetric encryption). Overall, RISK seamlessly supports both secure range and k-nearest neighbor queries, is provably secure under the IND-CKA2 model, and is extensible to multi-party scenarios and dynamic updates. Experiments on three real-world datasets and one synthetic dataset show that RISK outperforms state-of-the-art methods by at least 0.5 and 4 orders of magnitude in response time for 1% range queries and 10-nearest neighbor queries, respectively.
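To make the textual-first-then-spatial idea concrete, the following is a minimal sketch and not the paper's construction. It assumes points in the unit square, identifies each quadtree cell by its path (one digit per level), and uses HMAC-SHA256 as the keyed hash. The names `cell_path`, `token`, and `build_index` are hypothetical, and the sketch omits the embedded nearest neighbors of the kQ-tree, the encryption of stored objects, and any security guarantee.

```python
import hashlib
import hmac


def cell_path(x: float, y: float, depth: int) -> str:
    """Return the quadtree path of the cell containing (x, y) in the unit square.

    Each level appends one digit in {0, 1, 2, 3} for the chosen quadrant.
    """
    if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
        raise ValueError("point must lie in [0, 1) x [0, 1)")
    x0, y0, size = 0.0, 0.0, 1.0
    digits = []
    for _ in range(depth):
        size /= 2.0
        right = x >= x0 + size
        upper = y >= y0 + size
        digits.append(str(2 * int(upper) + int(right)))
        if right:
            x0 += size
        if upper:
            y0 += size
    return "".join(digits)


def token(key: bytes, keyword: str, path: str) -> str:
    """Keyed hash over (keyword, path); the length prefix keeps the encoding unambiguous."""
    kw = keyword.encode("utf-8")
    msg = len(kw).to_bytes(4, "big") + kw + path.encode("ascii")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def build_index(key: bytes, objects: dict, depth: int) -> dict:
    """Map each (keyword, cell-path prefix) token to the ids of matching objects."""
    index = {}
    for obj_id, (x, y, keywords) in objects.items():
        path = cell_path(x, y, depth)
        for kw in keywords:
            # Register the object at every ancestor cell, from the root ("") down.
            for level in range(depth + 1):
                index.setdefault(token(key, kw, path[:level]), set()).add(obj_id)
    return index


# Hypothetical toy data: object id -> (x, y, keywords).
key = b"demo-key-not-for-production"
objects = {
    1: (0.1, 0.1, {"cafe"}),
    2: (0.8, 0.1, {"cafe", "bar"}),
    3: (0.2, 0.2, {"bar"}),
}
index = build_index(key, objects, depth=2)
```

In this sketch, a client holding `key` would ask for all "cafe" objects in the lower-left quadrant by sending `token(key, "cafe", "0")`. The server matches the token against `index` without seeing the keyword or the cell in the clear. A range query would decompose into several such cell tokens.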

Paper Structure

This paper contains 30 sections, 1 theorem, 23 equations, 10 figures, 2 tables, 5 algorithms.

Key Result

Theorem 1

RISK is IND-CKA2 $(\mathcal{L}_1,\mathcal{L}_2)$-secure in the random oracle model. For any probabilistic polynomial-time adversary $\mathcal{A}$ attempting to break RISK, its advantage is bounded by Herein, $\mathsf{Game}_{\mathcal{R}}$ and $\mathsf{Game}_{\mathcal{S}}$ denote the real and simulated games, respectively, and $\epsilon$ is an adjustable polynomial expansion factor for range querie

Figures (10)

  • Figure 1: Application scenarios for range and $2$NN spatial-keyword queries.
  • Figure 2: The architecture of RISK. Dotted arrows indicate off-line processes; solid arrows indicate on-line processes, which directly affect query time and are therefore critical to user experience. RISK thus mainly aims to reduce on-line time.
  • Figure 3: Generic indexing concept ($k_{\max}=2$ and $d=3$). Virtual nodes (i.e., inner nodes), which only store structural semantics, are denoted by circles; real nodes (i.e., leaf nodes), which store objects, are represented by diamonds. All light blue numbers in the cells are unique identifiers, identical to the path of their corresponding nodes.
  • Figure 4: An example of generating trapdoors for RISK. In the left region, the red circle denotes an RSK query $q_r$, and the object $q_k$, shown as a red dot, is the center for both the NSK and kSK queries; $k=2$ for the kSK query. In both regions, the patterned, light-colored, and dark-colored cells indicate the cells and nodes touched by the RSK, NSK, and kSK queries, respectively.
  • Figure 5: The parameter tuning results for RISK.
  • ...and 5 more figures

Theorems & Definitions (5)

  • Definition 1: Secure range spatial-keyword query problem
  • Definition 2: Secure k-nearest neighbor spatial-keyword query problem
  • Definition 3: IND-CKA2 security
  • Theorem 1: IND-CKA2 security for RISK
  • Proof 1