Jodes: Efficient Oblivious Join in the Distributed Setting

Yilei Wang, Xiangdong Zeng, Sheng Wang, Feifei Li

TL;DR

Jodes addresses the problem of secure distributed analytics inside TEEs by delivering an oblivious distributed equi-join algorithm with strong security guarantees. It combines novel primitives including a fast oblivious shuffle (OPartition-based), an expansion operator, and a PK join technique to realize a general join while ensuring communication and computation obliviousness. The approach achieves linear communication costs in input and output sizes with competitive computation costs, and empirical results on a 16-server cluster show up to sixfold speedups over state-of-the-art baselines and robust performance across datasets and bandwidths. This work advances practical secure analytics by enabling efficient general equi-joins under strong privacy protections in distributed TEEs, reducing the overhead that plagues oblivious data processing.

Abstract

Trusted execution environments (TEEs) provide an isolated and secure environment for building cloud-based analytic systems, but they still suffer from access-pattern leakage caused by side-channel attacks. To better secure the data, computation inside a TEE enclave should be made oblivious, which introduces significant overhead and severely slows down the computation. A natural way to speed it up is to build the analytic system with multiple servers in the distributed setting. However, this setting raises a new security concern: the volumes of the transmissions among these servers can leak sensitive information to a network adversary. Existing works have designed specialized algorithms to address this concern, but their support for equi-join, one of the most important but non-trivial database operators, is either inefficient, limited, or based on a weak security assumption. In this paper, we present Jodes, an efficient oblivious join algorithm in the distributed setting. Jodes prevents leakage on both the network and enclave sides, supports a general equi-join operation, and provides a high level of security that reveals only the input sizes and the output size. Meanwhile, it achieves both communication cost and computation cost asymptotically superior to existing algorithms. To demonstrate the practicality of Jodes, we conduct experiments in a distributed setting comprising 16 servers. Empirical results show that Jodes achieves up to a sixfold performance improvement over state-of-the-art join algorithms.

Paper Structure

This paper contains 51 sections, 3 theorems, 6 equations, 5 figures, 5 tables, 7 algorithms.

Key Result

Theorem 1

Set $U_i=(1+c_i)n_i/p$, where $c_i=\sqrt{2.08\,p(\sigma+2\log p)/n_i}=o(1)$. If the keys of $X[i]$ are all distinct for every $i$, then the shuffle-by-key algorithm fails with probability at most $2^{-\sigma}$.
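
To get a feel for the slack $c_i$ and the bound $U_i$ in Theorem 1, the formula can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function name `padded_bucket_size` is hypothetical, the logarithm is assumed to be base 2 (the theorem does not state the base), and rounding $U_i$ up to an integer is an added assumption.

```python
import math
from typing import Tuple


def padded_bucket_size(n_i: int, p: int, sigma: int) -> Tuple[float, int]:
    """Evaluate c_i and U_i from Theorem 1 for one input table.

    n_i   -- number of tuples in the i-th input table
    p     -- number of servers
    sigma -- statistical security parameter (failure probability 2^-sigma)
    """
    if n_i <= 0 or p <= 0 or sigma < 0:
        raise ValueError("n_i and p must be positive, sigma non-negative")

    # Assumption: log is base 2; the theorem only writes "log p".
    log_p = math.log2(p)

    # c_i = sqrt(2.08 * p * (sigma + 2 log p) / n_i)
    c_i = math.sqrt(2.08 * p * (sigma + 2 * log_p) / n_i)

    # U_i = (1 + c_i) * n_i / p, rounded up here (an assumption of this sketch).
    u_i = math.ceil((1 + c_i) * n_i / p)
    return c_i, u_i


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper's experiments.
    c, u = padded_bucket_size(n_i=2**20, p=16, sigma=40)
    print(f"c_i = {c:.4f}, U_i = {u}, average load n_i/p = {2**20 // 16}")
```

With these illustrative parameters the slack is about 4% above the average load $n_i/p$, and it shrinks as $n_i$ grows, which is consistent with $c_i=o(1)$.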

Figures (5)

  • Figure 1: PK join algorithm example
  • Figure 2: Expansion algorithm example with $M=18$
  • Figure 3: Join algorithm example
  • Figure 4: Computation time of $\mathtt{OPartition}$ varying input size $N$, number of servers $p$, or security parameter $\sigma$.
  • Figure 5: Total time of join, where the labels on top of the bars or adjacent to the data points are communication costs (GB).

Theorems & Definitions (11)

  • Definition 1
  • Definition 2
  • Theorem 1
  • proof
  • Example 1
  • Theorem 2
  • proof
  • Example 2
  • Theorem 3
  • proof
  • ...and 1 more