GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference
Qifan Wang, Shujie Cui, Lei Zhou, Ye Dong, Jianli Bai, Yun Sing Koh, Giovanni Russello
TL;DR
GTree is the first GPU-accelerated framework for privacy-preserving decision-tree (DT) training and inference, built on three-party replicated secret sharing. By encoding the DT and data as arrays and using oblivious, GPU-friendly protocols (including Oblivious Array Access and layer-wise training), it hides the data, tree structure, access patterns, and statistics. Experiments show substantial training speedups over CPU-based baselines (≈11× on SPECT and ≈21× on Adult) and strong inference performance for shallow trees (depth < 10). GTree is secure against semi-honest adversaries and leaks less than prior work, revealing only the tree depth and the size of the data samples. The work demonstrates the practical viability of GPU-enabled privacy-preserving DTs and points to future improvements in ORAM-based access, continuous features, and broader data-type support.
Abstract
The decision tree (DT) is a widely used machine learning model due to its versatility, speed, and interpretability. However, for privacy-sensitive applications, outsourcing DT training and inference to cloud platforms raises concerns about data privacy. Researchers have developed privacy-preserving approaches for DT training and inference using cryptographic primitives such as Secure Multi-Party Computation (MPC). While these approaches have shown progress, they still suffer from heavy computation and communication overheads. A few recent works employ Graphics Processing Units (GPUs) to improve the performance of MPC-protected deep learning. This raises a natural question: \textit{can MPC-protected DT training and inference be accelerated by GPU?} We present GTree, the first scheme that uses GPUs to accelerate MPC-protected DT training and inference. GTree is built across three parties who securely and jointly perform each step of DT training and inference on GPUs. Each MPC protocol in GTree is designed to be GPU-friendly. The performance evaluation shows that GTree achieves ${\thicksim}11{\times}$ and ${\thicksim}21{\times}$ improvements in training on the SPECT and Adult datasets, respectively, compared to the most efficient prior CPU-based work. For inference, GTree is most efficient when the DT has fewer than 10 levels: for example, it is $126\times$ faster than the most efficient prior work when performing inference on $10^4$ instances with a 7-level tree. GTree also achieves a stronger security guarantee than prior solutions: it leaks only the tree depth and the size of the data samples, whereas prior solutions also leak the tree structure. With \textit{oblivious array access}, the access pattern on the GPU is also protected.
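To make the two building blocks named above more concrete, the following is a minimal Python sketch, not GTree's actual protocols. It assumes a ring of size $2^{32}$ and uses hypothetical helper names (`share`, `reconstruct`, `add_shared`, `oblivious_read`). The first part shows three-party replicated secret sharing, where each party holds two of three additive shares. The second part illustrates the idea behind an oblivious array read as a linear scan that touches every slot regardless of the index. The scan runs on plaintext values purely to show the data-independent access pattern; in an MPC setting the equality test and the multiplication would themselves be secure protocols over shares.

```python
import secrets

MOD = 2 ** 32  # ring Z_{2^32}; the modulus is an assumption for this sketch


def share(x):
    """Split x so that s0 + s1 + s2 = x (mod MOD); party i holds (s_i, s_{i+1})."""
    s = [secrets.randbelow(MOD), secrets.randbelow(MOD)]
    s.append((x - s[0] - s[1]) % MOD)
    return [(s[i], s[(i + 1) % 3]) for i in range(3)]


def reconstruct(parties, i=0):
    """Recover x from party i and party i+1, which together hold all three shares."""
    a, b = parties[i], parties[(i + 1) % 3]
    return (a[0] + b[0] + b[1]) % MOD


def add_shared(xs, ys):
    """Addition is local: each party adds its own pair of shares component-wise."""
    return [((x0 + y0) % MOD, (x1 + y1) % MOD)
            for (x0, x1), (y0, y1) in zip(xs, ys)]


def oblivious_read(array, index):
    """Plaintext illustration of a data-independent read: every slot is touched."""
    result = 0
    for j, value in enumerate(array):
        hit = int(j == index)  # in MPC this would be a secure equality test
        result = (result + hit * value) % MOD
    return result


if __name__ == "__main__":
    x, y = share(42), share(100)
    print(reconstruct(add_shared(x, y)))        # sum of the two secrets
    print(oblivious_read([10, 20, 30, 40], 2))  # element at index 2
```

Any two parties can reconstruct a secret, while a single party's pair of shares is uniformly random on its own. The linear scan costs work proportional to the array length per access, but every iteration is identical and independent of the others, which is the kind of uniform, parallel workload that suits a GPU.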
