Spin: An Efficient Secure Computation Framework with GPU Acceleration
Wuxuan Jiang, Xiangjun Song, Shenbai Hong, Haijun Zhang, Wenxin Liu, Bo Zhao, Wei Xu, Yi Li
TL;DR
Spin addresses the tension between privacy and efficiency in multi-party computation (MPC) with a GPU-accelerated framework that supports dishonest-majority security. It introduces novel algorithms for nonlinear functions (reciprocal, exponentiation, logarithm) and attention-specific optimizations to accelerate secure CNN training and Transformer inference, backed by RDMA-enabled communication and hybrid CPU-GPU execution. Empirical results show training speedups of up to $2\times$, improved accuracy relative to the state of the art, and substantial efficiency gains in Transformer inference on a model with 18.9 million parameters. The work demonstrates that secure training and inference are practical for large neural networks, offering concrete hardware-aware optimizations and a pathway to scaling to larger models and clusters.
Abstract
Accuracy and efficiency remain challenges for multi-party computation (MPC) frameworks. Spin is a GPU-accelerated MPC framework that supports multiple computation parties and a dishonest-majority adversarial setting. We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention, the fundamental unit of Transformer models. Together, these allow Spin to perform non-trivial CNN training and Transformer inference without sacrificing security. At the backend level, Spin leverages GPUs, CPUs, and RDMA-enabled smart network cards for acceleration. Comprehensive evaluations demonstrate that Spin can be up to $2\times$ faster than the state of the art for deep neural network training. For inference on a Transformer model with 18.9 million parameters, our attention-specific optimizations enable Spin to achieve better efficiency, less communication, and better accuracy.
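To make the multi-party, dishonest-majority setting concrete, the following is a minimal sketch of additive secret sharing over the ring $\mathbb{Z}_{2^{64}}$ with a fixed-point encoding. This is a common building block in MPC frameworks of this kind; the abstract does not specify Spin's sharing scheme, ring size, or precision, so the ring width, the 16 fractional bits, and all function names here are illustrative assumptions rather than Spin's actual design.

```python
import secrets

# Assumed parameters (not taken from the paper).
RING_BITS = 64
MOD = 1 << RING_BITS
FRAC_BITS = 16


def encode(value: float) -> int:
    """Map a real number to a fixed-point ring element."""
    return round(value * (1 << FRAC_BITS)) % MOD


def decode(element: int) -> float:
    """Map a ring element back to a real number (two's-complement style)."""
    if element >= MOD // 2:
        element -= MOD
    return element / (1 << FRAC_BITS)


def share(secret: int, n_parties: int) -> list[int]:
    """Split a ring element into n additive shares that sum to it mod 2^64.

    Any n-1 shares are uniformly random, so the secret stays hidden
    unless all n parties pool their shares.
    """
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recover the secret by summing all shares mod 2^64."""
    return sum(shares) % MOD


def add_shares(a: list[int], b: list[int]) -> list[int]:
    """Add two shared values; each party adds its own shares locally."""
    return [(ai + bi) % MOD for ai, bi in zip(a, b)]


if __name__ == "__main__":
    x = share(encode(1.5), n_parties=3)
    y = share(encode(-2.25), n_parties=3)
    print(decode(reconstruct(add_shares(x, y))))
```

Addition needs no communication in this scheme, which is why linear layers are comparatively cheap. Non-linear functions such as reciprocal, exponentiation, and logarithm have no such local rule, which is why dedicated protocols for them matter.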
