CryptPEFT: Efficient and Private Neural Network Inference via Parameter-Efficient Fine-Tuning

Saisai Xia; Wenhao Wang; Zihao Wang; Yuhui Zhang; Yier Jin; Dan Meng; Rui Hou

TL;DR

CryptPEFT tackles the privacy-utility trade-off in PEFT-based inference by introducing a private-inference architecture with one-way communication (OWC) that confines encrypted computation to the adapters. It designs an MPC-friendly adapter (LinAtten, with low-rank LoRA integration and ReLU activations) and uses neural architecture search (NAS) to automatically select adapter configurations that meet a target utility while accounting for network-specific latency. Empirical results on Vision Transformer backbones across CIFAR-10/100, SVHN, Food-101, and Flowers-102 show substantial private-inference speedups ($20.62\times$ to $291.48\times$) and competitive accuracy (e.g., $85.47\%$ on CIFAR-100 with $2.26$ s latency), outperforming MPCViT and PEFT baselines. The results suggest the approach is practical for privacy-preserving deployment; extending it to LLMs and multi-modal models is left as future work.

Abstract

Publicly available large pretrained models (i.e., backbones) and lightweight adapters for parameter-efficient fine-tuning (PEFT) have become standard components in modern machine learning pipelines. However, preserving the privacy of both user inputs and fine-tuned adapters, which are often trained on sensitive data, during inference remains a significant challenge. Applying cryptographic techniques, such as multi-party computation (MPC), to PEFT settings still incurs substantial encrypted computation across both the backbone and adapter, mainly due to the inherent two-way communication between them. To address this limitation, we propose CryptPEFT, the first PEFT solution specifically designed for private inference scenarios. CryptPEFT introduces a novel one-way communication (OWC) architecture that confines encrypted computation solely to the adapter, significantly reducing both computational and communication overhead. To maintain strong model utility under this constraint, we explore the design space of OWC-compatible adapters and employ an automated architecture search algorithm to optimize the trade-off between private inference efficiency and model utility. We evaluate CryptPEFT using Vision Transformer backbones across widely used image classification datasets. Our results show that CryptPEFT significantly outperforms existing baselines, delivering speedups ranging from $20.62\times$ to $291.48\times$ in simulated wide-area network (WAN) and local-area network (LAN) settings. On CIFAR-100, CryptPEFT attains $85.47\%$ accuracy with just $2.26$ seconds of inference latency. These findings demonstrate that CryptPEFT offers an efficient and privacy-preserving solution for modern PEFT-based inference.
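
The contrast between two-way and one-way communication described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the layer functions, the residual wiring, and the way "encrypted" blocks are counted are all assumptions made for illustration, and no real MPC is performed. The sketch only shows why feeding adapter outputs back into the backbone forces later backbone blocks into the encrypted domain, while a one-way data flow leaves the backbone in plaintext.

```python
# Toy illustration only. Layers are plain functions on lists of floats, and
# "encrypted" work is modeled by a counter, not by real MPC.

def backbone_layer(x, scale):
    """Stand-in for a frozen, public backbone block (hypothetical)."""
    return [scale * v + 1.0 for v in x]


def adapter_layer(x, weight):
    """Stand-in for a private adapter block with ReLU (hypothetical)."""
    return [max(0.0, weight * v) for v in x]


def two_way_inference(x, scales, weights):
    """Adapter output is added back into the backbone's hidden state.

    Once private adapter output is mixed in, every later backbone block
    depends on private data, so this toy model counts it as encrypted.
    """
    encrypted_blocks = 0
    hidden = x
    tainted = False  # True once adapter output has entered the backbone
    for scale, weight in zip(scales, weights):
        hidden = backbone_layer(hidden, scale)
        if tainted:
            encrypted_blocks += 1
        adapted = adapter_layer(hidden, weight)
        encrypted_blocks += 1  # adapter blocks are always private
        hidden = [h + a for h, a in zip(hidden, adapted)]
        tainted = True
    return hidden, encrypted_blocks


def one_way_inference(x, scales, weights):
    """Backbone runs end to end with no feedback from the adapter.

    Features flow only from backbone to adapter, so only adapter blocks
    are counted as encrypted.
    """
    features = []
    hidden = x
    for scale in scales:
        hidden = backbone_layer(hidden, scale)  # plaintext, never tainted
        features.append(hidden)

    encrypted_blocks = 0
    state = [0.0] * len(x)
    for feat, weight in zip(features, weights):
        state = adapter_layer([f + s for f, s in zip(feat, state)], weight)
        encrypted_blocks += 1
    return state, encrypted_blocks


if __name__ == "__main__":
    x = [1.0, -2.0]
    scales = [0.5, 0.5, 0.5]
    weights = [1.0, 1.0, 1.0]
    print("two-way encrypted blocks:", two_way_inference(x, scales, weights)[1])
    print("one-way encrypted blocks:", one_way_inference(x, scales, weights)[1])
```

In this toy model with $n$ layers, the two-way path counts $2n-1$ encrypted blocks and the one-way path counts $n$. The actual gap in a real system would depend on the relative cost of backbone and adapter blocks under MPC, which this sketch does not model.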
