Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing
Mohammad Malekzadeh, Fahim Kawsar
TL;DR
This work tackles a privacy gap in split inference by protecting not only the input data but also the semantic interpretation of the DNN's outputs. It introduces Salted DNNs, which feed a secret salt into an early layer so that the meaning of each output position is permuted according to a salt-dependent mapping. The client, who holds the salt, thereby controls how predictions are interpreted without sacrificing accuracy or efficiency. Empirical results on CIFAR10 and PAMAP2 with LeNet, WideResNet, and ConvNet show that Salted DNNs achieve accuracy close to that of standard models when the salted layer is placed early, while adding only a small architectural overhead and keeping split inference practical. The approach, supported by open-source code, provides a general mechanism for output privacy in edge-cloud ML deployments. Formal privacy guarantees and scalability to larger class sets are identified as directions for future work.
Abstract
In split inference, a deep neural network (DNN) is partitioned so that the early part of the DNN runs at the edge and the later part runs in the cloud. This meets two key requirements for on-device machine learning: input privacy and computation efficiency. Still, an open question in split inference is output privacy, given that the outputs of the DNN are observable in the cloud. While encrypted computing can also protect output privacy, homomorphic encryption requires substantial computation and communication resources from both edge and cloud devices. In this paper, we introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time. Our proposed Salted DNNs maintain classification accuracy and computation efficiency very close to their standard DNN counterparts. Experimental evaluations on both image and wearable sensor data confirm this, particularly when the Salted Layer is positioned within the early part of the DNN to meet the requirements of split inference. Our approach is general and can be applied to various types of DNNs. As a benchmark for future studies, we open-source our code.
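To illustrate what controlling the semantic interpretation of the outputs could look like, the following minimal sketch assumes a circular-shift mapping between true class indices and output positions. The mapping itself, the function names `salted_index` and `recover_label`, and the choice of ten classes are assumptions made for illustration; they are not taken from the text above, and the paper's actual mapping and the way the salt enters the early layer may differ.

```python
import random

NUM_CLASSES = 10  # assumption: a 10-class task such as CIFAR10


def salted_index(true_label: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Output position a salted model would be trained to activate.

    Hypothetical mapping: shift the true label by the salt, modulo the
    number of classes.
    """
    return (true_label + salt) % num_classes


def recover_label(observed_index: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Invert the mapping on the client side, where the salt is known."""
    return (observed_index - salt) % num_classes


def demo(true_label: int = 3, seed: int = 0) -> tuple:
    """Simulate one inference: the client picks a salt, the cloud sees only
    the salted index, and the client recovers the true label."""
    rng = random.Random(seed)
    salt = rng.randrange(NUM_CLASSES)          # secret, chosen at the edge
    observed = salted_index(true_label, salt)  # what the cloud observes
    recovered = recover_label(observed, salt)  # what the client decodes
    return salt, observed, recovered


if __name__ == "__main__":
    salt, observed, recovered = demo()
    print(f"salt={salt} observed={observed} recovered={recovered}")
```

Under this assumed mapping, the cloud sees only the salted index, which on its own does not identify the true class when the salt is unknown and drawn fresh per inference, while the client recovers the label with a single modular subtraction.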
