Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing

Mohammad Malekzadeh; Fahim Kawsar

TL;DR

This work addresses a privacy gap in split inference: protecting not only the input data but also the semantic interpretation of the DNN's outputs. It introduces Salted DNNs, which embed a secret salt into an early layer to permute the semantics of the outputs via a mapping, so the client controls the meaning of predictions without sacrificing accuracy or efficiency. Empirical results on CIFAR10 and PAMAP2 with LeNet, WideResNet, and ConvNet show that Salted DNNs achieve accuracy close to that of standard models when the salted layer is placed early, while adding only a small architectural overhead and keeping split inference practical. The approach, supported by open-source code, provides a general mechanism for output privacy in edge-cloud ML deployments. Directions identified for future work include formal privacy guarantees and scalability to larger class sets.

Abstract

In split inference, a deep neural network (DNN) is partitioned so that the early part runs at the edge and the later part runs in the cloud. This meets two key requirements for on-device machine learning: input privacy and computation efficiency. Still, an open question in split inference is output privacy, given that the outputs of the DNN are observable in the cloud. While encrypted computing can protect output privacy too, homomorphic encryption requires substantial computation and communication resources from both edge and cloud devices. In this paper, we introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time. Our proposed Salted DNNs maintain classification accuracy and computation efficiency very close to their standard DNN counterparts. Experimental evaluations on both images and wearable sensor data demonstrate that this holds particularly when the Salted Layer is positioned within the early part of the DNN, as split inference requires. Our approach is general and can be applied to various types of DNNs. As a benchmark for future studies, we open-source our code.
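To make "controlling the semantic interpretation of the outputs" concrete, the following is a minimal sketch of how a client-held salt could rearrange class positions and how the client could decode them. The cyclic-shift mapping and the names `salted_index`, `decode`, and `pick_salt` are assumptions chosen for illustration; this summary does not specify the mapping the paper actually uses.

```python
import secrets

NUM_CLASSES = 10  # e.g. CIFAR10 has 10 classes


def pick_salt(num_classes: int = NUM_CLASSES) -> int:
    """Client side: draw a fresh secret salt for one inference."""
    return secrets.randbelow(num_classes)


def salted_index(true_label: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Output position that represents `true_label` under `salt`.

    Assumption: a cyclic shift, used here only as an illustrative mapping.
    """
    return (true_label + salt) % num_classes


def decode(observed_index: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Client side: recover the true label from the position seen in the output."""
    return (observed_index - salt) % num_classes


if __name__ == "__main__":
    salt = pick_salt()
    position = salted_index(3, salt)      # what the server could observe
    print(decode(position, salt))         # the client recovers label 3
```

Under this assumed mapping, a server that sees only the winning output position cannot tell which class it stands for without the salt, while the client decodes it with a single modular subtraction.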

Paper Structure (13 sections, 4 equations, 2 figures, 1 table, 2 algorithms)

Introduction
Background and Objective
Our Methodology
Salted DNNs
Training and Inference Algorithms
Threat Model
Evaluation
Experimental Set-up
Salted Layer Implementation
Evaluation Criteria
Results
Limitations and Future work
Conclusion

Figures (2)

Figure 1: The overview of our salted DNN for efficient and private split inference. The early part of DNN $\theta^1$ processes input data $X$ on the client's device. After processing up to the cut layer, the intermediate representations $Z=\theta^1(X)$ are transmitted to the server for further computation using the later part of the DNN. The DNN's outputs $Y=\theta^2(Z)$ are transmitted back to the client. The client controls the semantic arrangement of the outputs through the selected salt value $s$, making it the only party capable of decoding the meaning of the outputs.
Figure 2: A visual representation of a Salted Layer. A transposed convolutional layer expands the chosen salt $s$ into an output that matches the dimension of the salted layer.
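The expansion step described in the Figure 2 caption can be sketched in plain Python as a single-channel 1D transposed convolution. This is a simplified illustration, not the authors' implementation: the one-hot encoding of the salt, the 1D single-channel setting, the element-wise addition used to combine the expanded salt with the layer's activations, and all function names are assumptions.

```python
def one_hot(salt: int, num_salts: int) -> list[float]:
    """Encode the salt as a one-hot vector (assumed encoding)."""
    return [1.0 if i == salt else 0.0 for i in range(num_salts)]


def conv_transpose_1d(x: list[float], kernel: list[float], stride: int) -> list[float]:
    """Single-channel 1D transposed convolution without padding.

    Each input element scatters a scaled copy of the kernel into the
    output, so the output length is (len(x) - 1) * stride + len(kernel).
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            out[i * stride + k] += xi * wk
    return out


def salt_feature_map(salt: int, num_salts: int,
                     kernel: list[float], stride: int) -> list[float]:
    """Expand the salt into a map sized to match the salted layer."""
    return conv_transpose_1d(one_hot(salt, num_salts), kernel, stride)


def add_salt(activations: list[float], salt_map: list[float]) -> list[float]:
    """Combine activations with the expanded salt (assumed: element-wise sum)."""
    if len(activations) != len(salt_map):
        raise ValueError("salt map must match the layer dimension")
    return [a + m for a, m in zip(activations, salt_map)]
```

In a trained model the kernel weights would be learned jointly with the rest of the DNN. Here they are plain arguments, which makes it easy to see that the kernel size and stride determine whether the expanded salt matches the dimension of the chosen layer.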
