AdaPI: Facilitating DNN Model Adaptivity for Efficient Private Inference in Edge Computing

Tong Zhou, Jiahui Zhao, Yukui Luo, Xi Xie, Wujie Wen, Caiwen Ding, Xiaolin Xu

TL;DR

AdaPI tackles the problem of private inference on edge devices with varying energy budgets by learning a single DNN augmented with weight-level and feature-level soft masks, which are converted into nested binary masks to adapt computation and communication workloads without retraining per budget. It introduces a triple optimization objective, a soft-mask framework with an indicator function, and a sequential multi-mask training strategy that preserves accuracy across budgets via STE and knowledge distillation. The paper also develops a unified latency-based metric bridging MACs to ReLU counts and provides detailed 2PC-Conv and 2PC-ReLU latency models to compare PI approaches fairly. Empirically, AdaPI demonstrates strong performance, achieving up to 7.3% higher accuracy on CIFAR-100 over SOTA PI methods and enabling efficient PI across diverse energy budgets on CIFAR-10/100 and Tiny-ImageNet, with code released for reproducibility.
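The role of the STE mentioned above can be shown with a toy example. This is a minimal sketch, not the paper's training code: it assumes the indicator function thresholds each soft-mask score at zero, uses a single linear unit with a squared-error loss, and all names (`indicator`, `ste_step`) and numbers are hypothetical. The point is only that the straight-through estimator copies the gradient of the binary mask onto the soft scores, so a masked-out position can still be updated.

```python
from typing import List, Tuple


def indicator(scores: List[float], threshold: float = 0.0) -> List[float]:
    """Binarize soft-mask scores: 1.0 where score > threshold, else 0.0."""
    return [1.0 if s > threshold else 0.0 for s in scores]


def ste_step(
    weights: List[float],
    scores: List[float],
    x: List[float],
    target: float,
    lr: float = 0.1,
) -> Tuple[List[float], float]:
    """One gradient step on the soft-mask scores using a straight-through estimator."""
    mask = indicator(scores)
    # Forward pass uses the binary mask, not the soft scores.
    y = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
    err = y - target
    # Gradient of the loss err**2 with respect to each binary mask entry.
    grad_mask = [2.0 * err * w * xi for w, xi in zip(weights, x)]
    # STE (assumption of this sketch): treat d(mask)/d(score) as 1, so the
    # mask gradient is applied directly to the soft scores.
    new_scores = [s - lr * g for s, g in zip(scores, grad_mask)]
    return new_scores, y


# Hypothetical toy values: the second position starts masked out.
weights = [1.0, 2.0]
scores = [0.5, -0.1]
x = [1.0, 1.0]
new_scores, y_before = ste_step(weights, scores, x, target=3.0)
mask_after = indicator(new_scores)
```

Without the straight-through approximation, the gradient of the indicator is zero almost everywhere and the second score would never change; with it, the score crosses the threshold after one step and the position is re-enabled.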

Abstract

Private inference (PI) has emerged as a promising solution to execute computations on encrypted data, safeguarding user privacy and model parameters in edge computing. However, existing PI methods are predominantly developed considering constant resource constraints, overlooking the varied and dynamic resource constraints in diverse edge devices, like energy budgets. Consequently, model providers have to design specialized models for different devices, where all of them have to be stored on the edge server, resulting in inefficient deployment. To fill this gap, this work presents AdaPI, a novel approach that achieves adaptive PI by allowing a model to perform well across edge devices with diverse energy budgets. AdaPI employs a PI-aware training strategy that optimizes the model weights alongside weight-level and feature-level soft masks. These soft masks are subsequently transformed into multiple binary masks to enable adjustments in communication and computation workloads. Through sequentially training the model with increasingly dense binary masks, AdaPI attains optimal accuracy for each energy budget, which outperforms the state-of-the-art PI methods by 7.3% in terms of test accuracy on CIFAR-100. The code of AdaPI can be accessed via https://github.com/jiahuiiiiii/AdaPI.
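The conversion of one soft mask into several binary masks of increasing density can be sketched as follows. This is an illustration under stated assumptions, not the released implementation: it assumes each binary mask keeps the highest-scoring positions of the soft mask (a top-k rule chosen here for simplicity), and the function names, score values, and the four density levels are hypothetical.

```python
from typing import Dict, List, Sequence


def nested_binary_masks(
    scores: Sequence[float], densities: Sequence[float]
) -> Dict[float, List[int]]:
    """Build one binary mask per density by keeping the top-scoring positions."""
    # Rank positions once, highest score first. Reusing the same ranking for
    # every density is what makes the resulting masks nested.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    masks: Dict[float, List[int]] = {}
    for d in densities:
        k = int(round(d * len(scores)))
        keep = set(order[:k])
        masks[d] = [1 if i in keep else 0 for i in range(len(scores))]
    return masks


def is_nested(sparse: Sequence[int], dense: Sequence[int]) -> bool:
    """True if every position kept by the sparse mask is also kept by the dense one."""
    return all(s <= t for s, t in zip(sparse, dense))


# Hypothetical soft-mask scores and density levels (sparsest to densest).
scores = [0.9, 0.1, 0.4, 0.7, 0.3, 0.8, 0.2, 0.6]
masks = nested_binary_masks(scores, [0.25, 0.5, 0.75, 1.0])
```

Because every mask is cut from the same ranking, a sparser mask is always a subset of a denser one, which is what would let a single set of weights be trained sequentially from the sparsest mask to the densest.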

Paper Structure (21 sections, 18 equations, 4 figures, 5 tables, 1 algorithm)

This paper contains 21 sections, 18 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: The inference process for AdaPI. For an edge device with a low energy budget, low-density weight and feature masks are chosen for the server-side model. The masked model is also encrypted and installed on the edge device. The inference process is divided between the edge server and the device, ensuring user privacy and model confidentiality.
  • Figure 2: Overview of AdaPI with ResNet-18. The process involves the following steps: (1) Calculation of the loss using Eq. \ref{eq:problem}. (2) Optimization of the weights and soft masks under the lowest weight/ReLU budgets by solving a triple optimization problem with backpropagation. (3) Conversion of the soft masks to binary masks (L1-L4 represent different levels of energy budget; the white regions of the masks indicate the weights/ReLU operations that are preserved). (4) Sequential training with masks from low density (associated with L4) to high density (associated with L1).
  • Figure 3: For each dataset, AdaPI achieves Pareto frontiers of the normalized ReLU count vs. test accuracy with a single set of weights, with the density level from left to right being L4, L3, L2, and L1. In contrast, SOTA methods design specialized models for every scenario. AdaPI_rs18 and AdaPI_ws_22_8 denote AdaPI on ResNet-18 and WideResNet-22-8, respectively.
  • Figure 4: Performance comparison of AdaPI-single with AdaPI and SNL on CIFAR-100.