Securing Neural Networks with Knapsack Optimization

Yakir Gorski, Amir Jevnisek, Shai Avidan

TL;DR

This paper tackles Private Inference for neural networks by targeting the dominant cost in secure computation: non-linear ReLUs. It introduces Block-ReLU, a patch-based activation that replaces many per-pixel DReLUs with a single DReLU computed on a neighborhood, and selects optimal patch sizes per channel via a Multiple-Choice Knapsack optimization, guided by a distortion proxy. The method is evaluated in a semi-honest 3-party SecureNN setting across CIFAR-100, ImageNet, and semantic segmentation benchmarks, achieving competitive accuracy with substantial reductions in runtime and bandwidth, including the first secure semantic segmentation results. The work also includes approximate DReLU and architectural adjustments (e.g., MaxPool to AveragePool, ReLU6 to ReLU) to further accelerate secure inference, and provides open-source tooling for integration with OpenMMLab models.

Abstract

MLaaS Service Providers (SPs) holding a Neural Network would like to keep the Neural Network weights secret. On the other hand, users wish to utilize the SPs' Neural Network for inference without revealing their data. Multi-Party Computation (MPC) offers a solution to achieve this. Computations in MPC involve communication, as the parties send data back and forth. Non-linear operations are usually the main bottleneck requiring the bulk of communication bandwidth. In this paper, we focus on ResNets, which serve as the backbone for many Computer Vision tasks, and we aim to reduce their non-linear components, specifically, the number of ReLUs. Our key insight is that spatially close pixels exhibit correlated ReLU responses. Building on this insight, we replace the per-pixel ReLU operation with a ReLU operation per patch. We term this approach 'Block-ReLU'. Since different layers in a Neural Network correspond to different feature hierarchies, it makes sense to allow patch-size flexibility for the various layers of the Neural Network. We devise an algorithm to choose the optimal set of patch sizes through a novel reduction of the problem to the Knapsack Problem. We demonstrate our approach in the semi-honest secure 3-party setting for four problems: Classifying ImageNet using ResNet50 backbone, classifying CIFAR100 using ResNet18 backbone, Semantic Segmentation of ADE20K using MobileNetV2 backbone, and Semantic Segmentation of Pascal VOC 2012 using ResNet50 backbone. Our approach achieves competitive performance compared to a handful of competitors. Our source code is publicly available: https://github.com/yg320/secure_inference.
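The patch-size selection described in the abstract can be illustrated with a small Multiple-Choice Knapsack solver. This is a minimal sketch, not the authors' implementation: it assumes each channel offers a list of candidate patch sizes, each with an integer DReLU cost and a distortion score, and it picks exactly one candidate per channel so that total distortion is minimal within a DReLU budget. The function name `multiple_choice_knapsack` and all numbers in the example are hypothetical.

```python
def multiple_choice_knapsack(groups, budget):
    """Pick one (cost, distortion) option per group within an integer budget.

    groups: list of lists of (cost, distortion) tuples, one list per channel.
    Returns (total_distortion, chosen_indices), or None if no choice fits.
    """
    inf = float("inf")
    # best[b] = minimal distortion over the groups seen so far with cost <= b.
    best = [0.0] * (budget + 1)
    picks = []  # picks[g][b] = option index chosen for group g at budget b
    for options in groups:
        new_best = [inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            for idx, (cost, distortion) in enumerate(options):
                if cost <= b and best[b - cost] + distortion < new_best[b]:
                    new_best[b] = best[b - cost] + distortion
                    pick[b] = idx
        best = new_best
        picks.append(pick)
    if best[budget] == inf:
        return None
    # Walk backwards through the groups to recover the chosen options.
    chosen = []
    b = budget
    for g in range(len(groups) - 1, -1, -1):
        idx = picks[g][b]
        chosen.append(idx)
        b -= groups[g][idx][0]
    chosen.reverse()
    return best[budget], chosen


# Hypothetical example: two channels on a 4x4 feature map.
# Options per channel: 1x1 patches (16 DReLUs), 2x2 (4 DReLUs), 4x4 (1 DReLU).
channel_a = [(16, 0.0), (4, 0.5), (1, 2.0)]  # tolerates coarse patches well
channel_b = [(16, 0.0), (4, 3.0), (1, 9.0)]  # degrades quickly
result = multiple_choice_knapsack([channel_a, channel_b], budget=20)
```

With a budget of 20 DReLUs, the solver keeps the sensitive channel at full resolution and coarsens the tolerant one. The dynamic program runs in time proportional to the budget times the total number of options, so a real network with many channels would likely need a coarser cost unit or a faster solver.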

Paper Structure (23 sections, 7 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Task Performance vs DReLU budget: (a) Classification results for CIFAR-100 and ImageNet. We show competitive results with respect to the alternatives. (b) Semantic Segmentation results. There are no published results for Secure Semantic Segmentation to compare against.
  • Figure 2: Activation Statistics for MobileNetV2: The probability that two activation units in the same channel have the same sign, based on their spatial distance. The lowest probability is about $0.7=70\%$.
  • Figure 3: Block-ReLU (bReLU): Left: $6 \times 6$ channel input, Right: bReLU output with a patch size of $2 \times 3$. We reduce $36$ DReLU operations to just $6$, at the cost of $12$ sign flips.
  • Figure 4: Secure Inference: Runtime (a) and bandwidth consumption (b) improvements are plotted against the task-performance operating point. The baseline model is the pre-trained version with all ReLUs at full ReLU resolution. Our evaluation employs the SecureNN protocol and is conducted in a cloud environment. Specific runtime and bandwidth consumption values are given explicitly for the operating points marked with larger circles. Panels (c) and (d) show the contribution of each component to the Secure Inference runtime and bandwidth consumption. Solid lines are evaluations with approximate DReLU, while dashed lines indicate the network with bReLUs evaluated without approximate DReLU.
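
The operation counts in the Figure 3 caption can be reproduced with a plain-Python sketch of a patch-wise ReLU. It makes one assumption that is not stated in the text above: the single decision per patch is taken from the sign of the patch sum (equivalently, the patch mean). The helper name `block_relu` is hypothetical, and the sketch runs in the clear, without any secure-computation protocol.

```python
def block_relu(channel, patch_h, patch_w):
    """Apply one keep/zero decision per patch to a 2-D list of floats.

    Returns (output, drelu_calls), where drelu_calls counts the number of
    sign decisions made, one per patch.
    """
    height, width = len(channel), len(channel[0])
    out = [[0.0] * width for _ in range(height)]
    drelu_calls = 0
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            rows = range(top, min(top + patch_h, height))
            cols = range(left, min(left + patch_w, width))
            # Assumption: the shared decision is the sign of the patch sum.
            patch_sum = sum(channel[r][c] for r in rows for c in cols)
            keep = patch_sum > 0
            drelu_calls += 1
            for r in rows:
                for c in cols:
                    out[r][c] = channel[r][c] if keep else 0.0
    return out, drelu_calls


# A 6x6 channel with 2x3 patches needs 3 * 2 = 6 decisions instead of 36.
channel = [[(r * 6 + c) - 17.5 for c in range(6)] for r in range(6)]
blocked, calls = block_relu(channel, 2, 3)
```

With a patch size of 1 x 1 the function reduces to the ordinary per-pixel ReLU, which is a convenient sanity check for any patch-based variant.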