DeepShare: Sharing ReLU Across Channels and Layers for Efficient Private Inference

Yonathan Bornfeld; Shai Avidan

TL;DR

DeepShare shares DReLU gates (the step-function component of the ReLU) across channels and layers to sharply reduce the number of nonlinear operations in Private Inference (PI) while preserving accuracy. Channels are partitioned into prototype and replicate groups, and the sharing is then extended across layers. The method achieves strong Pareto-frontier performance on CIFAR-100 with ResNet-18 and state-of-the-art results on segmentation, while remaining practical for cryptographic PI. The authors also provide a theoretical construction showing that a single DReLU can express complex decision boundaries, addressing expressiveness concerns raised by prior ReLU-pruning methods. The approach relies on a GELU-to-ReLU transitional training protocol to enable gradient flow and uses affine gate transformations to maintain expressiveness, offering a practical path to more scalable private inference systems.

Abstract

Private Inference (PI) uses cryptographic primitives to perform privacy-preserving machine learning. In this setting, the owner of the network runs inference on the data of the client without learning anything about the data and without revealing any information about the model. It has been observed that a major computational bottleneck of PI is the calculation of the gate (i.e., ReLU), so a considerable amount of effort has been devoted to reducing the number of ReLUs in a given network. We focus on the DReLU, which is the non-linear step function of the ReLU, and show that one DReLU can serve many ReLU operations. We suggest a new activation module where the DReLU operation is only performed on a subset of the channels (prototype channels), while the rest of the channels (replicate channels) replicate the DReLU of each of their neurons from the corresponding neurons in one of the prototype channels. We then extend this idea to work across different layers. We show that this formulation can drastically reduce the number of DReLU operations in ResNet-type networks. Furthermore, our theoretical analysis shows that this new formulation can solve an extended version of the XOR problem using just one non-linearity and two neurons, something that traditional formulations and some PI-specific methods cannot achieve. We achieve new SOTA results on several classification setups and on image segmentation.
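The prototype/replicate idea described in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation. The function names, the `prototype_of` encoding, and the toy values are assumptions made for illustration, and the sketch covers channel sharing within a single layer only, in the clear, with no cryptographic protocol.

```python
def drelu(x):
    """Step function of the ReLU: 1 where x > 0, else 0."""
    return 1 if x > 0 else 0


def shared_relu(channels, prototype_of):
    """Apply ReLU-style gating with DReLU gates shared across channels.

    channels:     list of channels, each a flat list of pre-activations
    prototype_of: prototype_of[c] is the index of the channel whose gate
                  channel c reuses; a prototype channel maps to itself
                  (hypothetical encoding, chosen for this sketch)
    Returns (outputs, number_of_drelu_evaluations).
    """
    gates = {}
    for c, p in enumerate(prototype_of):
        if c == p:  # prototype channel: the only place DReLU is evaluated
            gates[c] = [drelu(v) for v in channels[c]]

    outputs = []
    for c, p in enumerate(prototype_of):
        # Every channel multiplies its own values by its prototype's gate.
        outputs.append([g * v for g, v in zip(gates[p], channels[c])])

    drelu_count = sum(len(g) for g in gates.values())
    return outputs, drelu_count


# Toy input: 4 channels with 3 neurons each (illustrative values only).
channels = [
    [0.5, -1.0, 2.0],   # channel 0: prototype
    [-0.3, 0.7, 1.5],   # channel 1: replicates the gate of channel 0
    [-2.0, 4.0, -1.0],  # channel 2: prototype
    [1.0, 1.0, 1.0],    # channel 3: replicates the gate of channel 2
]
prototype_of = [0, 0, 2, 2]

outputs, drelu_count = shared_relu(channels, prototype_of)
print(drelu_count)  # 6 DReLU evaluations instead of 12 for a per-neuron ReLU
print(outputs[1])
```

In this toy setup, a replicate channel can pass a negative value (the first neuron of channel 1) or zero out a positive one (its second neuron), because its gate is decided by the corresponding neuron of the prototype channel rather than by its own sign.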

Paper Structure

Table of Contents

Key Result

Figures (10)

Theorems & Definitions (2)