Privacy in Cloud Computing through Immersion-based Coding

Haleh Hayati, Nathan van de Wouw, Carlos Murguia

TL;DR

The paper tackles privacy in cloud-based dynamic algorithms by introducing immersion-based coding, which jointly designs a data-encoding map, a higher-dimensional target algorithm, and a decoding step that recovers the true utility. It constructs a prescriptive framework based on affine random coding that immerses the trajectories of the original algorithm into a larger system, so that the cloud processes only encoded data, with differential privacy guarantees and no loss in algorithmic utility. The main contributions are a concrete affine immersion design, a differential privacy (DP) guarantee with per-element bounds, a two-time-scale extension, and two case studies (optimization/learning and networked control) that demonstrate privacy preservation with negligible performance overhead. The approach offers a scalable alternative to cryptographic privacy methods that operates directly on real-valued data, balancing privacy and utility while keeping computation and communication practical for large-scale dynamic systems.

Abstract

Cloud computing enables users to process and store data remotely on high-performance computers and servers by sharing data over the Internet. However, transferring data to clouds raises unavoidable privacy concerns. Here, we present a synthesis framework to design coding mechanisms that allow sharing and processing data in a privacy-preserving manner without sacrificing data utility and algorithmic performance. We consider the setup where the user aims to run an algorithm in the cloud using private data. The cloud then returns some data utility back to the user (utility refers to the service that the algorithm provides, e.g., classification, prediction, AI models, etc.). To avoid privacy concerns, the proposed scheme provides tools to co-design: 1) coding mechanisms to distort the original data and guarantee a prescribed differential privacy level; 2) an equivalent-but-different algorithm (referred to here as the target algorithm) that runs on distorted data and produces distorted utility; and 3) a decoding function that extracts the true utility from the distorted one with a negligible error. Then, instead of sharing the original data and algorithm with the cloud, only the distorted data and target algorithm are disclosed, thereby avoiding privacy concerns. The proposed scheme is built on the synergy of differential privacy and system immersion tools from control theory. The key underlying idea is to design a higher-dimensional target algorithm that embeds all trajectories of the original algorithm and works on randomly encoded data to produce randomly encoded utility. We show that the proposed scheme can be designed to offer any level of differential privacy without degrading the algorithm's utility. We present two use cases to illustrate the performance of the developed tools: privacy in optimization/learning algorithms and a nonlinear networked control system.
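
The encode, run, decode pipeline described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the paper's construction: it assumes a static linear algorithm `u = A @ y`, orthonormal encoding matrices, and Laplace-distributed randomness. All names (`orthogonal_split`, `Pi1`, `N1`, `Pi3`, `N3`, `K`, `M`) and all dimensions are hypothetical choices made for illustration. It demonstrates only the exact-recovery property of an affine immersion; it does not by itself establish any differential privacy level.

```python
import numpy as np

rng = np.random.default_rng(0)


def orthogonal_split(n_big, n_small, rng):
    """Return (Pi, N) taken from one random orthogonal matrix.

    Pi (n_big x n_small) has orthonormal columns, and N spans the
    orthogonal complement, so Pi.T @ Pi = I and Pi.T @ N = 0.
    """
    q, _ = np.linalg.qr(rng.standard_normal((n_big, n_big)))
    return q[:, :n_small], q[:, n_small:]


n_y, n_u = 3, 2      # dimensions of the original data and utility (assumed)
nt_y, nt_u = 5, 4    # larger dimensions of the encoded data and utility (assumed)

A = rng.standard_normal((n_u, n_y))   # toy "original algorithm": u = A @ y
y = rng.standard_normal(n_y)          # private data held by the user

Pi1, N1 = orthogonal_split(nt_y, n_y, rng)   # data-encoding matrices
Pi3, N3 = orthogonal_split(nt_u, n_u, rng)   # utility-encoding matrices
K = rng.standard_normal((nt_u - n_u, nt_y - n_y))  # mixes randomness into the output

# User side: affine random encoding of the private data.
s = rng.laplace(scale=10.0, size=nt_y - n_y)
y_enc = Pi1 @ y + N1 @ s

# Cloud side: only the composite matrix M and y_enc are disclosed.
# M maps the data part through A and carries the random part through K.
M = Pi3 @ A @ Pi1.T + N3 @ K @ N1.T
u_enc = M @ y_enc            # equals Pi3 @ (A @ y) + N3 @ (K @ s)

# User side: decoding removes the random component exactly.
u_dec = Pi3.T @ u_enc
u_true = A @ y

print(np.allclose(u_dec, u_true))
```

Because `Pi3.T @ N3 = 0`, the random term injected by the encoding is annihilated at decoding, so the recovered utility matches the original up to floating-point error. The paper's dynamic algorithms and its privacy analysis require more than this static sketch shows.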

Paper Structure (21 sections, 3 theorems, 54 equations, 12 figures, 1 table)

Key Result

Proposition 1

(Solution to Problem 1) For given full-rank matrices $\Pi_1 \in \mathbb{R}^{\tilde{n}_y \times n_y}$, $\Pi_2 \in \mathbb{R}^{\tilde{n}_\zeta \times n_\zeta}$, $\Pi_3 \in \mathbb{R}^{\tilde{n}_u \times n_u}$, and $\Pi_4 \in \mathbb{R}^{\tilde{n}_u \times \tilde{n}_y}$, matrix $N_1$ [...], the target algorithm [...], and the inverse function [...] provide a solution to Problem 1.

Figures (12)

  • Figure 1: The schematic diagram of a networked dynamical algorithm (a) without privacy and (b) with immersion-based coding for privacy.
  • Figure 2: The comparison of one sample of the original MNIST database and its encoded format.
  • Figure 3: The comparison of the accuracy of CNN networks in each iteration of standard ML and SIML algorithms using SGD and Adam optimizers (for ML) and target SGD and target Adam optimizers (for SIML).
  • Figure 4: The comparison of the accuracy of standard ML, SIML with privacy level $\epsilon = 10^{-13}$, and DP-SGD for various privacy levels $\epsilon = 0.1, 0.6, 2, 10$ and $\delta = 10^{-8}$.
  • Figure 5: The comparison of the training time of standard ML, SIML, and DP-SGD.
  • ...and 7 more figures

Theorems & Definitions (10)

  • Remark 1
  • Proposition 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Corollary 1
  • Definition 1
  • Definition 2
  • Definition 3: Sensitivity
  • Theorem 1
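
The statements of these definitions are not reproduced on this page. For orientation only, the standard textbook forms of $(\epsilon, \delta)$-differential privacy and $\ell_1$ sensitivity are sketched below. The symbols $\mathcal{M}$, $q$, $y$, $y'$, and $S$ are generic notation assumed here, and the paper's own definitions may differ in detail.

```latex
% (epsilon, delta)-differential privacy: for all adjacent inputs y, y'
% and all measurable output sets S,
\Pr[\mathcal{M}(y) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{M}(y') \in S] + \delta .

% l1 sensitivity of a query q over adjacent inputs:
\Delta q \;=\; \sup_{y \sim y'} \, \lVert q(y) - q(y') \rVert_1 .

% Laplace mechanism: adding i.i.d. Laplace noise of scale b to each
% entry of q(y) yields epsilon-differential privacy (delta = 0) when
b \;=\; \frac{\Delta q}{\epsilon} .
```

Smaller $\epsilon$ means stronger privacy and, for the Laplace mechanism, a larger noise scale, which is why values such as $\epsilon = 0.1$ and $\epsilon = 10$ in the figure captions correspond to very different amounts of distortion.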