InstaHide's Sample Complexity When Mixing Two Private Images

Baihe Huang, Zhao Song, Runzhou Tao, Junze Yin, Ruizhe Zhang, Danyang Zhuo

TL;DR

This work analyzes the privacy protection offered by InstaHide in the realistic setting where each synthetic image mixes two private images, and introduces a unified framework for comparing attacks. It presents a new algorithm that recovers all private images from $m = \Omega(n_{\mathsf{priv}} \log n_{\mathsf{priv}})$ samples when $k_{\mathsf{priv}} = 2$. The sample complexity is near-optimal, but the running time is exponential in $m$: it combines an $O(m^2 d)$ term, SDP-based public-identifier steps, and a $2^{m}$ factor in the final regression stage. The authors also establish a computational hardness result via a reduction from MAX-CUT, showing that approximating the hidden-signs regression problem is NP-hard under common complexity assumptions, while proving that the scheme is not information-theoretically secure. Together, the results separate information-theoretic limits from computational feasibility, which can guide the design and evaluation of privacy-preserving learning methods for sensitive data and inform future attacks on and defenses of InstaHide-like systems.
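To illustrate where a $2^{m}$ factor can come from, the following is a minimal sketch of a brute-force hidden-signs regression, not the authors' implementation. It assumes the observations have the form $|Wx|$ for a known $m \times n$ matrix $W$ (an assumption made for this example), and it simply tries every sign pattern and keeps the least-squares solution with the smallest residual. The names `solve_least_squares`, `residual`, and `brute_force_hidden_signs` are hypothetical.

```python
import itertools
import random


def solve_least_squares(W, b):
    """Solve min_x ||W x - b||_2 via the normal equations (adequate for tiny W)."""
    m, n = len(W), len(W[0])
    # Augmented normal-equation system [W^T W | W^T b].
    A = [[sum(W[k][i] * W[k][j] for k in range(m)) for j in range(n)]
         + [sum(W[k][i] * b[k] for k in range(m))] for i in range(n)]
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def residual(W, x, b):
    """Euclidean norm of W x - b."""
    return sum((sum(w * xi for w, xi in zip(row, x)) - bk) ** 2
               for row, bk in zip(W, b)) ** 0.5


def brute_force_hidden_signs(W, y_abs):
    """Enumerate sign patterns for y_abs = |W x| and return the best fit.

    The first sign is fixed to +1 because a pattern and its negation yield
    x and -x with the same residual, so x is recovered only up to a global sign.
    This loop runs 2^(m-1) times, which is the exponential cost.
    """
    m = len(W)
    best_x, best_res = None, float("inf")
    for tail in itertools.product((1.0, -1.0), repeat=m - 1):
        signs = (1.0,) + tail
        b = [s * y for s, y in zip(signs, y_abs)]
        x = solve_least_squares(W, b)
        res = residual(W, x, b)
        if res < best_res:
            best_x, best_res = x, res
    return best_x, best_res


# Toy instance (assumed sizes): 8 observations of a 3-dimensional unknown.
random.seed(0)
m, n = 8, 3
W = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
x_true = [random.gauss(0, 1) for _ in range(n)]
y_abs = [abs(sum(w * x for w, x in zip(row, x_true))) for row in W]
x_hat, res = brute_force_hidden_signs(W, y_abs)
```

On this toy instance the search visits $2^{7} = 128$ sign patterns; the count doubles with every additional observation, which is why such an exhaustive stage is only feasible for small $m$.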

Abstract

Training neural networks usually requires large amounts of sensitive training data, and how to protect the privacy of training data has thus become a critical topic in deep learning research. InstaHide is a state-of-the-art scheme to protect training data privacy with only minor effects on test accuracy, and its security has become a salient question. In this paper, we systematically study recent attacks on InstaHide and present a unified framework to understand and analyze these attacks. We find that existing attacks either do not have a provable guarantee or can only recover a single private image. On the current InstaHide challenge setup, where each InstaHide image is a mixture of two private images, we present a new algorithm to recover all the private images with a provable guarantee and optimal sample complexity. In addition, we also provide a computational hardness result on retrieving all InstaHide images. Our results demonstrate that InstaHide is not information-theoretically secure but computationally secure in the worst case, even when mixing two private images.

Paper Structure

This paper contains 28 sections, 13 theorems, 44 equations, 3 figures, 1 table.

Key Result

Theorem 1.1

Let $k_{{\mathsf{priv}}} =2$. If there are $n_{{\mathsf{priv}}}$ private vectors and $n_{{{\mathsf{pub}}}}$ public vectors, each of which is an i.i.d. draw from $\mathcal{N}(0,\mathsf{Id}_d)$, then as long as there is some $m = O(n_{{\mathsf{priv}}} \log n_{{\mathsf{priv}}} )$ such that, given a sample of $m$ random synthetic vectors independently generated as above, one can exactly recover all t

Figures (3)

  • Figure 1: An example of the clustering step in carlini_attack for $T = 2$ and $n_{\mathsf{priv}} = 4$. First, starting from each $\mathsf{InstaHide}$ image (top), the algorithm grows a cluster $S_i$ of size $3$ (middle). Then $K$-means with $K = 4$ computes four groups $C_1, \ldots, C_4$ (bottom), each of which corresponds to one original image.
  • Figure 2: The construction of the graph for min-cost max flow. $c$ denotes the flow capacity of an edge, and $w$ denotes its weight. The graph contains $T \cdot n_{\mathsf{priv}}$ nodes, one for each $\mathsf{InstaHide}$ image, $n_{\mathsf{priv}}$ nodes, one for each original image, a source, and a terminal. There are three types of edges: i) (left) from the source to each $\mathsf{InstaHide}$ image node, with flow capacity $2$ and weight $0$; ii) (middle) from each $\mathsf{InstaHide}$ image node $i$ to each original image node $j$, with flow capacity $1$ and weight $\widetilde{\mathbf{W}}_{i,j}$; iii) (right) from each original image node to the terminal, with flow capacity $2T$ and weight $0$.
  • Figure 3: The result of solving the min-cost flow in Figure 2. Each $\mathsf{InstaHide}$ image is assigned to two clusters, which ideally correspond to two original images. In practice, a cluster may not contain all $\mathsf{InstaHide}$ images that share the same original image.
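
The graph described in the Figure 2 caption can be sketched in code as follows. This is an illustrative construction only: the function name `assign_by_min_cost_flow`, the toy integer cost matrix, and the choice of solver (successive shortest paths with Bellman-Ford) are assumptions made for this example and are not taken from the paper. Only the edge capacities and weights follow the caption.

```python
def assign_by_min_cost_flow(cost, n_priv, T):
    """Assign each InstaHide image to two original images via min-cost max flow.

    cost[i][j] plays the role of the weight on the edge from InstaHide image i
    to original image j (hypothetical input; len(cost) should be T * n_priv).
    """
    m = len(cost)
    src, snk = m + n_priv, m + n_priv + 1
    n_nodes = m + n_priv + 2
    graph = [[] for _ in range(n_nodes)]  # adjacency lists of edge indices
    edges = []                            # each edge: [head, residual capacity, weight]

    def add_edge(u, v, cap, w):
        graph[u].append(len(edges))
        edges.append([v, cap, w])
        graph[v].append(len(edges))
        edges.append([u, 0, -w])          # reverse edge for the residual graph
        return len(edges) - 2

    pair_edge = [[0] * n_priv for _ in range(m)]
    for i in range(m):
        add_edge(src, i, 2, 0)            # source -> InstaHide image: c = 2, w = 0
        for j in range(n_priv):           # image i -> original j: c = 1, w = cost[i][j]
            pair_edge[i][j] = add_edge(i, m + j, 1, cost[i][j])
    for j in range(n_priv):
        add_edge(m + j, snk, 2 * T, 0)    # original -> terminal: c = 2T, w = 0

    inf = float("inf")
    while True:
        # Bellman-Ford shortest path from the source in the residual graph.
        dist = [inf] * n_nodes
        prev = [None] * n_nodes
        dist[src] = 0
        for _ in range(n_nodes - 1):
            changed = False
            for u in range(n_nodes):
                if dist[u] == inf:
                    continue
                for e in graph[u]:
                    v, cap, w = edges[e]
                    if cap > 0 and dist[u] + w < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + w, e, True
            if not changed:
                break
        if prev[snk] is None:             # no augmenting path left
            break
        # Push one unit of flow along the path (all capacities are integers).
        v = snk
        while v != src:
            e = prev[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]
    # A saturated image -> original edge means the image is assigned to that original.
    return [[j for j in range(n_priv) if edges[pair_edge[i][j]][1] == 0]
            for i in range(m)]


# Toy example with T = 1 and n_priv = 3: low cost marks the two originals
# that each InstaHide image is assumed to mix.
cost = [[1, 1, 10],
        [10, 1, 1],
        [1, 10, 1]]
assignment = assign_by_min_cost_flow(cost, n_priv=3, T=1)
```

In this toy instance each image ends up assigned to its two low-cost originals. With noisy weights, the assignment may miss some images that share an original, which matches the caveat in the Figure 3 caption.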

Theorems & Definitions (28)

  • Theorem 1.1: Informal version of Theorem 3.1
  • Definition 2.1: Image matrix notation, Definition 2.2 in csz20
  • Definition 2.2: Public/private notation, Definition 2.3 in csz20
  • Definition 2.3: Synthetic images, Definition 2.4 in csz20
  • Definition 2.4: Gaussian images, Definition 2.5 in csz20
  • Definition 2.5: Distribution over selection vectors, Definition 2.6 in csz20
  • Definition 2.6: Public/private operators
  • Definition 2.7: Public and private components of image matrix and selection vectors
  • Theorem 3.1: Main result
  • Remark 3.2
  • ...and 18 more