InstaHide's Sample Complexity When Mixing Two Private Images
Baihe Huang, Zhao Song, Runzhou Tao, Junze Yin, Ruizhe Zhang, Danyang Zhuo
TL;DR
This work analyzes InstaHide's privacy protection in the realistic setting where each encoded image mixes two private images, and introduces a unified framework for comparing attacks. It presents a new algorithm that recovers all private images with sample complexity $m = \Omega(n_{\mathsf{priv}} \log n_{\mathsf{priv}})$ for $k_{\mathsf{priv}}=2$. The sample complexity is near-optimal, but the running time is exponential in $m$: it combines an $O(m^2 d)$ term, SDP-based public-identifier steps, and a $2^{m}$ factor in the final regression stage. The authors also establish a computational hardness result via a reduction from MAX-CUT, showing that approximating the hidden-signs regression problem is NP-hard under common complexity assumptions, and they prove that the scheme is not information-theoretically secure. Together, the results separate information-theoretic limits from computational feasibility, which can guide the design and evaluation of privacy-preserving learning methods for sensitive data and inform future attacks and defenses in InstaHide-like systems.
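To make the $2^{m}$ factor concrete, the following is a minimal sketch of a brute-force regression with hidden signs: it enumerates every sign pattern $\sigma \in \{-1,+1\}^m$ and solves an ordinary least-squares problem for each. The formulation $\min_{z,\sigma} \|Wz - \sigma \circ y\|_2$, the dense Gaussian matrix `W`, and the function name `hidden_sign_regression` are assumptions made for illustration and are not taken from the paper's pseudocode.

```python
import itertools
import numpy as np

def hidden_sign_regression(W, y):
    """Brute force over sigma in {-1,+1}^m of min_z ||W z - sigma * y||_2.

    Illustrative only: the loop runs 2^m times, which is where the
    exponential dependence on m comes from.
    """
    m = W.shape[0]
    best_residual, best_z, best_sigma = np.inf, None, None
    for signs in itertools.product((-1.0, 1.0), repeat=m):
        sigma = np.array(signs)
        # Ordinary least squares for this fixed sign pattern.
        z, *_ = np.linalg.lstsq(W, sigma * y, rcond=None)
        residual = np.linalg.norm(W @ z - sigma * y)
        if residual < best_residual:
            best_residual, best_z, best_sigma = residual, z, sigma
    return best_z, best_sigma, best_residual

# Toy instance (hypothetical sizes): m = 8 observations, n = 3 unknowns.
rng = np.random.default_rng(0)
m, n = 8, 3
W = rng.standard_normal((m, n))   # assumed dense mixing matrix
z_true = rng.standard_normal(n)   # one pixel across n private images
y = np.abs(W @ z_true)            # only magnitudes are observed

z_hat, sigma_hat, res = hidden_sign_regression(W, y)
```

Because $\sigma$ and $-\sigma$ fit equally well, the solution is determined only up to a global sign, so `z_hat` should match either `z_true` or `-z_true` on this toy instance.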
Abstract
Training neural networks usually requires large amounts of sensitive training data, and protecting the privacy of that data has thus become a critical topic in deep learning research. InstaHide is a state-of-the-art scheme to protect training data privacy with only minor effects on test accuracy, and its security has become a salient question. In this paper, we systematically study recent attacks on InstaHide and present a unified framework to understand and analyze these attacks. We find that existing attacks either do not have a provable guarantee or can only recover a single private image. On the current InstaHide challenge setup, where each InstaHide image is a mixture of two private images, we present a new algorithm to recover all the private images with a provable guarantee and optimal sample complexity. In addition, we also provide a computational hardness result on retrieving all InstaHide images. Our results demonstrate that InstaHide is not information-theoretically secure but computationally secure in the worst case, even when mixing two private images.
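As a rough illustration of the two-private-image setting, the sketch below builds one InstaHide-style encoding: a convex combination of two randomly chosen private images followed by a random per-pixel sign flip. It is a simplification under stated assumptions: public images are omitted, the mixing weight is drawn uniformly from $[0,1]$, and the name `instahide_mix_two` is hypothetical.

```python
import numpy as np

def instahide_mix_two(private_images, rng):
    """Encode one image from two private images (simplified sketch).

    private_images: array of shape (n_priv, d), one flattened image per row.
    Returns the encoded image, the chosen indices, and the mixing weight.
    """
    n_priv, d = private_images.shape
    # Pick two distinct private images.
    i, j = rng.choice(n_priv, size=2, replace=False)
    # Assumed weight distribution: uniform on [0, 1], weights sum to 1.
    lam = rng.uniform(0.0, 1.0)
    mixed = lam * private_images[i] + (1.0 - lam) * private_images[j]
    # Random pixel-wise sign flip hides the sign of every coordinate.
    mask = rng.choice(np.array([-1.0, 1.0]), size=d)
    return mask * mixed, (int(i), int(j)), lam

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 16))  # 5 toy private images with 16 pixels each
encoded, (i, j), lam = instahide_mix_two(X, rng)

# The sign flip removes signs only, so magnitudes of the mixture survive.
observed = np.abs(encoded)
```

The last line reflects why attacks in this line of work can operate on absolute values: the random mask changes signs but leaves the magnitude of each mixed pixel intact.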
