Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment
Yufan Liu, Wanqian Zhang, Dayan Wu, Zheng Lin, Jingzi Gu, Weiping Wang
TL;DR
This paper tackles black-box model inversion for face data by shifting from costly optimization in input space to training-based inversion. It introduces the Prediction-to-Image (P2I) framework, combining a Prediction Alignment Encoder with a fixed StyleGAN generator to map prediction vectors into the disentangled $\mathcal{W}^+$ latent space, enabling one-pass reconstruction of target identities. A key innovation is the aligned ensemble attack, which aggregates latent codes from multiple public images to capture complementary facial attributes, boosting reconstruction quality with far fewer queries than prior methods. Across multiple datasets and target-model architectures, P2I achieves higher attack accuracy and perceptual quality while dramatically reducing query counts, demonstrating a practical and potent privacy risk in black-box settings.
Abstract
Model inversion (MI) attacks reconstruct the private training data of a target model from its outputs, posing a significant threat to deep learning models and data privacy. On the one hand, most existing MI methods search for latent codes that represent the target identity, but this iterative, optimization-based scheme consumes a huge number of queries to the target model, making it unrealistic, especially in the black-box scenario. On the other hand, some training-based methods launch an attack through a single forward inference, yet they fail to directly learn high-level mappings from prediction vectors to images. To address these limitations, we propose a novel Prediction-to-Image (P2I) method for black-box MI attacks. Specifically, we introduce the Prediction Alignment Encoder to map the target model's output prediction into the latent code of StyleGAN. In this way, the prediction vector space can be aligned with the more disentangled latent space, thus establishing a connection between prediction vectors and semantic facial features. During the attack phase, we further design the Aligned Ensemble Attack scheme to integrate complementary facial attributes of the target identity for better reconstruction. Experimental results show that our method outperforms other state-of-the-art methods; e.g., compared with RLB-MI, our method improves attack accuracy by 8.5% and reduces the number of queries by 99% on the CelebA dataset.
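To make the ensemble idea concrete, the following is a minimal sketch of one plausible reading of the attack phase, not the authors' implementation. It assumes that public images are ranked by the target model's confidence for the target class, that each selected prediction vector is encoded into a latent code, and that the codes are combined by a confidence-weighted average. The selection rule, the weighting, the flat latent vectors (the real $\mathcal{W}^+$ code is a stack of style vectors), and the names `select_top_k`, `ensemble_latents`, and `toy_encoder` are all illustrative assumptions.

```python
from typing import List, Sequence

Vector = List[float]


def select_top_k(predictions: Sequence[Vector], target: int, k: int) -> List[int]:
    """Return indices of the k public images scored highest for class `target`.

    Assumption: ranking by target-class confidence is one simple selection
    rule; the abstract does not specify how public images are chosen.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: predictions[i][target],
        reverse=True,
    )
    return ranked[:k]


def ensemble_latents(latents: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Weighted average of latent codes; weights are normalised to sum to 1."""
    if not latents or len(latents) != len(weights):
        raise ValueError("need one weight per latent code")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(latents[0])
    return [
        sum(w * z[d] for w, z in zip(weights, latents)) / total
        for d in range(dim)
    ]


def toy_encoder(prediction: Vector) -> Vector:
    """Hypothetical stand-in for a trained Prediction Alignment Encoder.

    A real encoder would be a learned network producing a W+ code; this
    placeholder only scales the input so the sketch stays runnable.
    """
    return [2.0 * p for p in prediction]


# Toy prediction vectors returned by a black-box target model (made-up values).
public_predictions = [[0.1, 0.9], [0.7, 0.3], [0.6, 0.4]]
target_class = 0

top = select_top_k(public_predictions, target_class, k=2)
codes = [toy_encoder(public_predictions[i]) for i in top]
confidences = [public_predictions[i][target_class] for i in top]
ensembled_code = ensemble_latents(codes, confidences)
# `ensembled_code` would then be passed once through the fixed generator.
```

In this sketch the target model is queried only once per public image, and no per-identity optimization loop is run, which is the property the abstract credits for the low query count.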
