Model Inversion Robustness: Can Transfer Learning Help?
Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran, Ngoc-Bao Nguyen, Ngai-Man Cheung
TL;DR
This work introduces TL-DMI, a simple, transfer-learning–based defense against model inversion that restricts private-data leakage by freezing early layers and only fine-tuning the last few layers ($|\theta_C|$) during private training. A novel Fisher Information analysis shows that early layers are more critical for MI reconstruction, while later layers align with the classification task, justifying the design. Empirical results across 20 MI setups, 9 architectures, and multiple attacks demonstrate state-of-the-art MI robustness with modest utility loss, and TL-DMI can be combined with existing defenses like BiDO for further gains. The method is architecture-agnostic, easy to implement, and broadly applicable to both CNNs and vision transformers, highlighting a practical path toward privacy-preserving model deployment without heavy regularization trade-offs.
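The freezing step described above can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the `Layer` class, the helper names, and the toy layer sizes are all hypothetical, and $|\theta_C|$ is read here as the number of parameters left trainable on private data, which is an assumption about the notation.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical stand-in for one network layer."""
    name: str
    num_params: int
    # Plays the role of a framework flag such as PyTorch's `requires_grad`.
    trainable: bool = True


def freeze_early_layers(layers, num_trainable_last):
    """Freeze every layer except the last `num_trainable_last` (in place)."""
    if not 0 <= num_trainable_last <= len(layers):
        raise ValueError("num_trainable_last must be between 0 and len(layers)")
    cutoff = len(layers) - num_trainable_last
    for index, layer in enumerate(layers):
        # Early layers keep their pre-trained weights; only later layers
        # would be updated on the private dataset.
        layer.trainable = index >= cutoff
    return layers


def trainable_param_count(layers):
    """Count the parameters that private training is allowed to update."""
    return sum(layer.num_params for layer in layers if layer.trainable)


# Toy network with made-up parameter counts (assumption, not from the paper).
toy_network = [
    Layer("conv1", 1_000),
    Layer("conv2", 5_000),
    Layer("conv3", 20_000),
    Layer("fc", 4_000),
]

freeze_early_layers(toy_network, num_trainable_last=2)
theta_c = trainable_param_count(toy_network)
total_params = sum(layer.num_params for layer in toy_network)
print(f"trainable on private data: {theta_c} of {total_params} parameters")
```

In a real framework the same idea would typically amount to disabling gradient updates for the early layers before fine-tuning; how many layers to leave trainable is the knob that trades utility against MI robustness.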
Abstract
Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. In this work, we take a different perspective and propose a novel and simple Transfer Learning-based Defense against Model Inversion (TL-DMI) to obtain MI-robust models. In particular, by leveraging TL, we limit the number of layers encoding sensitive information from the private training dataset, thereby degrading the performance of MI attacks. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in extensive experiments that TL-DMI achieves state-of-the-art (SOTA) MI robustness. Our code, pre-trained models, demo and inverted data are available at: https://hosytuyen.github.io/projects/TL-DMI
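For readers unfamiliar with the quantity behind the analysis, the block below gives the standard Fisher Information of a model's parameters and one common way to aggregate it per layer. The layer index $l$, the symbol $F_l$, and the trace-based aggregation are notational assumptions for illustration; the paper's exact formulation may differ.

```latex
% Standard Fisher Information matrix for parameters \theta of a model p(y \mid x; \theta)
F(\theta) = \mathbb{E}_{x,\, y \sim p(y \mid x; \theta)}
  \left[ \nabla_{\theta} \log p(y \mid x; \theta)\,
         \nabla_{\theta} \log p(y \mid x; \theta)^{\top} \right]

% Per-layer summary (assumed aggregation): sum of the diagonal entries
% belonging to the parameters \theta_l of layer l
F_l = \mathbb{E}_{x,\, y}
  \left[ \left\lVert \nabla_{\theta_l} \log p(y \mid x; \theta) \right\rVert_2^2 \right]
```

Under this reading, a larger $F_l$ indicates that the parameters of layer $l$ carry more information for the task whose likelihood is being differentiated, which is the sense in which layers can be compared for their importance to MI reconstruction versus classification.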
