Rethinking Membership Inference Attacks Against Transfer Learning

Cong Wu; Jing Chen; Qianru Fang; Kun He; Ziming Zhao; Hao Ren; Guowen Xu; Yang Liu; Yang Xiang

TL;DR

The paper identifies a new white-box membership inference vector in transfer learning. The attack exploits differences in hidden-layer representations between teacher and student models, using shadow models and adaptive thresholds to infer membership in the teacher's training data from the student alone. It introduces a three-class attack variant and compares against state-of-the-art attacks, reporting higher attack accuracy across four datasets and multiple architectures. By explicitly modeling teacher–student interdependencies, the work broadens the understanding of MIA threats in transfer learning; it also provides attack-architecture details, evaluation results, and initial defense considerations, and argues that teacher–student transfer settings need tailored privacy safeguards.

Abstract

Transfer learning, successful in knowledge translation across related tasks, faces a substantial privacy threat from membership inference attacks (MIAs). These attacks, despite posing a significant risk to an ML model's training data, remain underexplored in transfer learning. In particular, the interaction between teacher and student models has not been thoroughly examined in MIAs, which leaves an under-examined aspect of privacy vulnerability in transfer learning. In this paper, we propose a new MIA vector against transfer learning that determines whether a specific data point was used to train the teacher model while accessing only the student model in a white-box setting. Our method examines the relationship between teacher and student models by analyzing the discrepancies in hidden-layer representations between the student model and its shadow counterpart. These differences are then used to refine the shadow model's training process and to inform membership inference decisions. Evaluated across four datasets in diverse transfer learning tasks, our method reveals that even when an attacker has access only to the student model, the teacher model's training data remains susceptible to MIAs. We believe our work unveils the unexplored risk of membership inference in transfer learning.
