Exposing and Defending Membership Leakage in Vulnerability Prediction Models

Yihan Liao, Jacky Keung, Xiaoxue Ma, Jingyu Zhang, Yicheng Sun

TL;DR

This work demonstrates that vulnerability prediction models, including BiLSTM, BiGRU, and CodeBERT, are highly susceptible to membership inference attacks when outputs such as logits and loss are exposed. It introduces NMID, a lightweight, inference-time defense combining output masking and Gaussian noise to disrupt membership signals without retraining models. Through a shadow-based MIA methodology and extensive evaluation on SARD and NVD datasets, the study shows that NMID can substantially reduce attack success (AUC and F1) while preserving VP accuracy, with All-10 perturbations approaching random guessing. The findings emphasize the privacy risks in code-analysis AI systems and offer a practical defense for secure deployment of AI-powered software analysis tools.

Abstract

Neural models for vulnerability prediction (VP) have achieved impressive performance by learning from large-scale code repositories. However, their susceptibility to Membership Inference Attacks (MIAs), where adversaries aim to infer whether a particular code sample was used during training, poses serious privacy concerns. While MIA has been widely investigated in NLP and vision domains, its effects on security-critical code analysis tasks remain underexplored. In this work, we conduct the first comprehensive analysis of MIA on VP models, evaluating the attack success across various architectures (BiLSTM, BiGRU, and CodeBERT) and feature combinations, including embeddings, logits, loss, and confidence. Our threat model aligns with black-box and gray-box settings where prediction outputs are observable, allowing adversaries to infer membership by analyzing output discrepancies between training and non-training samples. The empirical findings reveal that logits and loss are the most informative and vulnerable outputs for membership leakage. Motivated by these observations, we propose a Noise-based Membership Inference Defense (NMID), which is a lightweight defense module that applies output masking and Gaussian noise injection to disrupt adversarial inference. Extensive experiments demonstrate that NMID significantly reduces MIA effectiveness, lowering the attack AUC from nearly 1.0 to below 0.65, while preserving the predictive utility of VP models. Our study highlights critical privacy risks in code analysis and offers actionable defense strategies for securing AI-powered software systems.
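
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes synthetic two-class logits in which training members receive larger margins than non-members, and it replaces the paper's shadow-based attack with a simple loss-threshold attack. It also assumes that "masking" means replacing the released loss with a constant, and that the predicted label is taken from the clean logits. The names `defend`, `make_logits`, and `attack_auc`, and the noise scale `sigma=8.0`, are hypothetical choices made for illustration only.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Per-sample cross-entropy computed from raw logits (numerically stable).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def attack_auc(scores, is_member):
    # AUC of a score-threshold attack: probability that a random member
    # scores higher than a random non-member (ties count as one half).
    pos = scores[is_member == 1]
    neg = scores[is_member == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(greater + 0.5 * ties)

def make_logits(labels, margin_mean, margin_std, rng):
    # Synthetic logits: "margin" is the true-class logit minus the other logit.
    margin = rng.normal(margin_mean, margin_std, size=len(labels))
    signed = np.where(labels == 1, margin, -margin)
    return np.stack([-signed / 2, signed / 2], axis=1)

def defend(logits, sigma, rng):
    # Hypothetical inference-time defense (assumption, not the paper's code):
    # keep the clean predicted label, add Gaussian noise to the released
    # logits, and mask the loss by replacing it with a constant.
    return {
        "label": logits.argmax(axis=1),
        "logits": logits + rng.normal(0.0, sigma, size=logits.shape),
        "loss": np.zeros(len(logits)),
    }

rng = np.random.default_rng(0)
n = 1000
labels = rng.integers(0, 2, size=2 * n)
is_member = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])

# Assumption: members are fit more confidently than non-members.
logits = np.concatenate([
    make_logits(labels[:n], 6.0, 1.0, rng),
    make_logits(labels[n:], 1.5, 1.5, rng),
])

# Attack on undefended outputs: lower loss suggests membership.
auc_before = attack_auc(-cross_entropy(logits, labels), is_member)

# Attack on defended outputs: the loss is masked, so the attacker can only
# recompute a loss from the noisy logits.
released = defend(logits, sigma=8.0, rng=rng)
auc_after = attack_auc(-cross_entropy(released["logits"], labels), is_member)

accuracy_before = float((logits.argmax(axis=1) == labels).mean())
accuracy_after = float((released["label"] == labels).mean())

print(f"attack AUC without defense: {auc_before:.3f}")
print(f"attack AUC with defense:    {auc_after:.3f}")
print(f"accuracy before / after:    {accuracy_before:.3f} / {accuracy_after:.3f}")
```

In this toy setting the accuracy is unchanged by construction, because the label is released from the clean logits. Whether and how the paper's NMID module preserves utility in the same way is not stated in the abstract, so the sketch should be read only as an illustration of why noisy, masked outputs weaken a loss-based membership signal.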

Paper Structure

This paper contains 31 sections, 2 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Overview of the Shadow-based MIA Pipeline. Solid arrows indicate the training stage, dashed arrows represent the evaluation stage.
  • Figure 2: Per-CWE vulnerable label ratios in shadow and target member sets.
  • Figure 3: Performance of target and shadow VP models across Accuracy, Precision, Recall, and F1-score.
  • Figure 4: Distribution of loss and logits for member (M) and non-member (NM) samples across BiLSTM, BiGRU, and CodeBERT.
  • Figure 5: t-SNE visualization of different feature sets (member vs. non-member) on the target testing set. The centroid distance is the average separability between the two classes.
  • ...and 2 more figures