CAMH: Advancing Model Hijacking Attack in Machine Learning

Xing He, Jiahao Chen, Yuwen Pu, Qingming Li, Chunyi Zhou, Yingcai Wu, Jinbao Li, Shouling Ji

TL;DR

CAMH introduces a category-agnostic model hijacking framework that overcomes class-count mismatches and data-distribution gaps while preserving the original model’s performance. It combines a Synchronized Optimization Layer, noise-alignment perturbations, and a dual-loop training scheme to enable effective hijacking under outsourcing and model-marketplace scenarios. Across MNIST, SVHN, GTSRB, CIFAR10, and CIFARm, CAMH achieves high camouflage (CR ≈ 1) and strong hijacking efficacy (ER typically > 0.85), even with limited hijacking data and when the hijacking task has more classes than the original. The results underscore potential security risks in third-party training and pre-trained-model ecosystems and motivate continued development of robust defenses and detection strategies.

Abstract

In the burgeoning domain of machine learning, the reliance on third-party services for model training and the adoption of pre-trained models have surged. However, this reliance introduces vulnerabilities to model hijacking attacks, in which adversaries manipulate models to perform unintended tasks without the model owner's knowledge. Such attacks raise significant security and ethical concerns, for example turning an ordinary image classifier into a tool for detecting faces in pornographic content. This paper introduces Category-Agnostic Model Hijacking (CAMH), a novel model hijacking attack method capable of addressing the challenges of class number mismatch, data distribution divergence, and performance balance between the original and hijacking tasks. CAMH incorporates synchronized training layers, random noise optimization, and a dual-loop optimization approach to ensure minimal impact on the original task's performance while effectively executing the hijacking task. We evaluate CAMH across multiple benchmark datasets and network architectures, demonstrating its potent attack effectiveness while ensuring minimal degradation in the performance of the original task.

Paper Structure (31 sections, 5 equations, 21 figures, 3 tables, 1 algorithm)

Figures (21)

  • Figure 1: CAMH attacks explored in this paper.
  • Figure 4: The impact of the number of categories on the ER of hijacking tasks. The red line illustrates the relationship between the ER of the hijacking task and the value of m when the original task is CIFARm, the hijacking task is CIFAR10, and the CR remains above 95%. The blue line represents the situation when the original and hijacking tasks are swapped. We compared ResNet18 and ResNet34.
  • Figure 5: The impact of the percentage of data volume of hijacking tasks on the ER of hijacking tasks. The red line illustrates the relationship between the ER of the hijacking task and the percentage of hijacking data used when the original task is MNIST, the hijacking task is SVHN, and the CR remains above 95%. The blue line represents the situation when the original and hijacking tasks are swapped. We compared ResNet18 and ResNet34.
  • Figure 6: ER values of the CAMH attack, the Chameleon attack, and the Adverse Chameleon attack.
  • ...and 16 more figures