Zero-Trust Artificial Intelligence Model Security Based on Moving Target Defense and Content Disarm and Reconstruction

Daniel Gilkarov, Ran Dubin

TL;DR

This work addresses the security of AI artifact distribution by tackling file-based serialization threats in formats like Pickle and PyTorch. It introduces a zero-trust architecture combining Content Disarm and Reconstruction (CDR) via a safe-unpickler with Moving Target Defense (MTD) for model weights, and demonstrates effectiveness with high disarmability on diverse datasets. The approach integrates policy-driven allowed/blocked function calls, post-CDR verification with Fickling, and a training-and-deployment workflow within secure enclaves, plus cloud-backed verification and mapping. The results offer a practical pathway to curb supply-chain and serialization attacks during AI artifact deployment, with potential extension to additional formats and threat models.

Abstract

This paper examines the challenges in distributing AI models through model zoos and file transfer mechanisms. Despite advancements in security measures, vulnerabilities persist, necessitating a multi-layered approach to mitigate risks effectively. The physical security of model files is critical, requiring stringent access controls and attack prevention solutions. This paper proposes a novel solution architecture composed of two prevention approaches. The first is Content Disarm and Reconstruction (CDR), which focuses on disarming serialization attacks that enable attackers to run malicious code as soon as the model is loaded. The second protects the model architecture and weights from attacks by using Moving Target Defense (MTD), altering the model structure and providing verification steps to detect such attacks. The paper focuses on the highly exploitable Pickle and PyTorch file formats. It demonstrates a 100% disarm rate when validated against known AI model repositories and actual malware attacks from the HuggingFace model zoo.
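To make the serialization threat and the allow-list idea concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard-library pattern of overriding `pickle.Unpickler.find_class` so that only explicitly permitted globals can be resolved at load time. The names `ALLOWED_GLOBALS`, `AllowListUnpickler`, `safe_loads`, and `Malicious` are hypothetical and exist only for illustration. The sketch covers the policy check alone: it rejects a file that references a blocked callable and does not attempt the reconstruction step of CDR.

```python
import io
import pickle

# Hypothetical policy: (module, name) pairs that may be resolved during load.
# A real policy would be much larger and tuned to the model format.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize `data`, raising UnpicklingError on a blocked global."""
    return AllowListUnpickler(io.BytesIO(data)).load()


class Malicious:
    """Illustrative payload: a plain pickle.loads would run a shell command."""

    def __reduce__(self):
        import os
        # Harmless stand-in for attacker code; safe_loads never executes it.
        return (os.system, ("echo pwned",))
```

Serializing `Malicious()` with `pickle.dumps` does not run the command. Only loading does, which is why the check sits in the loader. Passing those bytes to `safe_loads` raises `pickle.UnpicklingError`, while a pickled `collections.OrderedDict` of plain lists and numbers loads normally.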

Paper Structure

This paper contains 19 sections, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Simplified AI model file architecture
  • Figure 2: Visualization of Pickle model remote code execution. Step (1) illustrates how easy it is to create the attack, Step (2) is the hex view, Step (3) visualizes how Fickling prints the attack steps, and Step (4) is the execution of the ransomware.
  • Figure 3: AI model CDR software package Architecture designed to eliminate serialization attacks.
  • Figure 4: MTD model creation inside a secure enclave
  • Figure 5: MTD model loading and fallback to CDR
  • ...and 2 more figures