DeepiSign-G: Generic Watermark to Stamp Hidden DNN Parameters for Self-contained Tracking

Alsharif Abuadbba, Nicholas Rhodes, Kristen Moore, Bushra Sabir, Shuo Wang, Yansong Gao

TL;DR

DeepiSign-G introduces a generic, fragile watermark embedded in DNN parameters using the Fast Walsh-Hadamard Transform to detect any unauthorized modifications without impairing model performance. By distributing hidden bits across transform coefficients and employing key-based randomization, the method supports self-contained metadata tracking and integrity verification for both CNNs and RNNs. Extensive experiments on face recognition, traffic sign, CIFAR-10, and IMDB sentiment tasks against multiple attack types demonstrate near-perfect integrity breach detection while maintaining accuracy on clean data. The approach offers scalable embedding capacity, architecture-agnostic applicability, and practical security benefits for high-stakes DL deployments.

Abstract

Deep learning solutions in critical domains like autonomous vehicles, facial recognition, and sentiment analysis require caution due to the severe consequences of errors. Research shows these models are vulnerable to adversarial attacks, such as data poisoning and neural trojaning, which can covertly manipulate model behavior, compromising reliability and safety. Current defense strategies like watermarking have limitations: they fail to detect all model modifications and primarily focus on attacks on CNNs in the image domain, neglecting other critical architectures like RNNs. To address these gaps, we introduce DeepiSign-G, a versatile watermarking approach designed for comprehensive verification of leading DNN architectures, including CNNs and RNNs. DeepiSign-G enhances model security by embedding an invisible watermark within the Walsh-Hadamard transform coefficients of the model's parameters. This watermark is highly sensitive and fragile, ensuring prompt detection of any modifications. Unlike traditional hashing techniques, DeepiSign-G allows substantial metadata incorporation directly within the model, enabling detailed, self-contained tracking and verification. We demonstrate DeepiSign-G's applicability across various architectures, including CNN models (VGG, ResNets, DenseNet) and RNNs (Text sentiment classifier). We experiment with four popular datasets: VGG Face, CIFAR10, GTSRB Traffic Sign, and Large Movie Review. We also evaluate DeepiSign-G under five potential attacks. Our comprehensive evaluation confirms that DeepiSign-G effectively detects these attacks without compromising CNN and RNN model performance, highlighting its efficacy as a robust security measure for deep learning applications. Detection of integrity breaches is nearly perfect, while hiding only a bit in approximately 1% of the Walsh-Hadamard coefficients.
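The idea of hiding bits in Walsh-Hadamard coefficients and detecting later changes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's embedding algorithm: the parity-based quantization rule, the step size `delta`, and the helper names `fwht`, `embed`, and `extract` are all hypothetical choices made for this example. It only shows that a key-seeded subset of roughly 1% of the coefficients can carry bits with a tiny change to the weights, and that a change to a single weight disturbs the recovered bits.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    a = np.array(x, dtype=np.float64)
    n = a.size
    assert n > 0 and (n & (n - 1)) == 0, "length must be a power of two"
    h = 1
    while h < n:
        a = a.reshape(-1, 2, h)              # pair up blocks of size h
        top = a[:, 0, :] + a[:, 1, :]        # butterfly: sum
        bot = a[:, 0, :] - a[:, 1, :]        # butterfly: difference
        a = np.stack([top, bot], axis=1).reshape(n)
        h *= 2
    return a

def embed(weights, bits, key, delta=1e-3):
    """Hide bits in key-selected coefficients (hypothetical parity rule)."""
    coeffs = fwht(weights)
    pos = np.random.default_rng(key).choice(coeffs.size, size=len(bits), replace=False)
    q = np.rint(coeffs[pos] / delta).astype(np.int64)
    q += ((q % 2) != bits).astype(np.int64)  # bump by one step where parity differs
    coeffs[pos] = q * delta
    return fwht(coeffs) / coeffs.size        # inverse transform: H is its own inverse up to 1/n

def extract(weights, n_bits, key, delta=1e-3):
    """Recover the bits from the same key-selected coefficients."""
    coeffs = fwht(weights)
    pos = np.random.default_rng(key).choice(coeffs.size, size=n_bits, replace=False)
    return np.rint(coeffs[pos] / delta).astype(np.int64) % 2

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=1024)         # stand-in for a flattened weight block
bits = rng.integers(0, 2, size=10)           # about 1% of the 1024 coefficients
w_marked = embed(w, bits, key=42)
recovered = extract(w_marked, len(bits), key=42)

tampered = w_marked.copy()
tampered[123] += 1e-3                        # modify a single weight by one step
broken = extract(tampered, len(bits), key=42)

print("max weight change:", np.abs(w_marked - w).max())
print("bits intact on clean model:", np.array_equal(recovered, bits))
print("bits intact after tampering:", np.array_equal(broken, bits))
```

Because every weight contributes to every Walsh-Hadamard coefficient, one altered weight shifts all selected coefficients at once, which is what makes the mark fragile. A parity check alone would miss a shift equal to an even multiple of `delta`, so this sketch should not be read as a complete integrity scheme.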

Paper Structure

This paper contains 40 sections, 2 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: (Top) Plot of the first 10K hidden-layer weights of a DNN. (Middle) The same 10K weights after flipping a number of bits, which shows up clearly as distortion. (Bottom) The same 10K weights after converting them into the Walsh-Hadamard frequency domain and flipping a number of bits, which has little visible effect.
  • Figure 2: A high-level overview of the DeepiSign-VT embedding, retrieval and verification processes.
  • Figure 3: DeepiSign-VT embedding steps, which are explained in detail in Embedding Algorithm C.
  • Figure 4: A sample from the original and trojaned version of the VGG Face Dataset vggface, and a sample from the trojaned reverse-engineered dataset crafted by liu2017trojaning.
  • Figure 5: Samples from RNN trojaning paper showing the insertion of a trigger sentence into the review following rnn-backdoor. Notably, the insertion of this neutral trigger sentence does not have any influence on the sentiment. As explained in rnn-backdoor: "Examples of backdoor instances. (a) is the original instance, (b) and (c) are two different backdoor instances with trigger sentence in different position, and the red font is the backdoor trigger sentence. The trigger sentence is semantically correct in the context."
  • ...and 1 more figure