A White-Box Adversarial Attack Against a Digital Twin

Wilson Patterson, Ivan Fernandez, Subash Neupane, Milan Parmar, Sudip Mittal, Shahram Rahimi

TL;DR

The paper investigates how vulnerable Digital Twins (DTs) in cyber-physical systems (CPS) are to adversarial perturbations in a white-box setting. It implements a vehicular DT with a deep learning classifier $f(x)$ over 15 sensor channels and tests its robustness by perturbing an input $x$ to $x' = x + \delta x$, where $\delta x$ is Gaussian noise drawn from $\mathcal{N}(0,\sigma^2)$. The perturbation changes the classifier's output: in the reported example, a normal input sequence is predicted NORMAL with a Mahalanobis distance of $MD=5.79$, while a perturbed input sequence is predicted ANOMALY with $MD=9.72$. This highlights a security risk in DT-based CPS. The work underscores the need for robust evaluation and defense strategies, and points to future work on stronger attacks such as the Fast Gradient Sign Method (FGSM) and on validating defenses with the Adversarial Robustness Toolbox (ART).
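
The perturbation and scoring steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the reference data, the value of `sigma`, and the helper names `perturb` and `mahalanobis` are hypothetical. The distance is computed directly on a raw 15-channel vector, whereas this summary does not say which vector the paper actually scores.

```python
import numpy as np


def perturb(x, sigma, rng):
    """Return x' = x + delta, with delta drawn i.i.d. from N(0, sigma^2)."""
    delta = rng.normal(loc=0.0, scale=sigma, size=x.shape)
    return x + delta


def mahalanobis(v, mean, cov):
    """Mahalanobis distance of vector v from a distribution with the given mean and covariance."""
    diff = v - mean
    # Solve cov @ z = diff instead of forming the explicit inverse of cov.
    z = np.linalg.solve(cov, diff)
    return float(np.sqrt(diff @ z))


rng = np.random.default_rng(0)
n_channels = 15  # the summary states 15 sensor channels

# Hypothetical stand-in for statistics gathered from normal operation.
reference = rng.normal(size=(1000, n_channels))
mean = reference.mean(axis=0)
cov = np.cov(reference, rowvar=False)

# A single hypothetical sensor reading and its noisy counterpart.
x = rng.normal(size=n_channels)
x_adv = perturb(x, sigma=2.0, rng=rng)  # sigma is an arbitrary illustrative value

md_clean = mahalanobis(x, mean, cov)
md_adv = mahalanobis(x_adv, mean, cov)
print(f"MD clean: {md_clean:.2f}, MD perturbed: {md_adv:.2f}")
```

A detector of this kind would typically compare the distance against a threshold to decide between NORMAL and ANOMALY. The threshold used in the paper is not given in this summary, so none is assumed here.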

Abstract

Recent research has shown that Machine Learning/Deep Learning (ML/DL) models are particularly vulnerable to adversarial perturbations, which are small changes made to the input data in order to fool a machine learning classifier. The Digital Twin, which is typically described as consisting of a physical entity, a virtual counterpart, and the data connections in between, is increasingly being investigated as a means of improving the performance of physical entities by leveraging computational techniques, which are enabled by the virtual counterpart. This paper explores the susceptibility of a Digital Twin (DT), a virtual model designed to accurately reflect a physical object using ML/DL classifiers that operate as Cyber-Physical Systems (CPS), to adversarial attacks. As a proof of concept, we first formulate a DT of a vehicular system using a deep neural network architecture and then utilize it to launch an adversarial attack. We attack the DT model by perturbing the input to the trained model and show how easily the model can be broken with white-box attacks.

Paper Structure (3 sections, 3 figures)

Figures (3)

  • Figure 1: Adversarial attack architecture against a trained digital twin.
  • Figure 2: Normal input sequence. The ADS predicts NORMAL with a Mahalanobis distance of 5.79.
  • Figure 3: Perturbed input sequence. The ADS predicts ANOMALY with a Mahalanobis distance of 9.72.
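
For reference when reading the two captions, the Mahalanobis distance is conventionally defined as shown here. The symbols $e$ (the vector being scored), $\mu$ (its mean under normal operation), and $\Sigma$ (the corresponding covariance matrix) are notation assumed for this note. The listing above does not state which vector the ADS scores or which threshold separates NORMAL from ANOMALY.

$$MD(e) = \sqrt{(e - \mu)^{\top} \Sigma^{-1} (e - \mu)}$$

Under this definition a larger value means the scored vector lies further from typical behavior, which is consistent with the perturbed sequence (9.72) being flagged while the unperturbed one (5.79) is not.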