SALTY: Explainable Artificial Intelligence Guided Structural Analysis for Hardware Trojan Detection

Tanzim Mahfuz, Pravin Gaikwad, Tasneem Suha, Swarup Bhunia, Prabuddha Chakraborty

TL;DR

SALTY tackles hardware Trojan detection in distributed semiconductor supply chains by combining a Jumping-Knowledge-enabled Graph Attention Network with an explainable AI post-processing module. It constructs a wire-graph representation, extracts local structural features, and uses JK-GAT to produce robust node embeddings that generalize to unseen designs. Explainability via Captum Integrated Gradients guides a dynamic post-processing step that reduces AI hallucinations, leading to high $TPR$ and $TNR$ (e.g., $TPR=98.47\%$, $TNR=98.14\%$) across >15 benchmarks and outperforming seven state-of-the-art methods. The approach also yields human-readable rules that illuminate the detection logic, enhancing trust and practical applicability in hardware security workflows.

Abstract

Hardware Trojans are malicious modifications in digital designs that can be inserted by untrusted supply chain entities. Hardware Trojans can give rise to diverse attack vectors such as information leakage (e.g., the MOLES Trojan) and denial-of-service (e.g., a rarely triggered bit flip). Such an attack in critical systems (e.g., healthcare and aviation) can endanger human lives and lead to catastrophic financial loss. Several techniques have been developed to detect such malicious modifications in digital designs, particularly for designs sourced from third-party intellectual property (IP) vendors. However, most techniques have scalability concerns (due to unsound assumptions during evaluation) and lead to a large number of false positive detections (false alerts). Our framework (SALTY) mitigates these concerns through the use of a novel Graph Neural Network architecture (using the Jumping-Knowledge mechanism) for generating initial predictions and an Explainable Artificial Intelligence (XAI) approach for fine-tuning the outcomes (post-processing). Experiments show a 98% True Positive Rate (TPR) and True Negative Rate (TNR), significantly outperforming state-of-the-art techniques across a large set of standard benchmarks.
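
The two headline metrics can be made concrete with a small calculation. The sketch below assumes a per-node binary labeling (1 = Trojan node, 0 = non-Trojan node), which is an assumption made here for illustration. The helper name `tpr_tnr` and the toy labels are hypothetical; they are not taken from the paper or its benchmarks.

```python
def tpr_tnr(y_true, y_pred):
    """Return (TPR, TNR) for binary labels.

    Assumed convention (not from the paper): 1 = Trojan node, 0 = non-Trojan node.
    TPR = TP / (TP + FN): fraction of Trojan nodes that were flagged.
    TNR = TN / (TN + FP): fraction of non-Trojan nodes that were left unflagged.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against an empty class so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Toy labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPR={tpr:.2%}, TNR={tnr:.2%}")
```

In this toy case one of three Trojan nodes is missed and one of five non-Trojan nodes is a false alert. A false-alert-heavy detector would show up as a low TNR even when its TPR is high, which is why the abstract reports both.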

Paper Structure

This paper contains 16 sections, 5 equations, 3 figures, 4 tables, 2 algorithms.

Figures (3)

  • Figure 1: The SALTY Framework: input/output, feature extraction, graph neural network, and post-processing.
  • Figure 2: Illustration of the SALTY model's decision-making process for Trojan nodes, shown for two distinct benchmarks.
  • Figure 3: A: Comparison of feature scores between Trojan and non-Trojan nodes. Trojan nodes exhibit significantly higher feature scores than non-Trojan nodes; B: Analysis of feature scores for non-Trojan nodes. Certain non-Trojan nodes generate high feature values, indicating false positives in the classification.