SAND: A Self-supervised and Adaptive NAS-Driven Framework for Hardware Trojan Detection

Zhixin Pan, Ziyu Shu, Linh Nguyen, Amberbir Alemayoh

TL;DR

This paper tackles hardware Trojan (HT) detection in the global semiconductor supply chain by introducing SAND, a framework that combines self-supervised learning (SSL) for automated feature embedding with neural architecture search (NAS) to adapt the downstream classifier to new benchmarks with minimal retraining. The upstream encoder is a Graph Convolutional Network that operates on graph representations of circuits and is optimized with a hybrid contrastive loss integrating positive, negative, and global clustering objectives. A SHAP-based pruning step in the NAS phase yields a compact, task-specific classifier that adapts to unseen HT variants while remaining stable across deployments. Experimental results show that SAND improves detection accuracy by up to 18.3% over state-of-the-art methods, is resilient against evasive Trojans, and generalizes well across diverse benchmarks. Overall, SAND offers a scalable, adaptive, and robust HT detection solution for real-world SoC security challenges.

Abstract

The globalized semiconductor supply chain has made Hardware Trojans (HT) a significant security threat to embedded systems, necessitating the design of efficient and adaptable detection mechanisms. Although machine learning-based HT detection techniques in the literature are promising, they suffer from ad hoc feature selection and a lack of adaptivity, both of which hinder their effectiveness across diverse HT attacks. In this paper, we propose SAND, a self-supervised and adaptive NAS-driven framework for efficient HT detection. Specifically, this paper makes three key contributions. (1) We leverage self-supervised learning (SSL) to enable automated feature extraction, eliminating the dependency on manually engineered features. (2) SAND integrates neural architecture search (NAS) to dynamically optimize the downstream classifier, allowing for seamless adaptation to unseen benchmarks with minimal fine-tuning. (3) Experimental results show that SAND achieves a significant improvement in detection accuracy (up to 18.3%) over state-of-the-art methods, exhibits high resilience against evasive Trojans, and demonstrates strong generalization.
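
The NAS step in contribution (2) is described elsewhere on this page as SHAP-based pruning of an over-parameterized SuperNet. The sketch below is only an illustration of that idea, not the authors' implementation: it computes exact Shapley values over a handful of candidate components and repeatedly drops the lowest-scoring one. The component names, the `toy_gain` table, and the additive value function are hypothetical stand-ins for a real validation-accuracy measurement, and a practical system would likely use an approximate SHAP estimator instead of exact enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(components, value_fn):
    """Exact Shapley value of each component under value_fn(frozenset) -> float."""
    n = len(components)
    phi = {}
    for c in components:
        others = [x for x in components if x != c]
        total = 0.0
        for r in range(n):  # r = size of the coalition that c joins
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = frozenset(subset)
                # Marginal contribution of c when added to coalition s
                total += weight * (value_fn(s | {c}) - value_fn(s))
        phi[c] = total
    return phi


def prune_supernet(components, value_fn, keep):
    """Progressively remove the component with the lowest Shapley value."""
    active = list(components)
    while len(active) > keep:
        phi = shapley_values(active, value_fn)
        weakest = min(active, key=lambda c: phi[c])
        active.remove(weakest)
    return active


if __name__ == "__main__":
    # Hypothetical per-component accuracy gains (assumption, for illustration only)
    toy_gain = {"conv3": 0.30, "conv5": 0.05, "skip": 0.20, "pool": 0.01}
    accuracy_proxy = lambda s: sum(toy_gain[c] for c in s)
    print(prune_supernet(list(toy_gain), accuracy_proxy, keep=2))
```

Because the toy value function is additive, each component's Shapley value equals its own gain, so the two weakest components are removed first. Exact enumeration grows exponentially with the number of components, which is why it only suits a small illustrative search space.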

Paper Structure

This paper contains 29 sections, 4 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: Illustration of contrastive learning. Given an anchor input $I^a$, a positive example $I^+$ is generated through data augmentation, while a negative example $I^-$ is selected from a different class. The model learns a feature representation such that the distance $\delta(I^a, I^+)$ is minimized, while the distance $\delta(I^a, I^-)$ is maximized.
  • Figure 2: Overview of the SAND framework. The main module consists of an upstream encoder, implemented through self-supervised learning (SSL), and a downstream classifier crafted by neural architecture search (NAS).
  • Figure 3: Illustration of global clustering. Two benign circuits ($C_1$, $C_2$) and their associated positive/negative samples are shown. (a) Without global structure, contrastive loss ($\mathcal{L}_P$, $\mathcal{L}_N$) may succeed if $C_1$ and $C_2$ are close, (b) but fails when they are far apart. (c) Global clustering loss pulls samples toward their class centroid ($\mu_k$), (d) enhancing inter-class separation and reducing intra-class variance.
  • Figure 4: The workflow of SHAP-based NAS for optimizing the downstream classifier architecture. The process begins with an over-parameterized SuperNet, from which components with low SHAP values are progressively pruned (shown faded).
  • Figure 5: 2D feature embeddings learned by SAND. (a) Epoch 0: Features are randomly scattered. (b) Epoch 50: Early clustering with some category overlap. (c) Epoch 150: Clear, well-separated clusters. (d) Without global clustering loss: fragmented intra-class clusters lead to intermixing.
  • ...and 3 more figures
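
The captions for Figures 1 and 3 describe three objectives: minimizing $\delta(I^a, I^+)$, maximizing $\delta(I^a, I^-)$, and pulling samples toward their class centroid $\mu_k$. A minimal sketch of how such terms could be combined is given below. It is not the paper's formulation: the Euclidean distance, the margin-based negative term, and the `margin` and `lam` parameters are all assumptions chosen for illustration.

```python
import math


def dist(u, v):
    """Euclidean distance, standing in for delta(.,.) (assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def hybrid_loss(anchors, positives, negatives, labels, margin=1.0, lam=0.1):
    """Illustrative L_P + L_N + lam * L_G over embedding vectors."""
    n = len(anchors)
    # L_P: pull each anchor toward its augmented positive
    loss_p = sum(dist(a, p) ** 2 for a, p in zip(anchors, positives)) / n
    # L_N: push each anchor away from its negative, up to a margin
    loss_n = sum(max(0.0, margin - dist(a, q)) ** 2
                 for a, q in zip(anchors, negatives)) / n
    # L_G: pull each anchor toward the centroid mu_k of its class
    loss_g = 0.0
    for k in set(labels):
        members = [a for a, y in zip(anchors, labels) if y == k]
        mu_k = centroid(members)
        loss_g += sum(dist(m, mu_k) ** 2 for m in members)
    loss_g /= n
    return loss_p + loss_n + lam * loss_g
```

In this sketch the centroid term is what penalizes the situation in Figure 3(b): two samples of the same class that sit far apart contribute to the loss even when each of their own positive and negative pairs is already well placed.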