Automated Hardware Trojan Insertion in Industrial-Scale Designs
Yaroslav Popryho, Debjit Pal, Inna Partin-Vaisband
TL;DR
Addresses the challenge of evaluating hardware Trojan detectors on industrial-scale SoCs by automating the insertion of Trojan-like patterns that preserve I/O behavior. Proposes a pipeline that builds connectivity graphs, mines rare regions using SCOAP testability metrics, and applies function-preserving graph rewrites guided by a learned policy and an LLM-generated template library. Produces Trojanized netlists with per-net and per-cone labels for reproducible, large-scale detector evaluation under extreme class imbalance (Trojan-labeled nets typically $\ll 0.1\%$ of all nets). Demonstrates that state-of-the-art graph-based detectors trained on small benchmarks struggle to detect unseen Trojans in large designs, highlighting the need for hierarchical, testability-aware models and broader template diversity. Provides a reproducible benchmark framework that bridges the gap between academic benchmarks and production-scale SoCs.
Abstract
Industrial Systems-on-Chips (SoCs) often comprise hundreds of thousands to millions of nets and millions to tens of millions of connectivity edges, making empirical evaluation of hardware-Trojan (HT) detectors on realistic designs both necessary and difficult. Public benchmarks remain significantly smaller and are hand-crafted, while releasing truly malicious RTL raises ethical and operational risks. This work presents an automated and scalable methodology for generating HT-like patterns in industrial-scale netlists, with the purpose of stress-testing detection tools without altering user-visible functionality. The pipeline (i) parses large gate-level designs into connectivity graphs, (ii) explores rare regions using SCOAP testability metrics, and (iii) applies parameterized, function-preserving graph transformations to synthesize trigger-payload pairs that mimic the statistical footprint of stealthy HTs. When evaluated on the benchmarks generated in this work, representative state-of-the-art graph-learning models fail to detect the inserted Trojans. The framework closes the evaluation gap between academic circuits and modern SoCs by providing reproducible challenge instances that advance security research without sharing step-by-step attack instructions.
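To make steps (i) and (ii) concrete, the following is a minimal sketch of how a gate-level netlist might be viewed as a connectivity graph and scored with the standard SCOAP combinational controllability rules. It is not the authors' implementation: the toy netlist, the names `NETLIST`, `scoap_controllability`, and `rare_candidates`, and the ranking heuristic (largest of CC0 and CC1) are all assumptions made for illustration, and SCOAP observability and sequential measures are omitted.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy netlist: output net -> (gate type, input nets).
NETLIST = {
    "n1": ("AND", ["a", "b"]),
    "n2": ("AND", ["n1", "c"]),
    "n3": ("OR", ["n2", "d"]),
    "y": ("NOT", ["n3"]),
}
PRIMARY_INPUTS = ["a", "b", "c", "d"]


def scoap_controllability(netlist, primary_inputs):
    """Return {net: (CC0, CC1)} using SCOAP combinational controllability."""
    # Primary inputs cost 1 to set to either value.
    cc = {pi: (1, 1) for pi in primary_inputs}
    # Connectivity graph: each driven net depends on its gate's input nets.
    deps = {out: set(ins) for out, (_, ins) in netlist.items()}
    # Visit nets so that every gate input is scored before the gate output.
    for net in TopologicalSorter(deps).static_order():
        if net in cc:
            continue  # primary input, already initialized
        gate, ins = netlist[net]
        cc0 = [cc[i][0] for i in ins]
        cc1 = [cc[i][1] for i in ins]
        if gate == "AND":
            # Output 0 needs any one input at 0; output 1 needs all inputs at 1.
            cc[net] = (min(cc0) + 1, sum(cc1) + 1)
        elif gate == "OR":
            # Output 0 needs all inputs at 0; output 1 needs any one input at 1.
            cc[net] = (sum(cc0) + 1, min(cc1) + 1)
        elif gate == "NOT":
            # Inversion swaps the two costs.
            cc[net] = (cc1[0] + 1, cc0[0] + 1)
        else:
            raise ValueError(f"unsupported gate type: {gate}")
    return cc


def rare_candidates(cc, primary_inputs, k):
    """Rank internal nets by their hardest-to-set value (an assumed heuristic)."""
    internal = [n for n in cc if n not in primary_inputs]
    # Sort by descending difficulty, breaking ties by net name for determinism.
    return sorted(internal, key=lambda n: (-max(cc[n]), n))[:k]


cc = scoap_controllability(NETLIST, PRIMARY_INPUTS)
print(rare_candidates(cc, PRIMARY_INPUTS, 2))
```

In this toy circuit the nested AND chain makes `n2` hard to drive to 1 (CC1 = 5), which is the kind of low-activity region a rare-signal search would flag. On an industrial-scale netlist the same traversal would have to run over millions of nets, so a production version would likely need an iterative, memory-conscious graph representation rather than Python dictionaries.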
