An AI Architecture with the Capability to Classify and Explain Hardware Trojans
Paul Whitten, Francis Wolff, Chris Papachristou
TL;DR
This work tackles the lack of explainability in ML-based hardware Trojan detection by introducing two explainable architectures that operate on five gate-level features. The property-based approach uses 31 feature-property combinations with SVM inference engines and a knowledge base to vote on classifications, while the case-based approach employs a training index and KNN-style explanations tied to an SVM classifier. Experiments on Trust-Hub netlists show that static weighting with a balance factor performs comparably to dynamic weighting. The property-based explanations, however, are less interpretable than those of the case-based method, which achieves high correspondence with the classifier's decisions and provides concrete neighbor references. The findings suggest that case-based explanations enhance trust and contextual understanding in hardware Trojan detection, with potential scalability improvements for larger datasets.
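The case-based idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the training index contents, the net names, the feature values, the `explain` helper, and the choice of Euclidean distance with `k=3` are all hypothetical, and the classifier's decision is passed in as a given rather than computed by an SVM.

```python
from collections import Counter
from math import dist

# Hypothetical training index: (net name, five-feature vector, label).
# All names and values are made up for illustration only.
TRAINING_INDEX = [
    ("n1", (2, 1, 1, 3, 4), "normal"),
    ("n2", (3, 1, 2, 3, 4), "normal"),
    ("n3", (2, 2, 1, 4, 3), "normal"),
    ("t1", (9, 4, 5, 7, 1), "trojan"),
    ("t2", (8, 5, 4, 6, 1), "trojan"),
]


def explain(query, decision, k=3):
    """Return the k nearest training cases and whether they back the decision.

    `decision` is the label produced by some classifier (assumed given here).
    """
    # Rank indexed cases by Euclidean distance to the query vector.
    neighbors = sorted(TRAINING_INDEX, key=lambda case: dist(query, case[1]))[:k]
    # The majority label among the neighbors serves as the explanation's verdict.
    votes = Counter(label for _, _, label in neighbors)
    majority = votes.most_common(1)[0][0]
    return {
        "neighbors": [(name, label) for name, _, label in neighbors],
        "corresponds": majority == decision,
    }


result = explain((8, 4, 5, 7, 1), decision="trojan")
print(result["neighbors"], result["corresponds"])
```

In this sketch, the returned neighbors act as the concrete references a reviewer can inspect, and `corresponds` records whether the neighbor majority agrees with the classifier's decision.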
Abstract
Hardware Trojan detection methods based on machine learning (ML) techniques mainly identify suspected circuits but cannot explain how a decision was reached. An explainable methodology and architecture are introduced that build on existing hardware Trojan detection features. Results are provided for explaining digital hardware Trojans within a netlist using Trust-Hub Trojan benchmarks.
