TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans

Md Omar Faruque, Peter Jamieson, Ahmad Patooghy, Abdel-Hameed A. Badawy

TL;DR

TrojanWhisper investigates using general-purpose LLMs to detect and localize Hardware Trojans (HTs) in RTL designs without fine-tuning. The approach combines an HT signature generation engine, a perturbation engine that stress-tests the models, and an LLM detection engine that leverages in-context learning. It is evaluated on 14 HT-bearing benchmarks across SRAM, UART, and AES designs. Results show Gemini 1.5 Pro achieving perfect detection under both baseline and obfuscation scenarios. Localization and HT-type classification prove more challenging, and localization performance degrades further under perturbations. The study demonstrates the promise of LLM-based HT detection as a complement to existing methods, while highlighting limitations in scalability, determinism, and precise localization that warrant further research.

Abstract

Existing Hardware Trojan (HT) detection methods face several critical limitations: logic testing struggles with scalability and coverage for large designs, side-channel analysis requires golden reference chips, and formal verification methods suffer from state-space explosion. The emergence of Large Language Models (LLMs) offers a promising new direction for HT detection by leveraging their natural language understanding and reasoning capabilities. For the first time, this paper explores the potential of general-purpose LLMs in detecting various HTs inserted in Register Transfer Level (RTL) designs, including SRAM, AES, and UART modules. We propose a novel tool that systematically assesses state-of-the-art LLMs (GPT-4o, Gemini 1.5 Pro, and Llama 3.1) in detecting HTs without prior fine-tuning. To address potential training data bias, the tool implements perturbation techniques, namely variable name obfuscation and design restructuring, that make the test cases harder for the evaluated LLMs. Our experimental evaluation demonstrates perfect detection rates by GPT-4o and Gemini 1.5 Pro in baseline scenarios (100%/100% precision/recall), with both models achieving better trigger line coverage (TLC: 0.82-0.98) than payload line coverage (PLC: 0.32-0.46). Under code perturbation, Gemini 1.5 Pro maintains perfect detection performance (100%/100%), whereas GPT-4o (100%/85.7%) and Llama 3.1 (66.7%/85.7%) show some degradation in detection rates, and all models experience decreased accuracy in localizing both triggers and payloads. This paper validates the potential of LLM approaches for hardware security applications and highlights areas for future improvement.
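
To make the reported metrics concrete, the following is a minimal Python sketch of how precision, recall, and a line-coverage score could be computed. It is not the authors' implementation. The function names are hypothetical, and the definition of line coverage used here (the fraction of ground-truth Trojan lines that the model also reported) is an assumption, since the abstract does not define TLC or PLC. The counts in the example are illustrative only, chosen because 12 detected out of 14 HT-bearing designs reproduces the 85.7% recall figure; the abstract does not report the underlying counts.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def line_coverage(true_lines: set[int], reported_lines: set[int]) -> float:
    """ASSUMED definition: share of ground-truth Trojan lines (trigger or
    payload) that also appear among the lines the model reported."""
    if not true_lines:
        return 0.0
    return len(true_lines & reported_lines) / len(true_lines)


# Illustrative counts (not reported in the paper): 12 of 14 Trojans flagged,
# no clean design flagged.
p, r = precision_recall(tp=12, fp=0, fn=2)
print(f"precision={p:.1%} recall={r:.1%}")  # precision=100.0% recall=85.7%

# Hypothetical payload spanning lines 10-13; the model reports 10, 11 and 40.
plc = line_coverage({10, 11, 12, 13}, {10, 11, 40})
print(f"coverage={plc:.2f}")  # coverage=0.50
```

Under this assumed definition, reporting extra lines does not lower the coverage score, so a gap such as the one between TLC and PLC would indicate that payload lines are missed more often than trigger lines.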

Paper Structure

This paper contains 15 sections, 1 figure, 6 tables.

Figures (1)

  • Figure 1: TrojanWhisper detection of an information leakage HT in SRAM design (SRAM-T110 tjbench): HT signatures (a), original HT (SRAM-T110) implementation (b), obfuscated HT (c), and detection results (d).