SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware Design
Zeng Wang, Minghao Shao, Rupesh Karn, Likhitha Mankali, Jitendra Bhandari, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel
TL;DR
The paper presents SALAD, a framework for systematically applying machine unlearning to LLMs used in Verilog generation to counter four data-security threats: benchmark contamination, custom-design misuse, IP leakage, and malicious code. It evaluates six unlearning methods (GA, GD, PO, NPO, SimNPO, RMU) across four industrial case studies, using concrete RTL benchmarks and metrics such as FR, Min-K%++, PrivLeak, and Pass@K. The study finds that RMU and SimNPO offer the best balance between forgetting sensitive data and maintaining RTL utility, typically converging within 2–3 unlearning epochs, while GA and GD achieve stronger forgetting at the expense of downstream performance. These results provide a practical path toward secure, trustworthy LLM-aided hardware design and highlight the trade-off between how aggressively a method forgets and how well it preserves design quality.
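For readers unfamiliar with the utility metric named above, the following is a minimal sketch of how Pass@K is commonly computed for code-generation benchmarks, using the widely used unbiased estimator 1 - C(n-c, k) / C(n, k). It assumes n sampled completions per problem, of which c pass functional verification. The function name `pass_at_k` and the sample counts are illustrative assumptions; the summary does not state which estimator or sampling budget SALAD uses.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: completions sampled for one problem (assumed setup, not from the paper)
    c: completions among the n that pass the testbench
    k: evaluation budget, with 1 <= k <= n
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the fraction of size-k subsets made up only of failing samples.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical example: 10 Verilog completions sampled, 3 pass simulation.
score_k1 = pass_at_k(n=10, c=3, k=1)
score_k5 = pass_at_k(n=10, c=3, k=5)
```

A benchmark-level score is typically the mean of this value over all problems. In an unlearning setting, one would compare that mean before and after unlearning to see how much RTL utility was lost.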
Abstract
Large Language Models (LLMs) offer transformative capabilities for hardware design automation, particularly in Verilog code generation. However, they also pose significant data security challenges, including Verilog evaluation data contamination, intellectual property (IP) design leakage, and the risk of malicious Verilog generation. We introduce SALAD, a comprehensive assessment that leverages machine unlearning to mitigate these threats. Our approach enables the selective removal of contaminated benchmarks, sensitive IP and design artifacts, or malicious code patterns from pre-trained LLMs, all without requiring full retraining. Through detailed case studies, we demonstrate how machine unlearning techniques effectively reduce data security risks in LLM-aided hardware design.
