Table of Contents
Fetching ...

HDLFORGE: A Two-Stage Multi-Agent Framework for Efficient Verilog Code Generation with Adaptive Model Escalation

Armin Abdollahi, Saeid Shokoufa, Negin Ashrafi, Mehdi Kamal, Massoud Pedram

TL;DR

Highlights include a counterexample-guided formal agent that converts bounded-model-checking traces into reusable micro-tests, significantly reducing bug detection time and repair iterations, and a portable escalation controller that can wrap existing Verilog LLM pipelines without modifying their internals.

Abstract

We present HDLFORGE, a two-stage multi-agent framework for automated Verilog generation that optimizes the trade-off between generation speed and accuracy. The system uses a compact coder with a medium-sized LLM by default (Stage A) and escalates to a stronger coder with an ultra-large LLM (Stage B) only when needed, guided by a calibrated score from inexpensive diagnostics including compilation, lint, and smoke tests. A key innovation is a counterexample-guided formal agent that converts bounded-model-checking traces into reusable micro-tests, significantly reducing bug detection time and repair iterations. The portable escalation controller can wrap existing Verilog LLM pipelines without modifying their internals. Evaluated on VerilogEval Human, VerilogEval V2, and RTLLM benchmarks, HDLFORGE demonstrates improved accuracy-latency trade-offs compared to single-stage systems through comprehensive analysis of wall-clock time distributions, escalation thresholds, and agent ablations. On VerilogEval Human and VerilogEval V2, HDLFORGE-Qwen achieves 91.2% and 91.8% Pass@1 with roughly 50% lower median latency, dramatically improving accuracy over other medium-sized models, and 97.2% Pass@5 on RTLLM.

HDLFORGE: A Two-Stage Multi-Agent Framework for Efficient Verilog Code Generation with Adaptive Model Escalation

TL;DR

Highlights include a counterexample-guided formal agent that converts bounded-model-checking traces into reusable micro-tests, significantly reducing bug detection time and repair iterations, and a portable escalation controller that can wrap existing Verilog LLM pipelines without modifying their internals.

Abstract

We present HDLFORGE, a two-stage multi-agent framework for automated Verilog generation that optimizes the trade-off between generation speed and accuracy. The system uses a compact coder with a medium-sized LLM by default (Stage A) and escalates to a stronger coder with an ultra-large LLM (Stage B) only when needed, guided by a calibrated score from inexpensive diagnostics including compilation, lint, and smoke tests. A key innovation is a counterexample-guided formal agent that converts bounded-model-checking traces into reusable micro-tests, significantly reducing bug detection time and repair iterations. The portable escalation controller can wrap existing Verilog LLM pipelines without modifying their internals. Evaluated on VerilogEval Human, VerilogEval V2, and RTLLM benchmarks, HDLFORGE demonstrates improved accuracy-latency trade-offs compared to single-stage systems through comprehensive analysis of wall-clock time distributions, escalation thresholds, and agent ablations. On VerilogEval Human and VerilogEval V2, HDLFORGE-Qwen achieves 91.2% and 91.8% Pass@1 with roughly 50% lower median latency, dramatically improving accuracy over other medium-sized models, and 97.2% Pass@5 on RTLLM.
Paper Structure (24 sections, 12 equations, 2 figures, 8 tables)

This paper contains 24 sections, 12 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: HDLFORGE two-stage cascade: Stage A iterates generate--judge--repair with micro-tests, and escalates once to Stage B when progress stalls; both stages are validated on the official testbench.
  • Figure 2: Time to pass summary for Stage A Stage B and total. Bars show median mean p90 and p95 on VerilogEval