VeriBug: An Attention-based Framework for Bug-Localization in Hardware Designs

Giuseppe Stracquadanio, Sourav Medya, Stefano Quer, Debjit Pal

TL;DR

VeriBug addresses the time-intensive problem of bug localization in complex hardware designs by learning RTL execution semantics directly from simulation traces. It treats localization as a data-driven task using an attention-based LSTM architecture over AST-context-embedded operands, producing heatmaps that explain and localize root causes at the RTL level. The method leverages GoldMine-derived CDFG/VDG artifacts to generalize to unseen designs without labeled corpora, and demonstrates strong performance with an average bug-coverage of 82.5% across real designs and injected bugs, along with qualitative heatmap visualizations for exact root-cause localization. This approach offers automated, explainable debugging support that integrates with existing verification workflows, enhancing efficiency in hardware debugging.

Abstract

In recent years, there has been an exponential growth in the size and complexity of System-on-Chip designs targeting different specialized applications. The cost of an undetected bug in these systems is much higher than in traditional processor systems, as it may imply the loss of property or life. The problem is further exacerbated by the ever-shrinking time-to-market and the ever-increasing demand to churn out billions of devices. Despite decades of research in simulation and formal methods, debugging and verification are still among the most time-consuming and resource-intensive processes in the contemporary hardware design cycle. In this work, we propose VeriBug, which leverages recent advances in deep learning to accelerate debugging at the Register-Transfer Level and generates explanations of likely root causes. First, VeriBug uses the control-data flow graph of a hardware design and learns to execute design statements by analyzing the context of operands and their assignments. Then, it assigns an importance score to each operand in a design statement and uses that score to generate explanations for failures. Finally, VeriBug produces a heatmap highlighting potentially buggy portions of the source code. Our experiments show that VeriBug can achieve an average bug localization coverage of 82.5% on open-source designs and different types of injected bugs.

Paper Structure

This paper contains 13 sections, 3 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: VeriBug workflow. (1) The Feature Extraction component extracts features from the dynamic analysis of the design. (2) The Deep-Learning Model learns execution semantics using these features to predict target values. (3) The Explainer component aggregates trace-level semantics into condensed execution information, producing the final heatmap $\mathbb{H}_t$.
  • Figure 2: Feature Extraction Module. (1) A dependence analysis produces the complete set $\mathbf{Dep}_t$ of control and data dependencies for the target output variable $\mathbf{t}$. (2) The slicing criterion uses the extracted set of dependencies $\mathbf{Dep}_t$ and the input vector $\mathbf{I}_n$ (e.g., $\{req_1 = 0, req_2 = 1, ...\}$) to dynamically slice the input design $\mathbf{D}$. (3) Finally, context is extracted for the operands of each slice statement, after the statements are encoded as Abstract Syntax Trees (ASTs). In the figure, the context of $req_1$ is the list of paths $\{[And, Rvalue, BlockingAssignment, Lvalue], [And, Not]\}$.
  • Figure 3: The Deep Learning Model of VeriBug. (1) For each statement $l_k$ and its AST, the model embeds operand contexts and encodes their assignments, concatenating the two into operand embeddings. (2) A weighted sum of the operand embeddings is computed, using attention weights produced by our attention module. (3) A final prediction of the output value of $l_k$ is made from the statement-level embedding produced by the weighted sum.
  • Figure 4: VeriBug qualitative results on realistic designs: examples of VeriBug-generated heatmaps. For comparison, we also report the operand importance scores in $\mathbb{C}_t$ (blue, deeper is more important), against which the importance scores in $\mathbb{F}_t$ (red, deeper is more important) are compared. The rightmost column reports the suspiciousness score for the statement $l_{bug}$ (i.e., the statement with the root cause). Note that $\mathbb{H}_t(l_{bug}) = \mathbb{F}_t(l_{bug})$ when the suspiciousness score for $l_{bug}$ is higher than the threshold.
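
The attention step described in the Figure 3 caption (a weighted sum of operand embeddings using attention weights) can be illustrated with a minimal sketch. It is not the paper's implementation: the captions do not say how the attention module scores operands, so the dot product against a `query` vector followed by a softmax is an assumption, and the names `attention_pool`, `query`, and the toy embedding values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(operand_embeddings, query):
    """Return (attention_weights, statement_embedding).

    operand_embeddings: list of equal-length float vectors, one per operand.
    query: hypothetical learned vector used to score operands (an assumption;
           the source does not specify the scoring function).
    """
    # Score each operand by its dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, emb)) for emb in operand_embeddings]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Statement-level embedding: attention-weighted sum of operand embeddings.
    dim = len(operand_embeddings[0])
    statement = [
        sum(w * emb[i] for w, emb in zip(weights, operand_embeddings))
        for i in range(dim)
    ]
    return weights, statement


# Toy example: three operands with 2-dimensional embeddings (made-up values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, statement = attention_pool(embeddings, query)
```

In this sketch the weights double as per-operand importance scores, which matches the role the abstract gives them: operands with larger weights contribute more to the statement-level embedding and would appear darker in a heatmap.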