ERASER: Efficient RTL FAult Simulation Framework with Trimmed Execution Redundancy
Jiaping Tang, Jianan Mu, Silin Liu, Zizhen Liu, Feng Gu, Xinyu Zhang, Leyan Wang, Shenwen Liang, Jing Ye, Huawei Li, Xiaowei Li
TL;DR
This paper tackles the high cost of RTL fault simulation by identifying and eliminating implicit execution redundancy in behavioral nodes, which traditional input-based checks often miss. It introduces Eraser, a framework that combines runtime execution-path analysis with explicit redundancy detection to prune redundant simulations and coordinate efficient fault propagation across RTL graphs. Empirical results show that Eraser achieves average speedups of 3.9× over a commercial tool and 5.9× over an open-source baseline while preserving fault coverage, which is relevant to functional safety verification in automotive and SoC design. The approach reduces the time spent on RTL-level fault simulation in the design cycle, making fault coverage assessment practical without prohibitive computational cost.
Abstract
As intelligent computing devices become increasingly integrated into human life, ensuring the functional safety of the corresponding electronic chips becomes more critical. A key metric for functional safety is achieving sufficient fault coverage. To meet this requirement, extensive and time-consuming fault simulation of the RTL code is necessary during the chip design phase. The main overhead in RTL fault simulation comes from simulating behavioral nodes (always blocks). Because faults have limited propagation capacity, fault simulation results often match the good simulation results for many behavioral nodes. A key strategy for accelerating RTL fault simulation is therefore the identification and elimination of redundant simulations. Existing methods detect redundant executions by checking whether the fault inputs to each RTL node are consistent with the good inputs. However, we observe that this input comparison mechanism overlooks a significant amount of implicit redundant execution: although the fault inputs differ from the good inputs, the node's execution results remain unchanged. Our experiments reveal that this overlooked redundant execution constitutes nearly half of the total execution overhead of behavioral nodes, making it a significant bottleneck in current RTL fault simulation. The underlying reason for this overlooked redundancy is that, in these cases, the true execution paths within the behavioral nodes are not affected by the changes in input values. In this work, we propose a behavior-level redundancy detection algorithm that focuses on the true execution paths. Building on the elimination of redundant executions, we further develop an efficient RTL fault simulation framework, Eraser. Experimental results show that, compared to commercial tools and under the same fault coverage, our framework achieves a 3.9× improvement in simulation performance on average.
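The difference between input comparison and execution-path comparison can be illustrated with a toy model. The sketch below is not Eraser's algorithm. It is a minimal Python illustration in which a behavioral node is modeled as a function, and all names (TracedInputs, mux_node, input_check_redundant, path_check_redundant) are hypothetical. It uses a deliberately conservative rule: a faulty execution is treated as redundant if every signal read on the good execution path has the same value under the fault.

```python
class TracedInputs:
    """Wraps a node's input signals and records which ones are actually read."""

    def __init__(self, values):
        self._values = dict(values)
        self.read = set()

    def __getitem__(self, name):
        self.read.add(name)
        return self._values[name]


def mux_node(inp):
    """Toy stand-in for an always block: if (sel) out = a; else out = b;"""
    if inp["sel"]:
        return {"out": inp["a"]}
    return {"out": inp["b"]}


def input_check_redundant(good, fault):
    """Input-based check: redundant only if all inputs are identical."""
    return good == fault


def path_check_redundant(read_on_good_path, good, fault):
    """Path-based check (assumed rule): redundant if no signal read on the
    good execution path differs under the fault."""
    return all(good[s] == fault[s] for s in read_on_good_path)


# Good simulation: sel = 1, so only 'sel' and 'a' are read.
good = {"sel": 1, "a": 5, "b": 7}
traced = TracedInputs(good)
good_out = mux_node(traced)

# Fault on 'b', which lies on the branch that is not taken.
fault_b = {"sel": 1, "a": 5, "b": 0}
# Fault on 'sel', which changes the execution path.
fault_sel = {"sel": 0, "a": 5, "b": 7}

skip_b_input = input_check_redundant(good, fault_b)             # inputs differ
skip_b_path = path_check_redundant(traced.read, good, fault_b)  # path unaffected
skip_sel_path = path_check_redundant(traced.read, good, fault_sel)
```

Here the input-based check would still re-execute the node for the fault on b, even though the result cannot change, while the path-based check skips it. The fault on sel alters a signal that the good path reads, so it is not skipped. A real detector could be less conservative than this rule, since a differing value on a read signal may still leave the result unchanged.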
