Design and implementation of a synchronous Hardware Performance Monitor for a RISC-V space-oriented processor
Miguel Jiménez Arribas, Agustín Martínez Hellín, Manuel Prieto Mateo, Iván Gamino del Río, Andrea Fernandez Gallego, Oscar Rodríguez Polo, Antonio da Silva, Pablo Parra, Sebastián Sánchez
TL;DR
The paper addresses the need for precise timing and behavior statistics in space-grade RISC-V processors by introducing a synchronous Performance Measurement Unit (PMU) that decouples event triggering from counting and aligns event increments with instruction retirement. Decentralized event triggering across the pipeline, combined with retirement-synchronized counting, yields accurate per-instruction event attribution and makes the unit easy to extend. Validation on a RISC-V On-Board Computer (OBC) using the Dhrystone and CoreMark benchmarks, plus cross-platform comparisons, demonstrates correct operation, reproducibility, and a clear execution model, while incurring only modest resource and power overheads. The PMU improves observability and debugging for safety-critical onboard software, with practical implications for reliability analyses and future architectural enhancements.
Abstract
The ability to collect statistics about the execution of a program within a CPU is of the utmost importance across all fields of computing, since it allows the timing performance of a program to be characterized. This capability is even more relevant in safety-critical software systems, where it is mandatory to analyze software timing requirements to ensure the correct operation of the programs. Moreover, properly evaluating and verifying the extra-functional properties of these systems requires more than timing performance: a CPU exposes many other statistics, such as those associated with resource utilization. In this paper, we showcase a Performance Measurement Unit (PMU), also known as a Hardware Performance Monitor, integrated into a RISC-V On-Board Computer (OBC) designed for space applications by our research group. The monitoring technique features a novel approach whereby triggered events are not counted immediately but are instead propagated through the pipeline so that their annotation is synchronized with the executed instruction. Additionally, we demonstrate the use of this PMU in a process to characterize the execution model of the processor. Finally, as an example of the statistics provided by the PMU, we show the results obtained by running the CoreMark and Dhrystone benchmarks on the RISC-V OBC.
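
The idea of deferring the count until the instruction completes can be illustrated with a small behavioral model. The Python sketch below is not the authors' hardware design; it is a toy, software-only illustration under stated assumptions. The class and method names (`Instr`, `SyncPMU`, `trigger`, `retire`, `flush`), the event names, and the rule that a squashed instruction discards its pending events are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Instr:
    """An in-flight instruction carrying the events raised on its behalf."""
    pc: int
    events: set = field(default_factory=set)


class SyncPMU:
    """Toy model: events travel with the instruction and are counted at retirement."""

    def __init__(self):
        self.counters = Counter()   # global event counters
        self.per_pc = {}            # per-instruction attribution, keyed by PC

    def trigger(self, instr, event):
        # A pipeline stage raises an event: tag the instruction, do not count yet.
        instr.events.add(event)

    def retire(self, instr):
        # Counters change only here, so each increment belongs to a retired instruction.
        self.counters["instret"] += 1
        self.counters.update(instr.events)
        self.per_pc.setdefault(instr.pc, Counter()).update(instr.events)

    def flush(self, instr):
        # Assumption of this sketch: a squashed instruction drops its pending events.
        instr.events.clear()


# Hypothetical scenario: one load that misses the data cache and retires,
# and one wrong-path instruction that misses the instruction cache and is squashed.
pmu = SyncPMU()
load = Instr(pc=0x100)
wrong_path = Instr(pc=0x104)

pmu.trigger(load, "dcache_miss")
pmu.trigger(wrong_path, "icache_miss")
pmu.flush(wrong_path)
pmu.retire(load)
```

In this model, a counter that incremented at trigger time would have recorded the instruction-cache miss of the squashed instruction, whereas the retirement-synchronized counters record only the data-cache miss and attribute it to the load at `0x100`.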
