ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation

Alessandro Ottaviano, Robert Balas, Giovanni Bambini, Antonio del Vecchio, Maicol Ciani, Davide Rossi, Luca Benini, Andrea Bartolini

TL;DR

ControlPULP tackles the HPC power/thermal management bottleneck by delivering an open-source, RISC-V-based on-chip power controller system with a scalable multi-core cluster and a dedicated DMA engine for real-time policy acceleration. The platform integrates a two-layer control firmware, a fast interrupt framework, and a lightweight RTOS (FreeRTOS) to meet tight timing constraints, and is validated through FPGA-based hardware-in-the-loop (HIL) emulation that couples a detailed plant model with the controller. Key findings are a 4.9× speedup of the control action on an eight-worker cluster over single-core execution for a 72-core target, DVFS tracking with a mean deviation within 3% of the plant's TDP, and a small area overhead (about 0.1% of a modern HPC CPU die). The work shows that this co-design, open tooling, and HIL validation enable more sophisticated control policies and rapid iteration, potentially narrowing the gap to industrial-grade solutions like OpenPOWER. Overall, ControlPULP offers a practical, scalable path to per-core power/thermal management in future many-core HPC processors and a reusable validation framework for on-chip control research.

Abstract

High-Performance Computing (HPC) processors are nowadays integrated Cyber-Physical Systems demanding complex and high-bandwidth closed-loop power and thermal control strategies. To efficiently satisfy real-time multi-input multi-output (MIMO) optimal power requirements, high-end processors integrate an on-die power controller system (PCS). While traditional PCSs are based on a simple microcontroller (MCU)-class core, more scalable and flexible PCS architectures are required to support advanced MIMO control algorithms for managing the ever-increasing number of cores, power states, and process, voltage, and temperature variability. This paper presents ControlPULP, an open-source, HW/SW RISC-V parallel PCS platform consisting of a single-core MCU with fast interrupt handling coupled with a scalable multi-core programmable cluster accelerator and a specialized DMA engine for the parallel acceleration of real-time power management policies. ControlPULP relies on FreeRTOS to schedule a reactive power control firmware (PCF) application layer. We demonstrate ControlPULP in a power management use-case targeting a next-generation 72-core HPC processor. We first show that the multi-core cluster accelerates the PCF, achieving a 4.9× speedup compared to single-core execution, enabling more advanced power management algorithms within the control hyper-period at a small area overhead, about 0.1% of the area of a modern HPC CPU die. We then assess the PCS and PCF by designing an FPGA-based, closed-loop emulation framework that leverages the heterogeneous SoCs paradigm, achieving DVFS tracking with a mean deviation within 3% of the plant's thermal design power (TDP) against a software-equivalent model-in-the-loop approach. Finally, we show that the proposed PCF compares favorably with an industry-grade control algorithm under computationally intensive workloads.
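
As a rough illustration of the figures quoted above, the sketch below splits the 72 controlled cores evenly across eight cluster workers and derives the parallel efficiency implied by the reported 4.9× speedup. The even static partitioning and the names `partition` and `parallel_efficiency` are assumptions made for illustration; they are not taken from the ControlPULP firmware.

```python
def partition(num_items: int, num_workers: int) -> list[range]:
    """Split item indices into contiguous, near-equal chunks.

    Assumption: a simple static partitioning; the actual PCF scheduling
    scheme is not described in this section.
    """
    base, extra = divmod(num_items, num_workers)
    chunks, start = [], 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_efficiency(speedup: float, num_workers: int) -> float:
    """Speedup divided by worker count (1.0 would be ideal scaling)."""
    return speedup / num_workers


CONTROLLED_CORES = 72    # target HPC processor, from the abstract
WORKERS = 8              # cluster workers, from the TL;DR
REPORTED_SPEEDUP = 4.9   # versus single-core execution, from the abstract

chunks = partition(CONTROLLED_CORES, WORKERS)
efficiency = parallel_efficiency(REPORTED_SPEEDUP, WORKERS)

print("cores per worker:", [len(c) for c in chunks])
print("implied parallel efficiency:", round(efficiency, 3))
```

Under these assumptions each worker handles nine cores, and the implied efficiency is roughly 61%. That would be consistent with part of the control step remaining serial or limited by data movement, although the abstract does not break the speedup down.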

Paper Structure (30 sections, 1 equation, 11 figures, 5 tables)

Figures (11)

  • Figure 1: High-level overview of the system. We highlight on-chip and off-chip ( and ), the (ControlPULP) and the IO interfaces. Furthermore, the figure details the phases described in Sec. \ref{['subsec:fw-archi']}
  • Figure 2: ControlPULP software stack. The application control policy (PCF) executes on top of FreeRTOS, which controls the hardware with target-specific drivers and
  • Figure 3: ControlPULP hardware architecture. On the left, the manager domain with the manager core and surrounding peripherals. On the right, the cluster domain accelerator with the eight cores (workers)
  • Figure 4: test procedure on an - with ControlPULP
  • Figure 5: ControlPULP RTL testbench simulation environment
  • ...and 6 more figures