Combined power management and congestion control in High-Speed Ethernet-based Networks for Supercomputers and Data Centers
Miguel Sánchez de la Rosa, Francisco J. Andújar, Jesus Escudero-Sahuquillo, José L. Sánchez, Francisco J. Alfaro-Cortés
TL;DR
The paper tackles energy efficiency in interconnection networks for HPC and data-center settings under congestion. It proposes a combined approach that couples power management (notably PerfBound and its enhancement PerfBoundCorrect) with congestion control strategies and queueing innovations to adapt energy use to traffic conditions. Key findings show that PerfBound reduces energy consumption compared to fixed wake-up schemes, and that integrating congestion-control-aware queueing (SQS) can mitigate performance penalties, yielding net gains even under heavy load. This work provides practical guidance for designing energy-proportional, high-speed Ethernet-based networks in HPC and data-center architectures.
Abstract
The demand for computing in our daily lives has led to the proliferation of data centers that power many indispensable services. At the same time, computing has become essential to research in various scientific fields, which requires supercomputers with vast computing capabilities to produce results in a reasonable time. In scale and complexity, these systems compare to our day-to-day devices as a living organism compares to a cell. To make them work properly, we need state-of-the-art technology and engineering, not just raw resources. Interconnecting the computing nodes that make up such a system is a delicate task, as the network can become the bottleneck for the whole infrastructure. In this work, we explore two aspects of the network, how to prevent degradation under heavy use with congestion control and how to save energy when idle with power management, as well as how the two may interact.
