Scalable and Efficient Intra- and Inter-node Interconnection Networks for Post-Exascale Supercomputers and Data centers

Joaquin Tarraga-Moreno, Daniel Barley, Francisco J. Andujar Munoz, Jesus Escudero-Sahuquillo, Holger Froning, Pedro Javier Garcia, Francisco J. Quiles, Jose Duato

TL;DR

This paper addresses communication bottlenecks among accelerators in scalable, heterogeneous HPC and data-center systems. It analyzes the NVIDIA DGX GH200 configuration and slimmed fat-tree topologies, including a routing method that balances traffic across extended generalized fat trees (XGFTs), and develops a detailed node- and cluster-level network model for simulation. Key contributions include a descriptive characterization of the GH200's intra- and inter-node fabrics, an evaluation of slimmed fat-tree designs, and guidance on scalable interconnect architectures under post-exascale workloads. The findings demonstrate that carefully staged, hierarchical networks with efficient routing can deliver high throughput while reducing cost and complexity for AI, HPC, and analytics workloads.

Abstract

The rapid growth of data-intensive applications such as generative AI, scientific simulations, and large-scale analytics is driving modern supercomputers and data centers toward increasingly heterogeneous and tightly integrated architectures. These systems combine powerful CPUs and accelerators with emerging high-bandwidth memory and storage technologies to reduce data movement and improve computational efficiency. However, as the number of accelerators per node increases, communication bottlenecks emerge both within and between nodes, particularly when network resources are shared among heterogeneous components.

Paper Structure

This paper contains 6 sections, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Grace Hopper Superchip.
  • Figure 2: Node architecture.
  • Figure 3: Compute tray architecture.
  • Figure 4: DGX GH200 architecture.
  • Figure 5: DGX GH200 cluster performance as a function of traffic load (%).