Disaggregated Architectures and the Redesign of Data Center Ecosystems: Scheduling, Pooling, and Infrastructure Trade-offs

Chao Guo, Jiahe Xu, Moshe Zukerman

TL;DR

This paper surveys hardware disaggregation as a pathway to transform data center ecosystems into flexible resource pools, highlighting motivations, industry progress (notably CXL-based pooling and memory tiering), and the need for cross-layer co-design of pooling, scheduling, and infrastructure. It presents a numerical case study with an ILP-based evaluation that compares pool configurations and shows how design choices affect utilization and total cost. The authors discuss challenges across pool design, diversified disaggregation scales, integration with traditional servers, and power/cooling constraints, arguing for cross-layer optimization and adaptive orchestration. Overall, the work underscores the potential of disaggregation to reshape DCs while acknowledging practical hurdles and outlining concrete research directions.

Abstract

Hardware disaggregation seeks to transform Data Center (DC) resources from traditional server fleets into unified resource pools. Despite existing challenges that may hinder its full realization, significant progress has been made in both industry and academia. In this article, we provide an overview of the motivations and recent advancements in hardware disaggregation. We further discuss the research challenges and opportunities associated with disaggregated architectures, focusing on aspects that have received limited attention. We argue that hardware disaggregation has the potential to reshape the entire DC ecosystem, impacting application design, resource scheduling, hardware configuration, cooling, and power system optimization. Additionally, we present a numerical study to illustrate several key aspects of these challenges.

Paper Structure

This paper contains 18 sections, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: An idealized disaggregated architecture.
  • Figure 2: Cooling, power, and resource management for a DDC.
  • Figure 3: Resource utilization under different pool configurations.
  • Figure 4: Cost under different pool configurations.