Generative AI in Data Center Networking: Fundamentals, Perspectives, and Case Study

Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Yonggang Wen, Dong In Kim

TL;DR

This study employs LLMs equipped with Retrieval Augmented Generation (RAG) to formulate optimization problems for DCNs and adopts Diffusion-Deep Reinforcement Learning (DRL) to optimize the RAG knowledge placement strategy, demonstrating the application of advanced GenAI methods within DCNs.

Abstract

Generative AI (GenAI), exemplified by Large Language Models (LLMs) such as OpenAI's ChatGPT, is revolutionizing various fields. Central to this transformation is Data Center Networking (DCN), which not only provides the computational power necessary for GenAI training and inference but also delivers GenAI-driven services to users. This article examines the interplay between GenAI and DCNs, highlighting their symbiotic relationship and mutual advancements. We begin by reviewing current challenges within DCNs and discussing how GenAI contributes to enhancing DCN capabilities through innovations such as data augmentation, process automation, and domain transfer. We then focus on analyzing the distinctive characteristics of GenAI workloads on DCNs, gaining insights that catalyze the evolution of DCNs to more effectively support GenAI and LLMs. Moreover, to illustrate the seamless integration of GenAI with DCNs, we present a case study on full-lifecycle DCN digital twins. In this study, we employ LLMs equipped with Retrieval Augmented Generation (RAG) to formulate optimization problems for DCNs and adopt Diffusion-Deep Reinforcement Learning (DRL) for optimizing the RAG knowledge placement strategy. This approach not only demonstrates the application of advanced GenAI methods within DCNs but also positions the digital twin as a pivotal GenAI service operating on DCNs. We anticipate that this article can promote further research into enhancing the virtuous interaction between GenAI and DCNs.

Paper Structure (28 sections, 5 figures)

This paper contains 28 sections and 5 figures.

Figures (5)

  • Figure 1: Illustration of DAI and GenAI applications in DCN. The middle part illustrates a representative data center layout, with the server room, management center, and power room. Servers can be connected in three-tier, fat-tree, or DCell topologies, which form the networking layer. Finally, DAI and GenAI in the AI layer collaborate to realize efficient and secure DCN operations.
  • Figure 2: An illustration of the applications of GenAI and LLMs in DCNs. The number of peer-reviewed publications on GenAI and LLMs per year is shown on the left-hand side (the publication data were collected from IEEE Xplore in Sept. 2024).
  • Figure 3: Top: Illustration of the GenAI/LLM lifecycle. Most of the issues arise in the pre-training stage. Bottom: Centralized and decentralized DCNs.
  • Figure 4: A: The automatic optimization formulation. B: Diffusion-DRL for optimization solving. C & D: Comparison of the optimization formulations produced by the backbone GPT-4 and the proposed digital twin.
  • Figure 5: The training curves and rewards of the random, greedy, and proposed knowledge placement policies.