CCA Reimagined: An Exploratory Study of Large Language Models for Congestion Control

Xiaoxuan Qin, Yufei Wang, Longfei Shangguan

Abstract

In this paper, we conduct an emulation-guided study to systematically investigate the feasibility of large language model (LLM)-driven congestion control. The exploration is structured into two phases. The first phase de-risks the overall capability: we isolate the LLM's role to a single yet crucial congestion avoidance phase so that we can safely examine when to invoke the LLM, what information to provide, and how to formulate LLM instructions. Based on the insights gained, we extend the LLM's role to multiple congestion control phases and propose a more generic LLM-based congestion control policy. Our evaluation on both static and dynamic network traces demonstrates that the LLM-based solution can reduce latency by up to 50% with only marginal throughput sacrifice (e.g., less than 0.3%) compared to traditional CCAs. Overall, our exploration confirms the potential of LLMs for adaptive and general congestion control: when granted appropriate control freedom and paired with an effective triggering mechanism, LLM-based policies achieve significant performance gains, particularly under highly dynamic network conditions.
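
To make the two recurring design questions concrete (when to invoke the LLM, and what information to provide), the following minimal Python sketch pairs a latency-based trigger with a placeholder LLM call that returns a new congestion window. All names and thresholds here (NetStats, query_llm_for_cwnd, the 1.5x RTT-inflation threshold) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of an LLM-in-the-loop congestion-avoidance step.
# Names and thresholds are illustrative assumptions, not the
# paper's implementation.
from dataclasses import dataclass

@dataclass
class NetStats:
    rtt_ms: float      # most recent smoothed RTT
    min_rtt_ms: float  # minimum RTT observed on this path
    cwnd: int          # current congestion window (packets)

def latency_trigger(stats: NetStats, inflation: float = 1.5) -> bool:
    # Invoke the LLM only when RTT inflates well past the path minimum,
    # so ordinary ACKs stay on the classic fast path.
    return stats.rtt_ms > inflation * stats.min_rtt_ms

def query_llm_for_cwnd(stats: NetStats) -> int:
    # Placeholder for a chat-completion call; a real system would
    # serialize the statistics into a prompt and parse the reply.
    return max(stats.cwnd // 2, 2)  # stand-in: conservative backoff

def on_ack(stats: NetStats) -> int:
    if latency_trigger(stats):
        return query_llm_for_cwnd(stats)  # slow path: LLM decides
    return stats.cwnd + 1                 # fast path: additive increase
```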

Paper Structure

This paper contains 30 sections, 9 figures, 10 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Overview of the design of TCP-LLM in a hybrid WAN topology, developed around the following main questions: (1) How much freedom should the LLM be granted? (Sec. \ref{sec:baseline} vs. Sec. \ref{sec:generalized}) Probed by abstracting the congestion control problem into three phases and deploying the LLM in different phases. (2a) When to invoke the LLM? (Sec. \ref{sec:trigger_basline}, Sec. \ref{sec:trigger_generalized}) Addressed by building latency-based and ACK-based trigger modules. (2b) What information does the LLM need? (Sec. \ref{sec:input_design_baseline}, Sec. \ref{sec:input_design_generalized}) Explored by composing instructions from three components: general instructions, network statistics features, and statistics history (see the prompt-assembly sketch after this list).
  • Figure 2: The topology used in the LLM exploration.
  • Figure 3: Performance comparison of classic CCAs, the TCP-LLM limited design (TCP-LLM-L, §\ref{sec:baseline}), and the TCP-LLM general design (TCP-LLM-G, §\ref{sec:generalized}) on four different links: Static, Dynamic-Longisland, Dynamic-7Train, and Dynamic-QTrain.
  • Figure 4: Snapshots of TCP-LLM-L (§\ref{sec:baseline}), TCP-LLM-G (§\ref{sec:generalized}), NewReno, and BBR. Row 1: bandwidth changes of the bottleneck link over time. Row 2: RTT comparison over time. Row 3: CWND comparison over time. Row 4: bottleneck-link router queue size distribution over the full 120 s run.
  • Figure 5: LLM behavior in TCP-LLM-L (§\ref{sec:baseline}) over 4 topologies: (a) Static, (b) Dynamic-Longisland, (c) Dynamic-7Train, and (d) Dynamic-QTrain.
  • ...and 4 more figures
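
Figure 1 names three input components for the LLM instruction: general instructions, network statistics features, and statistics history. The sketch below shows one plausible way such a prompt could be assembled; the wording, field names, and history length are assumptions rather than the paper's exact format.

```python
# Illustrative assembly of an LLM instruction from the three input
# components named in Figure 1. Field names and wording are
# assumptions, not the paper's exact prompt.
GENERAL_INSTRUCTIONS = (
    "You are a TCP congestion controller. Given the statistics below, "
    "reply with a single integer: the new congestion window in packets."
)

def build_prompt(current: dict, history: list[dict]) -> str:
    stats_line = ", ".join(f"{k}={v}" for k, v in current.items())
    history_lines = "\n".join(
        ", ".join(f"{k}={v}" for k, v in h.items()) for h in history
    )
    return (
        f"{GENERAL_INSTRUCTIONS}\n\n"
        f"Current statistics: {stats_line}\n"
        f"Recent history (oldest first):\n{history_lines}"
    )

# Example usage with made-up numbers:
prompt = build_prompt(
    {"rtt_ms": 48.2, "min_rtt_ms": 20.1, "cwnd": 64},
    [{"rtt_ms": 22.0, "cwnd": 32}, {"rtt_ms": 35.4, "cwnd": 48}],
)
```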