The Vision of Autonomic Computing: Can LLMs Make It a Reality?

Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

TL;DR

This work investigates whether Large Language Models can realize Autonomic Computing by implementing a hierarchical, LLM-driven multi-agent framework for microservice management. It defines a five-level autonomy taxonomy, deploys an online Sock Shop benchmark with chaos injections, and evaluates both low-level autonomic agents and a high-level group manager. Results show that the system achieves Level 3 autonomy, effectively detecting and mitigating issues, though full end-to-end self-healing (Level 5) remains challenging. The work demonstrates the practical viability of LLM-enabled autonomic management in cloud-native microservices and presents actionable insights and a benchmark for future research.

Abstract

The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments. Despite decades of research, achieving ACV remains challenging due to the dynamic and complex nature of modern computing systems. Recent advancements in Large Language Models (LLMs) offer promising solutions to these challenges by leveraging their extensive knowledge, language understanding, and task automation capabilities. This paper explores the feasibility of realizing ACV through an LLM-based multi-agent framework for microservice management. We introduce a five-level taxonomy for autonomous service maintenance and present an online evaluation benchmark based on the Sock Shop microservice demo project to assess our framework's performance. Our findings demonstrate significant progress towards achieving Level 3 autonomy, highlighting the effectiveness of LLMs in detecting and resolving issues within microservice architectures. This study contributes to advancing autonomic computing by pioneering the integration of LLMs into microservice management frameworks, paving the way for more adaptive and self-managing computing systems. The code will be made available at https://aka.ms/ACV-LLM.

Paper Structure (19 sections, 4 figures, 17 tables)

Figures (4)

  • Figure 1: Hierarchical service management framework with LLM-based multi-agent design.
  • Figure 2: Illustration of the Sock Shop example: Each microservice is converted into an LLM-based autonomic agent, managed by an autonomic layer comprising a Planner and Executor. The high-level group manager employs a Plan-Execute feedback mechanism to generate plans and assign sub-tasks to low-level autonomic agents. A decoupled message queue system serves as the middleware to manage communication, collecting feedback and unresolved issues.
  • Figure 3: Taxonomy of autonomous levels in service maintenance, focusing on Self-Healing and Self-Optimization.
  • Figure 4: Sequence diagram for the Latency Reduction-Group task applied to the high-level group manager.
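
The hierarchy described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `Message`, `AutonomicAgent`, and `run_round` are hypothetical, the Planner and Executor are fixed stubs rather than LLM calls, the group manager's planning step is reduced to forwarding one goal to every service, and two in-process `queue.Queue` objects stand in for the decoupled message queue middleware.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class Message:
    sender: str
    recipient: str
    body: str
    resolved: bool = False


class AutonomicAgent:
    """Hypothetical low-level agent wrapping a single microservice."""

    def __init__(self, service: str, feedback: Queue, can_resolve: bool = True):
        self.service = service
        self.feedback = feedback
        self.can_resolve = can_resolve  # assumption: simulates success/failure

    def plan(self, sub_task: str) -> list[str]:
        # Stub for the Planner; a real agent would query an LLM here.
        return [f"inspect {self.service}", f"apply '{sub_task}'"]

    def execute(self, steps: list[str]) -> bool:
        # Stub for the Executor; reports whether the steps resolved the issue.
        return self.can_resolve and len(steps) > 0

    def handle(self, msg: Message) -> None:
        ok = self.execute(self.plan(msg.body))
        # Feedback goes back through the queue, not directly to the manager.
        self.feedback.put(Message(self.service, msg.sender, msg.body, ok))


def run_round(goal: str, services: list[str], failing=frozenset()) -> list[str]:
    """One Plan-Execute round of a hypothetical group manager.

    Returns the services that reported an unresolved issue.
    """
    tasks, feedback = Queue(), Queue()
    agents = {
        s: AutonomicAgent(s, feedback, can_resolve=s not in failing)
        for s in services
    }

    # Plan: the manager assigns a sub-task to each low-level agent.
    for s in services:
        tasks.put(Message("manager", s, goal))

    # Middleware: deliver each queued sub-task to its recipient.
    while not tasks.empty():
        msg = tasks.get()
        agents[msg.recipient].handle(msg)

    # Feedback: the manager collects replies and keeps unresolved issues.
    unresolved = []
    while not feedback.empty():
        reply = feedback.get()
        if not reply.resolved:
            unresolved.append(reply.sender)
    return unresolved


if __name__ == "__main__":
    print(run_round("reduce latency", ["front-end", "catalogue"],
                    failing={"catalogue"}))
```

In this sketch the manager and the agents never call each other directly; they only exchange `Message` objects through the queues, which is the decoupling the caption describes. A fuller version would feed the unresolved list into the manager's next planning round.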