The Vision of Autonomic Computing: Can LLMs Make It a Reality?
Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
TL;DR
This work investigates whether Large Language Models can realize the Vision of Autonomic Computing by implementing a hierarchical, LLM-driven multi-agent framework for microservice management. It defines a five-level taxonomy of autonomy, deploys an online benchmark built on the Sock Shop microservice demo with chaos injections, and evaluates both the low-level autonomic agents and the high-level group manager. Results show significant progress toward Level 3 autonomy: the system detects and mitigates issues effectively, though full end-to-end self-healing (Level 5) remains challenging. The work indicates that LLM-enabled autonomic management is practical for cloud-native microservices and offers actionable insights and a benchmark for future research.
Abstract
The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments. Despite decades of research, achieving ACV remains challenging due to the dynamic and complex nature of modern computing systems. Recent advancements in Large Language Models (LLMs) offer promising solutions to these challenges by leveraging their extensive knowledge, language understanding, and task automation capabilities. This paper explores the feasibility of realizing ACV through an LLM-based multi-agent framework for microservice management. We introduce a five-level taxonomy for autonomous service maintenance and present an online evaluation benchmark based on the Sock Shop microservice demo project to assess our framework's performance. Our findings demonstrate significant progress towards achieving Level 3 autonomy, highlighting the effectiveness of LLMs in detecting and resolving issues within microservice architectures. This study contributes to advancing autonomic computing by pioneering the integration of LLMs into microservice management frameworks, paving the way for more adaptive and self-managing computing systems. The code will be made available at https://aka.ms/ACV-LLM.
