LADs: Leveraging LLMs for AI-Driven DevOps

Ahmad Faraz Khan; Azal Ahmad Khan; Anas Mohamed; Haider Ali; Suchithra Moolinti; Sabaat Haroon; Usman Tahir; Mattia Fazzini; Ali R. Butt; Ali Anwar

TL;DR

The paper tackles the automation of cloud configuration in dynamic, heterogeneous environments. It proposes LADs, an agentic LLM framework that fuses instruction prompting, retrieval-augmented generation, few-shot learning, chain-of-thought reasoning, and feedback-based prompt chaining to generate and maintain cloud configurations with minimal human input. The authors introduce static and dynamic benchmarks, spanning Dask, Redis, and Ray, to evaluate alignment with user intent, performance, and cost. Their results show that LADs reduces manual effort, optimizes resource utilization, and improves reliability, with cost-efficient, smaller LLMs achieving strong performance; the work is released as open source to spur further AI-powered DevOps innovation.

Abstract

Automating cloud configuration and deployment remains a critical challenge due to evolving infrastructures, heterogeneous hardware, and fluctuating workloads. Existing solutions lack adaptability and require extensive manual tuning, leading to inefficiencies and misconfigurations. We introduce LADs, the first LLM-driven framework designed to tackle these challenges by ensuring robustness, adaptability, and efficiency in automated cloud management. Instead of merely applying existing techniques, LADs provides a principled approach to configuration optimization through in-depth analysis of which optimizations work under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios. For instance, we demonstrate how prompt chaining-based adaptive feedback loops enhance fault tolerance in multi-tenant environments and how structured log analysis with example shots improves configuration accuracy. Through extensive evaluations, we show that LADs reduces manual effort, optimizes resource utilization, and improves system reliability. By open-sourcing LADs, we aim to drive further innovation in AI-powered DevOps automation.
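To make the idea of feedback-based prompt chaining concrete, the following is a minimal sketch of a generate, deploy, and refine loop. It is not the LADs implementation: `refine_config`, `DeployResult`, `fake_generate`, and `fake_deploy` are hypothetical names introduced here for illustration, and the LLM and the cluster are replaced by simple stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DeployResult:
    ok: bool
    log: str  # deployment log text, used as feedback on failure


def refine_config(
    intent: str,
    generate: Callable[[str], str],
    deploy: Callable[[str], DeployResult],
    max_rounds: int = 3,
) -> Optional[str]:
    """Generate a config, deploy it, and feed failure logs into the next prompt."""
    prompt = f"User intent: {intent}\nProduce a configuration."
    for _ in range(max_rounds):
        config = generate(prompt)
        result = deploy(config)
        if result.ok:
            return config
        # Chain the failed config and its log into the next prompt.
        prompt = (
            f"User intent: {intent}\n"
            f"Previous configuration:\n{config}\n"
            f"Deployment log:\n{result.log}\n"
            "Produce a corrected configuration."
        )
    return None  # no working configuration within the round budget


# Hypothetical stand-in for an LLM call: it only "fixes" the config
# once it has seen a deployment log in the prompt.
def fake_generate(prompt: str) -> str:
    return "workers: 4" if "Deployment log" in prompt else "workers: 0"


# Hypothetical stand-in for a real deployment step.
def fake_deploy(config: str) -> DeployResult:
    if config == "workers: 0":
        return DeployResult(False, "error: at least one worker required")
    return DeployResult(True, "")


final_config = refine_config("run a small Dask cluster", fake_generate, fake_deploy)
print(final_config)
```

In this sketch the first attempt fails, its log is appended to the next prompt, and the second attempt succeeds. A real system would replace the stand-ins with an actual model call and deployment step, and would likely add retrieval and example shots to the prompt.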
