Using a Feedback Loop for LLM-based Infrastructure as Code Generation
Mayur Amarnath Palavalli, Mark Santolucito
TL;DR
The paper investigates whether an LLM agent can generate AWS CloudFormation IaC and improve through a feedback loop that uses cfn-lint to provide error and warning messages. It builds a benchmark of 165 templates from 33 prompts to evaluate iterative revisions and reports that the feedback loop's effectiveness decays exponentially with each iteration, plateauing around iteration 5–6. The findings indicate that while LLM-assisted IaC generation benefits from automated feedback, achieving scalable, semantically correct infrastructure requires additional validation strategies and improved feedback mechanisms. This work highlights the gap between syntactic validity and semantic correctness in IaC generation and points to future research directions to reach robust, production-grade tooling.
Abstract
Code generation with Large Language Models (LLMs) has increased software developer productivity in coding tasks, but has yet to have a significant impact on the developer tasks that surround this code. In particular, infrastructure management remains an open challenge. We investigate the ability of an LLM agent to construct infrastructure using the Infrastructure as Code (IaC) paradigm. Specifically, we investigate a feedback loop that returns errors and warnings on the generated IaC so that the LLM agent can improve the code. We find that the loop's effectiveness decreases exponentially with each iteration until it plateaus and becomes ineffective.
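The feedback loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `feedback_loop`, `generate`, `lint`, and `lint_with_cfn_lint` are hypothetical, the LLM call is left as a caller-supplied function, and the default cap of 6 iterations is an assumption chosen only to mirror the plateau mentioned in the TL;DR. The cfn-lint helper assumes the `cfn-lint` CLI is installed and that its JSON output is a list of objects with `Level` and `Message` fields.

```python
import json
import os
import subprocess
import tempfile
from typing import Callable, List, Optional, Tuple

# Hypothetical signature for the LLM call: (prompt, previous template or None,
# linter messages) -> new template text.
Generator = Callable[[str, Optional[str], List[str]], str]
Linter = Callable[[str], List[str]]


def lint_with_cfn_lint(template: str) -> List[str]:
    """Run the cfn-lint CLI on a template and return its messages.

    Assumes cfn-lint is installed and on PATH.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
        f.write(template)
        path = f.name
    try:
        # cfn-lint exits non-zero when it reports findings, so no check=True.
        result = subprocess.run(
            ["cfn-lint", "--format", "json", path],
            capture_output=True,
            text=True,
        )
        findings = json.loads(result.stdout or "[]")
    finally:
        os.unlink(path)
    return [f"{m.get('Level', '?')}: {m.get('Message', '')}" for m in findings]


def feedback_loop(
    prompt: str,
    generate: Generator,
    lint: Linter,
    max_iterations: int = 6,  # assumption, not a value prescribed by the paper
) -> Tuple[str, List[int]]:
    """Generate a template, then revise it using linter feedback.

    Returns the last template and the number of linter messages seen at
    each iteration, which makes diminishing returns easy to inspect.
    """
    template = generate(prompt, None, [])
    history: List[int] = []
    for _ in range(max_iterations):
        messages = lint(template)
        history.append(len(messages))
        if not messages:
            break  # nothing left for the linter to report
        template = generate(prompt, template, messages)
    return template, history
```

A call such as `feedback_loop(prompt, my_llm_call, lint_with_cfn_lint)` would wire the pieces together, where `my_llm_call` stands in for whatever model client is used. Note that a clean linter result indicates only that the linter's checks passed, not that the infrastructure is semantically correct.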
