LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations
Ziyang Ye, Triet Huynh Minh Le, M. Ali Babar
TL;DR
LLMSecConfig addresses the gap in automated remediation for security misconfigurations in container orchestrators by integrating static analysis with Large Language Models via Retrieval-Augmented Generation (RAG). The framework uses Checkov for detection, retrieves multi-source context through RAG (static analysis tool outputs, policy source code, and Prisma Cloud documentation), and generates repairs iteratively, validating each candidate with YAML parsing and security checks. Across 1,000 real-world Kubernetes configurations, the open-source Mistral Large 2 model achieves a 94.3% repair rate with perfect parse and validation results and minimal introduced errors, outperforming GPT-4o-mini. The work delivers an end-to-end implementation and an open dataset, enabling researchers and practitioners to advance automated container security management and reduce manual remediation effort.
Abstract
Security misconfigurations in Container Orchestrators (COs) can pose serious threats to software systems. While Static Analysis Tools (SATs) can effectively detect these security vulnerabilities, the industry currently lacks automated solutions capable of fixing these misconfigurations. The emergence of Large Language Models (LLMs), with their proven capabilities in code understanding and generation, presents an opportunity to address this limitation. This study introduces LLMSecConfig, an innovative framework that bridges this gap by combining SATs with LLMs. Our approach leverages advanced prompting techniques and Retrieval-Augmented Generation (RAG) to automatically repair security misconfigurations while preserving operational functionality. In an evaluation on 1,000 real-world Kubernetes configurations, LLMSecConfig achieved a 94% success rate while maintaining a low rate of introducing new misconfigurations. Our work makes a promising step towards automated container security management, reducing the manual effort required for configuration maintenance.
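The detect, retrieve, repair, and validate loop summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `repair_config`, the `RepairResult` type, the retry limit of three, and the rule of rejecting candidates that introduce new findings are all assumptions made for illustration. The scanner, retriever, LLM call, and YAML check are passed in as plain callables, so no real Checkov or model API is invoked, and the `toy_*` helpers and the check ID `TOY_PRIV_ESC` are made up.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RepairResult:
    config: str           # last accepted configuration text
    fixed: bool           # True if no findings remain
    attempts: int         # number of repair attempts made
    remaining: List[str]  # IDs of checks that still fail


def repair_config(
    config: str,
    scan: Callable[[str], List[str]],                     # SAT stand-in: returns failing check IDs
    retrieve_context: Callable[[List[str]], str],         # RAG stand-in: docs for those checks
    propose_fix: Callable[[str, List[str], str], str],    # LLM stand-in: returns a candidate config
    is_valid_yaml: Callable[[str], bool],                 # parser stand-in
    max_attempts: int = 3,                                # assumed retry budget
) -> RepairResult:
    findings = scan(config)
    current = config
    attempts = 0
    while findings and attempts < max_attempts:
        attempts += 1
        context = retrieve_context(findings)
        candidate = propose_fix(current, findings, context)
        if not is_valid_yaml(candidate):
            continue  # unparsable output: retry from the last accepted config
        new_findings = scan(candidate)
        if set(new_findings) - set(findings):
            continue  # candidate introduced a new misconfiguration: reject it
        current, findings = candidate, new_findings
    return RepairResult(current, not findings, attempts, findings)


# Toy stand-ins so the sketch runs without a scanner or a model (all hypothetical).
def toy_scan(cfg: str) -> List[str]:
    return [] if "allowPrivilegeEscalation: false" in cfg else ["TOY_PRIV_ESC"]


def toy_context(findings: List[str]) -> str:
    return "Set allowPrivilegeEscalation to false."


def toy_fix(cfg: str, findings: List[str], context: str) -> str:
    return cfg + "\n    allowPrivilegeEscalation: false"


def toy_valid(cfg: str) -> bool:
    return "\t" not in cfg  # crude placeholder for a real YAML parse


result = repair_config(
    "securityContext:\n    privileged: false",
    toy_scan, toy_context, toy_fix, toy_valid,
)
```

Passing the four stages in as callables keeps the control flow separate from any particular scanner or model, which makes the accept/reject logic easy to test in isolation. In practice each stand-in would be replaced by a real scanner invocation, a document retriever, an LLM call, and a YAML parser.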
