MALF: A Multi-Agent LLM Framework for Intelligent Fuzzing of Industrial Control Protocols
Bowei Ning, Xuejun Zong, Kan He
TL;DR
MALF tackles vulnerability discovery in industrial control protocols (ICPs) by combining multi-agent large language models (LLMs) with Retrieval-Augmented Generation (RAG) and Quantized Low-Rank Adaptation (QLoRA) to automate seed generation, test-case mutation, and real-time feedback. On Modbus/TCP, S7Comm, and Ethernet/IP, the framework outperforms traditional fuzzers, with a test case pass rate (TCPR) of roughly 88–92%, seed coverage above 90%, and mutation entropy between 4.2 and 4.6 bits. Real-world validation in an industrial attack-defense range yielded multiple CVEs and three zero-day findings, including CNVD-2024-16009. MALF's modular, scalable architecture, which pairs RAG-driven knowledge retrieval with four specialized agents and a ZeroMQ-based coordination layer, points toward automated, LLM-driven fuzzing of complex ICPs and holds promise for broader adoption in ICS cybersecurity workflows.
Abstract
Industrial control systems (ICS) are vital to modern infrastructure but increasingly vulnerable to cybersecurity threats, particularly through weaknesses in their communication protocols. This paper presents MALF (Multi-Agent LLM Fuzzing Framework), a fuzzing solution that integrates large language models (LLMs) with multi-agent coordination to identify vulnerabilities in industrial control protocols (ICPs). By leveraging Retrieval-Augmented Generation (RAG) for domain-specific knowledge and QLoRA fine-tuning for protocol-aware input generation, MALF improves the precision and adaptability of fuzz testing. The multi-agent framework optimizes seed generation, mutation strategies, and feedback-driven refinement, leading to improved vulnerability discovery. Experiments on Modbus/TCP, S7Comm, and Ethernet/IP show that MALF surpasses traditional methods, achieving a test case pass rate (TCPR) of 88–92% and triggering more exceptions (ETN). MALF also maintains over 90% seed coverage and Shannon entropy values between 4.2 and 4.6 bits, indicating diverse, protocol-compliant mutations. Deployed in a real-world Industrial Attack-Defense Range for power plants, MALF identified critical vulnerabilities, including three zero-day flaws, one of which was confirmed and registered by CNVD. These results validate MALF's effectiveness in real-world fuzzing applications. This research highlights the potential of multi-agent LLMs in ICS cybersecurity, offering a scalable, automated framework for vulnerability discovery that strengthens critical infrastructure security against emerging threats.
