AutoStreamPipe: LLM Assisted Automatic Generation of Data Stream Processing Pipelines
Abolfazl Younesi, Zahra Najafabadi Samani, Thomas Fahringer
TL;DR
AutoStreamPipe tackles the problem of converting high-level user intent into production-grade data stream processing pipelines by integrating LLM-driven intent understanding with a novel Hypergraph of Thoughts (HGoT) reasoning framework and a resilient multi-agent execution layer. The three-phase architecture (Phase 1 query analysis, Phase 2 HGOT-based graph building, Phase 3 resilient execution) enables coherent, multi-DSL pipeline synthesis across Flink, Storm, and Spark, while continuously updating knowledge via RAG and robust retries. Key contributions include end-to-end automation, a dedicated query analyzer, the HGoT formalism and tooling, a fault-tolerant multi-agent planner, an open-source prototype, and the Error-Free Score (EFS) metric for holistic evaluation. Empirical results show substantial improvements in development time ($x6.3$) and error reduction ($x5.19$), with ASP delivering high EFS across simple to complex pipelines, supporting real-world deployment and rapid iteration for domain experts.
Abstract
Data pipelines are essential in stream processing as they enable the efficient collection, processing, and delivery of real-time data, supporting rapid data analysis. In this paper, we present AutoStreamPipe, a novel framework that employs Large Language Models (LLMs) to automate the design, generation, and deployment of stream processing pipelines. AutoStreamPipe bridges the semantic gap between high-level user intent and platform-specific implementations across distributed stream processing systems for structured multi-agent reasoning by integrating a Hypergraph of Thoughts (HGoT) as an extended version of GoT. AutoStreamPipe combines resilient execution strategies, advanced query analysis, and HGoT to deliver pipelines with good accuracy. Experimental evaluations on diverse pipelines demonstrate that AutoStreamPipe significantly reduces development time (x6.3) and error rates (x5.19), as measured by a novel Error-Free Score (EFS), compared to LLM code-generation methods.
