Large Language Model Based Multi-Agent System Augmented Complex Event Processing Pipeline for Internet of Multimedia Things

Talha Zeeshan, Abhishek Kumar, Susanna Pirttikangas, Sasu Tarkoma

TL;DR

The paper addresses scaling complex event processing for video queries in IoT by integrating Large Language Model–based multi-agent systems with a pub/sub data fabric. It implements a proof-of-concept using the AutoGen framework and Kafka to orchestrate LLM agents within a CEP pipeline, enabling autonomous workflows across cloud and edge resources. Through experiments varying agent counts, video complexity, and resolutions, it reveals clear latency–complexity trade-offs while maintaining high narrative coherence in descriptions. The study contributes a survey of LLM-based autonomous agents, a deployable MAS-augmented CEP prototype, and practical insights for integrating such systems with existing pub/sub infrastructures in distributed AI environments.

Abstract

This paper presents the development and evaluation of a multi-agent system framework based on Large Language Models (LLMs), also known as foundation models, for complex event processing (CEP), with a focus on video query processing use cases. The primary goal is to create a proof-of-concept (POC) that integrates state-of-the-art LLM orchestration frameworks with publish/subscribe (pub/sub) tools to address the integration of LLMs with current CEP systems. Utilizing the AutoGen framework in conjunction with Kafka message brokers, the system demonstrates an autonomous CEP pipeline capable of handling complex workflows. Extensive experiments evaluate the system's performance across varying configurations, complexities, and video resolutions, revealing the trade-offs between functionality and latency. The results show that while higher agent counts and greater video complexity increase latency, the system maintains high consistency in narrative coherence. This research builds upon and contributes to existing novel approaches to distributed AI systems, offering detailed insights into integrating such systems into existing infrastructures.
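To make the pub/sub-plus-agents idea in the abstract concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes a toy in-memory broker in place of Kafka and plain Python functions in place of AutoGen LLM agents; the names `InMemoryBroker`, `make_agent`, `run_demo`, and the topic names are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Toy stand-in for a message broker such as Kafka (this is NOT the Kafka API)."""

    def __init__(self) -> None:
        # Maps each topic name to the handlers subscribed to it.
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver synchronously to every handler subscribed to the topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


def make_agent(
    broker: InMemoryBroker,
    name: str,
    in_topic: str,
    out_topic: str,
    transform: Callable[[str], str],
) -> None:
    """Register a hypothetical agent: consume in_topic, transform, publish to out_topic.

    In a real system the transform would be an LLM call; here it is a plain function.
    """

    def on_message(message: dict) -> None:
        result = dict(message)
        # Record which agents handled the message, in order.
        result["trace"] = message.get("trace", []) + [name]
        result["text"] = transform(message["text"])
        broker.publish(out_topic, result)

    broker.subscribe(in_topic, on_message)


def run_demo() -> List[dict]:
    broker = InMemoryBroker()
    results: List[dict] = []

    # Two chained agents: one describes a frame, one summarizes the description.
    make_agent(broker, "describer", "video.frames", "video.descriptions",
               lambda t: f"description of {t}")
    make_agent(broker, "summarizer", "video.descriptions", "video.events",
               lambda t: f"summary({t})")

    # A sink subscriber collects the final events.
    broker.subscribe("video.events", results.append)

    broker.publish("video.frames", {"text": "frame-001"})
    return results


if __name__ == "__main__":
    print(run_demo())
```

Each additional agent in such a chain adds one more hop between topics, which is one intuitive reason latency would grow with agent count, as the abstract reports.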

Paper Structure (35 sections, 21 figures, 3 tables)

Figures (21)

  • Figure 1: A General Publish/Subscribe Architecture
  • Figure 2: Proposed GenAI Broker Architecture - Adapted from saleh2023pubsub with Explicit Permission
  • Figure 3: One possible architecture of a system adhering to the neural pub/sub paradigm
  • Figure 4: A High-Level System Overview
  • Figure 5: Bi-directional communication between an AutoGen agent and external tools
  • ...and 16 more figures