AgenticCyber: A GenAI-Powered Multi-Agent System for Multimodal Threat Detection and Adaptive Response in Cybersecurity

Shovan Roy

TL;DR

AgenticCyber is a GenAI-powered multi-agent framework that fuses cloud logs, video, and audio for real-time multimodal threat detection and adaptive response. The system uses Gemini for cross-modal reasoning and LangChain to orchestrate specialized agents, reporting a 96.2% F1-score in threat detection and a response latency of 420 ms. Key contributions include an attention-based fusion mechanism within a POMDP-guided orchestration, and a reported 65% reduction in mean time to respond (MTTR) on benchmark datasets (AWS CloudTrail logs, UCF-Crime video frames, and UrbanSound8K audio clips). The work demonstrates a scalable, modular approach to proactive cybersecurity across enterprise networks and IoT ecosystems, replacing siloed security technologies with cross-modal reasoning and automated remediation.

Abstract

The increasing complexity of cyber threats in distributed environments demands advanced frameworks for real-time detection and response across multimodal data streams. This paper introduces AgenticCyber, a generative-AI-powered multi-agent system that orchestrates specialized agents to monitor cloud logs, surveillance videos, and environmental audio concurrently. The solution achieves a 96.2% F1-score in threat detection, reduces response latency to 420 ms, and enables adaptive security posture management using multimodal language models such as Google's Gemini, coupled with LangChain for agent orchestration. Evaluations on benchmark datasets, such as AWS CloudTrail logs, UCF-Crime video frames, and UrbanSound8K audio clips, show better performance than standard intrusion detection systems, reducing mean time to respond (MTTR) by 65% and improving situational awareness. This work introduces a scalable, modular, proactive cybersecurity architecture for enterprise networks and IoT ecosystems that overcomes siloed security technologies with cross-modal reasoning and automated remediation.
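
This summary does not spell out how the attention-based fusion is computed. As a rough illustration only, the sketch below shows one common way such a fusion could work: each modality agent emits a threat score in [0, 1], a softmax over per-modality relevance logits produces attention weights, and the fused score is the weighted sum. The function names (`softmax`, `fuse_threat_scores`), the modality keys, and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_threat_scores(scores, relevance):
    """Combine per-modality threat scores with attention weights.

    scores:    dict of modality name -> threat score in [0, 1] (assumed agent output)
    relevance: dict of modality name -> unnormalized relevance logit (assumed)
    Returns (fused_score, weights) where weights sum to 1.
    """
    modalities = sorted(scores)
    weights = softmax([relevance[m] for m in modalities])
    fused = sum(w * scores[m] for w, m in zip(weights, modalities))
    return fused, dict(zip(modalities, weights))


# Hypothetical scores from the log, video, and audio agents.
scores = {"logs": 0.9, "video": 0.3, "audio": 0.6}

# Equal relevance reduces to a plain average of the three scores.
uniform_fused, uniform_weights = fuse_threat_scores(
    scores, {"logs": 0.0, "video": 0.0, "audio": 0.0}
)

# A higher logit for logs pulls the fused score toward the log agent's score.
log_heavy_fused, log_heavy_weights = fuse_threat_scores(
    scores, {"logs": 2.0, "video": 0.0, "audio": 0.0}
)
```

In a learned system the relevance logits would typically come from a trained scoring function over modality embeddings rather than fixed constants; the fixed values here only make the weighting behavior easy to see.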

Paper Structure

This paper contains 21 sections, 1 equation, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: AgenticCyber architecture, depicting agent interactions, data flows, and GenAI integration via LangChain chains.
  • Figure 2: Threat Model
  • Figure 3: Ablation study: F1-score and latency across variants (with/without fusion, GenAI) under varying loads.