KIMAs: A Configurable Knowledge Integrated Multi-Agent System

Zitao Li, Fei Wei, Yuexiang Xie, Dawei Gao, Weirui Kuang, Zhijian Ma, Bingchen Qian, Yaliang Li, Bolin Ding

TL;DR

KIMAs tackles the challenges of deploying knowledge-intensive QA with LLMs in real-world settings by introducing a configurable knowledge-integrated multi-agent system. The approach combines context-based query rewriting, efficient multi-source retrieval, and a parallelizable pipeline orchestrated by three core agents (context manager, retrieval, and summarizer), along with multiple routing and citation strategies. Key contributions include modular agent design, flexible knowledge-routing mechanisms (embedding clustering, manual mix-ins, score scaling), and a two-stage citation process that maintains trust without extra latency. The framework is demonstrated in three configurations (AgentScope QA, ModelScope QA, Olympic Bot on Weibo) to show robustness, scalability, and practical impact in diverse knowledge environments.

Abstract

Knowledge-intensive conversations supported by large language models (LLMs) have become one of the most popular and helpful applications, assisting people in many different ways. Many current knowledge-intensive applications are centered on retrieval-augmented generation (RAG) techniques. While many open-source RAG frameworks facilitate the development of RAG-based applications, they often fall short in practical scenarios complicated by data that is heterogeneous in topic and format, by conversational context management, and by the requirement of low-latency responses. This technical report presents a configurable knowledge integrated multi-agent system, KIMAs, to address these challenges. KIMAs features a flexible and configurable system for integrating diverse knowledge sources with 1) context management and query rewrite mechanisms to improve retrieval accuracy and multi-turn conversational coherence, 2) efficient knowledge routing and retrieval, 3) simple but effective filter and reference generation mechanisms, and 4) optimized parallelizable multi-agent pipeline execution. Our work provides a scalable framework for advancing the deployment of LLMs in real-world settings. To show how KIMAs can help developers build knowledge-intensive applications with different scales and emphases, we demonstrate how we configure the system for three applications that are already running in practice with reliable performance.
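
The pipeline described above (query rewriting, retrieval from several sources, then summarization with references) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the keyword-overlap retrieval, and the string-concatenation rewrite are hypothetical stand-ins, not the KIMAs API or its actual agents. The only point is to show how per-source retrieval calls can run concurrently between the rewrite and summarize steps.

```python
import asyncio


async def rewrite_query(history, query):
    # Hypothetical stand-in for an LLM-based rewrite: prepend the last turn as context.
    context = history[-1] if history else ""
    return f"{context} {query}".strip()


async def retrieve(source_name, corpus, query):
    # Hypothetical stand-in for a retrieval agent: keep documents sharing a word with the query.
    await asyncio.sleep(0)  # yield control, as a real I/O-bound retrieval call would
    terms = set(query.lower().split())
    return [(source_name, doc) for doc in corpus if terms & set(doc.lower().split())]


async def summarize(query, chunks):
    # Hypothetical stand-in for the summarizer: emit numbered references only.
    refs = [f"[{i}] {src}: {doc}" for i, (src, doc) in enumerate(chunks, 1)]
    return {"query": query, "references": refs}


async def answer(history, query, sources):
    rewritten = await rewrite_query(history, query)
    # Retrieval over all sources runs concurrently; gather preserves the input order.
    results = await asyncio.gather(
        *(retrieve(name, corpus, rewritten) for name, corpus in sources.items())
    )
    chunks = [item for group in results for item in group]
    return await summarize(rewritten, chunks)


# Toy knowledge sources, invented for this example.
sources = {
    "tutorials": ["how to install agentscope", "building a chat agent"],
    "api_docs": ["agentscope init parameters"],
}
result = asyncio.run(answer(["I am using agentscope"], "how to install", sources))
```

In this toy run the rewritten query carries the earlier turn, so the `api_docs` source also contributes a match even though the raw query never mentions AgentScope.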

Paper Structure

This paper contains 26 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: KIMAs system with agentive modularization and configurable pipeline.
  • Figure 2: Query rewrite mechanisms in KIMAs.
  • Figure 3: A simple visualization of the routing mechanism. Because the query embedding is closer to centroids in Agent A's knowledge domain, Agent A is activated to conduct knowledge retrieval.
  • Figure 4: Use case in the Q&A group of AgentScope.
  • Figure 5: Three demonstration QA pairs using different knowledge resources in https://www.modelscope.cn/studios/AI-ModelScope/modelscope_copilot_beta.
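
The routing idea in the Figure 3 caption (pick the agent whose knowledge-domain centroids lie closest to the query embedding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: cosine similarity as the closeness measure, the `route_query` function, the agent names, and the 2-D toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length, non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def route_query(query_emb, agent_centroids, top_k=1):
    # Score each agent by its best-matching centroid (assumed scoring rule).
    scores = {
        name: max(cosine(query_emb, c) for c in centroids)
        for name, centroids in agent_centroids.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Toy 2-D centroids, invented for this example; real embeddings are high-dimensional.
centroids = {
    "agent_a": [[1.0, 0.0], [0.9, 0.1]],
    "agent_b": [[0.0, 1.0], [-0.2, 0.9]],
}
selected, scores = route_query([0.8, 0.2], centroids)
```

Here the query vector points mostly along the first axis, so `agent_a` is selected. Raising `top_k` would activate several retrieval agents for queries that sit between domains.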