IEMAS: An Incentive-Efficiency Routing Framework for Open Agentic Web Ecosystems

Hongze Liu, Chang Guo, Yingzeng Li, Mengru Wang, Jiong Lou, Shijing Yuan, Hefeng Zhou, Chentao Wu, Jie LI

Abstract

The transition to open, distributed Multi-Agent Systems (MAS) promises scalable intelligence but introduces a non-trivial tension: maximizing global efficiency requires cooperative, resource-aware scheduling, yet autonomous agents may be self-interested and cannot be managed by a centralized controller. Prior approaches fall short in two key areas: they typically focus on single-query routing, neglecting long-term resource reuse (e.g., KV caching) and the complexities of system-level many-to-many matching; furthermore, they rely on generic incentive mechanisms that ignore the distinct characteristics of LLM inference. To bridge this gap, we propose IEMAS (Incentive-Efficiency Mechanism for Multi-Agent Systems), a distributed framework that aligns economic incentives with system performance. IEMAS integrates a probabilistic predictive model to estimate Quality of Service (QoS) under uncertainty, which feeds into a VCG-based bipartite matching mechanism. This design guarantees truthful capability reporting and social optimality while explicitly leveraging KV cache affinity to minimize computational redundancy. We implement IEMAS on top of vLLM and evaluate it via extensive simulations. Results demonstrate that our incentive-efficiency co-design reduces average service cost by 35% and end-to-end latency by up to 2.9× compared to baselines.

Paper Structure

This paper contains 34 sections, 3 theorems, 17 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Theorem 4.1

The assignment $x^*$ produced by the Min-Cost Max-Flow (MCMF) algorithm in the IEMAS flow network maximizes the total social welfare $W(\mathcal{C})$ subject to capacity constraints.
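To make the statement concrete, here is a minimal sketch of a welfare-maximizing assignment computed by min-cost flow. It is not the paper's network construction, which is not shown here. It assumes integer welfare weights `welfare[i][j]` (client value minus agent cost for task `i` on agent `j`), per-agent capacities, and a zero-cost bypass edge from each task to the sink so that a task with negative welfare can stay unassigned. The names `MinCostFlow` and `max_welfare_assignment` are hypothetical.

```python
from collections import deque

class MinCostFlow:
    """Successive shortest paths with a queue-based Bellman-Ford (handles negative costs)."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge = [to, residual_cap, cost, index_of_reverse]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        total_flow, total_cost = 0, 0
        while True:
            dist = [float("inf")] * self.n
            prev = [None] * self.n          # (parent node, edge index) on the shortest path
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for idx, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, idx)
                        if not in_queue[v]:
                            queue.append(v)
                            in_queue[v] = True
            if prev[t] is None:             # sink unreachable: flow is maximal
                return total_flow, total_cost
            push, v = float("inf"), t       # bottleneck capacity along the path
            while v != s:
                u, idx = prev[v]
                push = min(push, self.graph[u][idx][1])
                v = u
            v = t
            while v != s:                   # update residual capacities
                u, idx = prev[v]
                edge = self.graph[u][idx]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            total_flow += push
            total_cost += push * dist[t]

def max_welfare_assignment(welfare, capacity):
    """Return ({task: agent}, total welfare); welfare[i][j] are assumed integer weights."""
    n_tasks, n_agents = len(welfare), len(capacity)
    s, t = 0, n_tasks + n_agents + 1
    net = MinCostFlow(t + 1)
    for i in range(n_tasks):
        net.add_edge(s, 1 + i, 1, 0)
        net.add_edge(1 + i, t, 1, 0)        # assumed bypass: task stays unassigned, welfare 0
        for j in range(n_agents):
            net.add_edge(1 + i, 1 + n_tasks + j, 1, -welfare[i][j])
    for j in range(n_agents):
        net.add_edge(1 + n_tasks + j, t, capacity[j], 0)
    _, cost = net.solve(s, t)
    assignment = {}
    for i in range(n_tasks):
        for v, cap, _, _ in net.graph[1 + i]:
            if 1 + n_tasks <= v < t and cap == 0:   # saturated task-to-agent edge
                assignment[i] = v - 1 - n_tasks
    return assignment, -cost

assignment, total = max_welfare_assignment([[10, 4], [9, 1], [-2, -3]], [1, 2])
print(assignment, total)
```

In this toy instance a greedy choice would send task 0 to agent 0 (welfare 10) and leave task 1 with agent 1 (welfare 1), for a total of 11. The flow solution instead swaps the two, and the third task, whose welfare is negative on both agents, takes the bypass edge.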

Figures (8)

  • Figure 1: The Illustration of Agentic Web Routing.
  • Figure 2: IEMAS Overview. (a) Coarse-Grained Clustering: Incoming web queries are first allocated to specific Agent Hubs via a fast, domain-based clustering mechanism. (b) Predictive Auction: A proxy layer utilizes predictive modeling to generate uncertainty-aware bids/asks and executes an auction to match tasks to agents under capacity constraints. (c) Optimization: The allocation is solved as a Min-Cost Max-Flow (MCMF) problem to maximize social welfare based on truthful bidding.
  • Figure 3: The Predictive Model for QoS Factors on the CoQA Dataset.
  • Figure 4: Social Welfare Comparison.
  • Figure 5: The Utility under Different Auction Strategies in VCG Auction.
  • ...and 3 more figures

Theorems & Definitions (6)

  • Theorem 4.1: Allocative Efficiency
  • proof
  • Theorem 4.2: Dominant Strategy Incentive Compatibility for Clients
  • proof
  • Theorem 4.3: Weak Budget Balance
  • proof
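
The three properties listed above can be illustrated on a toy instance. The sketch below uses the textbook Clarke pivot rule for client payments, which may differ in detail from the paper's payment rule, and it replaces the MCMF solver with a brute-force search that is only practical for very small inputs. It further assumes that each agent is reimbursed exactly its serving cost. The names `best_assignment` and `vcg_payments`, and all numbers, are hypothetical.

```python
from itertools import product

def best_assignment(values, costs, capacity, excluded=None):
    """Brute-force stand-in for the welfare maximizer: maximize sum of (value - cost)."""
    n_tasks, n_agents = len(values), len(capacity)
    options = list(range(n_agents)) + [None]      # None means the client is left unserved
    best_w, best_x = float("-inf"), None
    for x in product(options, repeat=n_tasks):
        if excluded is not None and x[excluded] is not None:
            continue                              # the excluded client must stay unserved
        if any(x.count(j) > capacity[j] for j in range(n_agents)):
            continue                              # capacity constraint violated
        w = sum(values[i][a] - costs[i][a] for i, a in enumerate(x) if a is not None)
        if w > best_w:
            best_w, best_x = w, x
    return best_w, best_x

def vcg_payments(values, costs, capacity):
    """Clarke pivot payments: each served client pays the externality it imposes on others."""
    w_star, x_star = best_assignment(values, costs, capacity)
    payments = {}
    for i, a in enumerate(x_star):
        if a is None:
            continue
        w_without_i, _ = best_assignment(values, costs, capacity, excluded=i)
        # Welfare of everyone else under x*, including the cost of serving client i.
        others_welfare = w_star - values[i][a]
        payments[i] = w_without_i - others_welfare
    return x_star, payments

values = [[12, 6], [10, 2]]   # reported value of client i for agent j (hypothetical)
costs = [[2, 2], [1, 1]]      # cost for agent j to serve client i (hypothetical)
x_star, payments = vcg_payments(values, costs, [1, 1])
print(x_star, payments)

# Weak budget balance in this toy case: every payment covers the serving cost.
assert all(payments[i] >= costs[i][x_star[i]] for i in payments)
```

Here client 1 is served by agent 0 and pays 7 against a value of 10, for a utility of 3. If client 1 instead under-reported its value for agent 0 as 5, the allocation would move it to agent 1, where its true utility drops to 1, which is the kind of deviation that dominant-strategy incentive compatibility rules out.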