DuCCAE: A Hybrid Engine for Immersive Conversation via Collaboration, Augmentation, and Evolution

Xin Shen, Zhishu Jiang, Jiaye Yang, Haibo Liu, Yichen Wan, Jiarui Zhang, Tingzhi Dai, Luodong Xu, Shuchen Wu, Guanqiang QI, Chenxi Miao, Jiahui Liang, Yang Li, Weikang Li, Deguo Xia, Jizhou Huang

Abstract

Immersive conversational systems in production face a persistent trade-off between responsiveness and long-horizon task capability. Real-time interaction is achievable for lightweight turns, but requests involving planning and tool invocation (e.g., search and media generation) produce heavy-tailed execution latency that degrades turn-taking, persona consistency, and user trust. To address this challenge, we propose DuCCAE (Conversation while Collaboration with Augmentation and Evolution), a hybrid engine for immersive conversation deployed within Baidu Search, serving millions of users. DuCCAE decouples real-time response generation from asynchronous agentic execution and synchronizes the two via a shared state that maintains session context and execution traces, so that asynchronous results can be integrated back into the ongoing dialogue. The system orchestrates five subsystems (Info, Conversation, Collaboration, Augmentation, and Evolution) to support multi-agent collaboration and continuous improvement. We evaluate DuCCAE through a framework that combines offline benchmarking on the Du-Interact dataset with large-scale production evaluation within Baidu Search. Experimental results show that DuCCAE outperforms strong baselines in agentic execution reliability and dialogue quality while reducing latency to fit strict real-time budgets. Deployment metrics since June 2025 confirm real-world effectiveness: Day-7 user retention tripled to 34.2%, and the complex task completion rate rose to 65.2%. The hybrid architecture preserves conversational continuity while enabling reliable agentic execution, offering practical guidelines for deploying scalable agentic systems in industrial settings.
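
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the DuCCAE implementation: the names `SharedState`, `fast_track`, `slow_track`, and `run_session` are hypothetical, the replies are placeholder strings, and the sleep stands in for slow tool calls. The sketch only shows a real-time reply returning immediately while a background task writes its result into a shared state that a later turn picks up.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """Hypothetical shared state: session context plus execution traces."""
    context: list = field(default_factory=list)
    traces: list = field(default_factory=list)
    pending_results: list = field(default_factory=list)


async def slow_track(state: SharedState, request: str) -> None:
    """Stand-in for asynchronous agentic execution (planning, tool calls)."""
    state.traces.append(f"started: {request}")
    await asyncio.sleep(0.05)  # placeholder for heavy-tailed tool latency
    state.pending_results.append(f"result for '{request}'")
    state.traces.append(f"finished: {request}")


def fast_track(state: SharedState, user_turn: str) -> str:
    """Stand-in for real-time response generation."""
    state.context.append(user_turn)
    # Fold in any asynchronous results that completed since the last turn.
    ready, state.pending_results = state.pending_results, []
    reply = f"(quick reply to: {user_turn})"
    if ready:
        reply += " By the way: " + "; ".join(ready)
    return reply


async def run_session() -> list:
    state = SharedState()
    replies = []
    # Turn 1: a complex request. Reply right away; run the task in background.
    task = asyncio.create_task(slow_track(state, "plan a trip"))
    replies.append(fast_track(state, "Exhausted... Plan a trip"))
    await task  # in a live system the user would keep chatting meanwhile
    # Turn 2: the finished result is integrated into the ongoing dialogue.
    replies.append(fast_track(state, "Any ideas yet?"))
    return replies
```

Calling `asyncio.run(run_session())` returns two replies: the first contains only the quick response, and the second also carries the background result, because the shared state held it until the next turn.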

Paper Structure (42 sections, 1 equation, 7 figures, 3 tables)

Figures (7)

  • Figure 1: DuCCAE interface and interaction flow in production. Starting from a Baidu Search entry, the system maintains persona-consistent chat responses and supports escalation to real-time calling; complex requests trigger asynchronous collaboration and tool augmentation while preserving conversational continuity.
  • Figure 2: DuCCAE: an evolving agentic service engine for immersive conversational interaction. Info System converts multimodal signals into policy-aware context and manages memory. Conversation System acts as a low-latency gatekeeper for intent routing and renders persona-consistent responses. Collaboration System supports multi-agent execution for long-horizon tasks via planning and tool use. Augmentation System empowers agents with external tools, retrieval resources, and execution protocols. Evolution System drives continuous improvement through episode-based judging and post-training.
  • Figure 3: Runtime Execution Dataflow of a Mixed-Intent Interaction Episode. Processing the input "Exhausted... Plan a trip", the Fast Track (top) utilizes User Memory ("Hobby: Basketball") for an immediate empathetic inquiry. Simultaneously, the Slow Track (bottom) decomposes the travel request, demonstrating: (1) Parallel Execution of independent flight and activity searches to optimize latency; and (2) Constraint Fusion, which filters dining options via the memory constraint ("Dislikes Raw Fish") to select the compatible "Wagyu Beef".
  • Figure 4: Evolution of Key Business Metrics across System Iterations. Statistics are derived from large-scale online controlled experiments in Baidu Search. From V1 to V3, DuCCAE demonstrates consistent growth in: (a) User Stickiness; (b) Conversation Quality; (c) Interaction Depth; and (d) Agentic Capability.
  • Figure 5: Core system prompt in DuCCAE-V3 for intent routing and planning. The structured schema ensures strict compliance with the dual-track architecture and deterministic execution.
  • ...and 2 more figures
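
The two Slow Track behaviors named in the Figure 3 caption, parallel execution of independent searches and constraint fusion against user memory, can be sketched as follows. This is an illustrative assumption rather than the system's code: the search functions, their return values, the tag scheme, and the "Sashimi Platter" option are all made up; only the "Dislikes Raw Fish" constraint and the "Wagyu Beef" outcome come from the caption.

```python
import asyncio


async def search_flights() -> list:
    """Hypothetical tool call; the sleep stands in for network latency."""
    await asyncio.sleep(0.01)
    return ["flight option"]


async def search_activities() -> list:
    """Hypothetical tool call, independent of the flight search."""
    await asyncio.sleep(0.01)
    return ["activity option"]


def fuse_constraints(options: list, dislikes: set) -> list:
    """Keep only options whose tags do not overlap the user's dislikes."""
    return [o["name"] for o in options if not (o["tags"] & dislikes)]


async def plan_trip(memory_dislikes: set) -> dict:
    # Parallel execution: the two searches do not depend on each other,
    # so they are awaited together instead of one after the other.
    flights, activities = await asyncio.gather(
        search_flights(), search_activities()
    )
    # Illustrative dining candidates (made-up data).
    dining = [
        {"name": "Sashimi Platter", "tags": {"raw fish"}},
        {"name": "Wagyu Beef", "tags": {"beef"}},
    ]
    return {
        "flights": flights,
        "activities": activities,
        "dining": fuse_constraints(dining, memory_dislikes),
    }
```

With the memory constraint `{"raw fish"}`, `asyncio.run(plan_trip({"raw fish"}))` keeps only "Wagyu Beef" in the dining list, mirroring the example in the caption.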