ENCO: Life-Cycle Management of Enterprise-Grade Copilots

Yiwen Zhu, Mathieu Demarne, Kai Deng, Wenjing Wang, Nutan Sahoo, Divya Vermareddy, Hannah Lerner, Yunlei Lu, Swati Bararia, Anjali Bhavan, William Zhang, Xia Li, Katherine Lin, Miso Cilimdzic, Subru Krishnan

TL;DR

ENCO is a production-ready framework for building and managing enterprise copilots across their life cycle, combining diverse data sources with retrieval-augmented generation (RAG). It introduces NL2SearchQuery and a lightweight hierarchical agentic planner to support flexible, low-latency retrieval across incident records (IcM), troubleshooting guides (TSGs), and code, while preserving privacy and MLOps rigor. The architecture integrates offline preprocessing, DAG-based backend orchestration, and both offline and online evaluation, and it has seen broad adoption at Microsoft with substantial on-call efficiency gains. Reported benefits include faster incident triage, multi-turn interactions, and robust governance; ongoing work targets autonomous actions and feedback integration at scale.

Abstract

Software engineers frequently struggle to access disparate documentation and telemetry data, including troubleshooting guides (TSGs), incident reports, code repositories, and various internal tools developed by multiple stakeholders. On-call duties are inevitable, and incident resolution becomes even more daunting when legacy sources are obscure and time constraints are strict. To improve the efficiency of on-call engineers (OCEs) and streamline their daily workflows, we introduced ENCO, a comprehensive framework for developing, deploying, and managing enterprise-grade copilots tailored to improve productivity in engineering routines. This paper details the design and implementation of the ENCO framework, emphasizing its NL2SearchQuery functionality and its lightweight agentic framework. These features support efficient, customized retrieval-augmented generation (RAG) algorithms that not only extract relevant information from diverse sources but also select the most pertinent skills in response to user queries. As a result, the copilot can address complex technical questions and provide seamless, automated access to internal resources. ENCO also incorporates a robust mechanism for converting unstructured incident logs into user-friendly, structured guides, bridging the documentation gap. Since its launch in September 2023, ENCO has been widely adopted, supporting tens of thousands of interactions and hundreds of monthly active users (MAU) across dozens of organizations within the company.

Paper Structure

This paper contains 22 sections, 14 figures, 4 tables.

Figures (14)

  • Figure 1: Monitoring dashboard for ENCO usage
  • Figure 2: Input and output of IcM Processor
  • Figure 3: Frontend-backend interactions
  • Figure 4: Backend orchestration
  • Figure 5: Memory management
  • ...and 9 more figures