Adaptive Data Flywheel: Applying MAPE Control Loops to AI Agent Improvement

Aaditya Shukla, Sidney Knowles, Meenakshi Madugula, Dave Farris, Ryan Angilly, Santiago Pombo, Anbang Xu, Lu An, Abhinav Balasubramanian, Tan Yu, Jiaxiang Ren, Rama Akkiraju

TL;DR

This paper presents a practical framework for continuously improving enterprise AI agents by applying a MAPE-K-based data flywheel to NVIDIA's NVInfo AI, a Mixture-of-Experts knowledge assistant serving over 30,000 employees. By monitoring user feedback and system performance, analyzing failure modes, planning targeted data curation and fine-tuning, and executing staged deployments, the authors demonstrate significant improvements: routing accuracy maintained at 96% with a 10x smaller model (8B vs 70B) and 70% latency reduction, plus a 3.7% gain in query rephrasal accuracy with 40% latency reduction. The work provides a repeatable blueprint leveraging NVIDIA NeMo microservices and LoRA-based PEFT to convert real-world usage into self-improving AI agents while addressing privacy, scalability, and deployment challenges. These results underscore the viability of closed-loop, feedback-driven learning in production enterprise GenAI, enabling faster, cost-effective improvements without full retraining. The framework offers practical guidance for building modular, robust, and privacy-conscious adaptive AI systems at scale.

Abstract

Enterprise AI agents must continuously adapt to maintain accuracy, reduce latency, and remain aligned with user needs. We present a practical implementation of a data flywheel in NVInfo AI, NVIDIA's Mixture-of-Experts (MoE) Knowledge Assistant serving over 30,000 employees. By operationalizing a MAPE-driven data flywheel, we built a closed-loop system that systematically addresses failures in retrieval-augmented generation (RAG) pipelines and enables continuous learning. Over a 3-month post-deployment period, we monitored feedback and collected 495 negative samples. Analysis revealed two major failure modes: routing errors (5.25%) and query rephrasal errors (3.2%). Using NVIDIA NeMo microservices, we implemented targeted improvements through fine-tuning. For routing, we replaced a Llama 3.1 70B model with a fine-tuned 8B variant, achieving 96% accuracy, a 10x reduction in model size, and 70% latency improvement. For query rephrasal, fine-tuning yielded a 3.7% gain in accuracy and a 40% latency reduction. Our approach demonstrates how human-in-the-loop (HITL) feedback, when structured within a data flywheel, transforms enterprise AI agents into self-improving systems. Key learnings include approaches to ensure agent robustness despite limited user feedback, navigating privacy constraints, and executing staged rollouts in production. This work offers a repeatable blueprint for building robust, adaptive enterprise AI agents capable of learning from real-world usage at scale.
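
To make the Monitor, Analyze, Plan, Execute cycle described in the abstract concrete, the following is a minimal Python sketch of one pass through such a loop. It is an illustration under stated assumptions, not the NVInfo AI implementation: the `Feedback` record, the failure-mode labels, the `min_samples` threshold, and the `fine-tune:` action strings are all hypothetical names introduced here, and no NeMo microservice API is called.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feedback:
    """Hypothetical feedback record; the real schema is not given in the source."""
    query: str
    thumbs_up: bool
    failure_mode: Optional[str] = None  # assumed label, e.g. "routing" or "rephrasal"


def monitor(log: List[Feedback]) -> List[Feedback]:
    """Monitor: keep only the negative feedback samples."""
    return [f for f in log if not f.thumbs_up]


def analyze(negatives: List[Feedback]) -> Counter:
    """Analyze: count negative samples per labelled failure mode."""
    return Counter(f.failure_mode for f in negatives if f.failure_mode)


def plan(mode_counts: Counter, min_samples: int = 2) -> List[str]:
    """Plan: target failure modes that have at least `min_samples` examples."""
    return [mode for mode, n in mode_counts.most_common() if n >= min_samples]


def execute(targets: List[str]) -> List[str]:
    """Execute: emit one placeholder fine-tuning action per targeted mode."""
    return [f"fine-tune:{mode}" for mode in targets]


def run_cycle(log: List[Feedback]) -> List[str]:
    """Run one Monitor -> Analyze -> Plan -> Execute pass over a feedback log."""
    return execute(plan(analyze(monitor(log))))
```

In this sketch each stage is a pure function, so a stage can be swapped out (for example, replacing `execute` with a call into a real fine-tuning and staged-rollout pipeline) without touching the others.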

Paper Structure

This paper contains 51 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Adaptive Data Flywheel Architecture showing the MAPE control loop implementation for AI agent improvement
  • Figure 2: NVInfo AI Mixture of Experts Architecture showing the complete RAG pipeline with Router, seven specialized domain experts, query rephrasing, retrieval, reranking, answer generation, and citation generation components
  • Figure 3: NVInfo AI Response and Feedback Capture Architecture showing the complete data collection, ingestion and transformation components
  • Figure 4: Sequential failure points in the RAG pipeline from query routing to answer generation
  • Figure 5: Representative NVInfo AI interface examples showing mixture-of-experts responses across different enterprise domains