Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG

Weixi Lin

TL;DR

The results demonstrate that by optimizing the entire retrieval lifecycle, from pre-retrieval query rewriting to post-retrieval corrective evaluation, the Higress RAG system offers a scalable, hallucination-resistant solution for enterprise AI deployment.

Abstract

The integration of Large Language Models (LLMs) into enterprise knowledge management systems has been catalyzed by the Retrieval-Augmented Generation (RAG) paradigm, which augments parametric memory with non-parametric external data. However, the transition from proof-of-concept to production-grade RAG systems is hindered by three persistent challenges: low retrieval precision for complex queries, high rates of hallucination in the generation phase, and unacceptable latency for real-time applications. This paper presents a comprehensive analysis of the Higress RAG MCP Server, a novel, enterprise-centric architecture designed to resolve these bottlenecks through a "Full-Link Optimization" strategy. Built upon the Model Context Protocol (MCP), the system introduces a layered architecture that orchestrates a pipeline of Adaptive Routing, Semantic Caching, Hybrid Retrieval, and Corrective RAG (CRAG). We detail the technical implementation of key innovations, including the Higress-Native Splitter for structure-aware data ingestion, the application of Reciprocal Rank Fusion (RRF) for merging dense and sparse retrieval signals, and a Semantic Caching mechanism with 50 ms latency and dynamic thresholding. Experimental evaluations on domain-specific Higress technical documentation and blogs verify the system's architectural robustness. The results demonstrate that by optimizing the entire retrieval lifecycle, from pre-retrieval query rewriting to post-retrieval corrective evaluation, the Higress RAG system offers a scalable, hallucination-resistant solution for enterprise AI deployment.
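To make the fusion step concrete, the following is a minimal sketch of Reciprocal Rank Fusion over one dense and one sparse ranking. It uses the standard RRF formula, in which each document scores the sum of 1 / (k + rank) across rankings. The function name `reciprocal_rank_fusion`, the document identifiers, and the constant k = 60 are illustrative assumptions and are not taken from the Higress implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids with standard RRF.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a commonly used default; assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a dense (vector) and a sparse (keyword) retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Because RRF works only on ranks, it does not require the dense and sparse scores to be on a comparable scale, which is one common reason for choosing it in hybrid retrieval.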

Paper Structure (35 sections, 2 equations, 5 figures, 2 tables)

Introduction
The Imperative of Retrieval-Augmented Generation
Limitations of Standard RAG Architectures
The Higress Solution: Full-Link Optimization
Related Work and Theoretical Foundations
Evolution of RAG Systems
The Model Context Protocol (MCP)
Corrective Retrieval Augmented Generation (CRAG)
Hypothetical Document Embeddings (HyDE)
System Architecture
Layer 1: The MCP Server Layer
Layer 2: The RAG Client Layer
Layer 3: Enhanced RAG Components Layer
Layer 4: Infrastructure Layer
Key Technologies and Methodologies
...and 20 more sections

Figures (5)

Figure 1: The Layered Architecture of Higress-RAG MCP Server. It illustrates the separation of concerns between the MCP Server Layer, RAG Client Layer, and Enhanced Components.
Figure 2: The Holistic Optimization Pipeline. The process flows dynamically from Semantic Caching to Adaptive Routing, followed by Hybrid Retrieval and Corrective Evaluation.
Figure 3: The Corrective RAG (CRAG) Decision Logic. The evaluator classifies retrieved context confidence to trigger refinement, web search, or direct generation.
Figure 4: Performance Comparison on Higress Blog Datasets. The optimized system achieved >90% recall, significantly outperforming the naive baseline in retrieving relevant community knowledge.
Figure 5: Performance Comparison on Higress Official Docs Datasets. The integration of Boost Ranker effectively filtered outdated versions, boosting the factuality score for configuration-centric queries.
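The decision logic described in the Figure 3 caption can be sketched as a three-way router over an evaluator's confidence score. This is only an illustration: the thresholds `upper` and `lower`, the names `Action` and `crag_route`, and the mapping of confidence bands to actions (high to direct generation, middle to refinement, low to web search) are assumptions, since the caption does not specify them.

```python
from enum import Enum


class Action(Enum):
    DIRECT_GENERATION = "direct_generation"
    REFINE = "refine"
    WEB_SEARCH = "web_search"


def crag_route(confidence, upper=0.7, lower=0.3):
    """Map a retrieval-confidence score in [0, 1] to a corrective action.

    The thresholds and the band-to-action mapping are hypothetical.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= upper:
        # Retrieved context looks reliable: generate from it directly.
        return Action.DIRECT_GENERATION
    if confidence <= lower:
        # Retrieved context looks unreliable: fall back to web search.
        return Action.WEB_SEARCH
    # Ambiguous middle band: refine the retrieved context before generation.
    return Action.REFINE


for score in (0.9, 0.5, 0.1):
    print(score, crag_route(score).value)
```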
