REIC: RAG-Enhanced Intent Classification at Scale

Ziji Zhang, Michael Yang, Zhiyu Chen, Yingying Zhuang, Shu-Ting Pi, Qun Liu, Rajashekar Maragoud, Vy Nguyen, Anurag Beniwal

TL;DR

REIC addresses the scalability and taxonomy heterogeneity of large-scale customer-service intents by integrating retrieval-augmented generation (RAG) with a hierarchical intent ontology. It constructs a dense index of (query, intent) pairs, retrieves a small set of candidates for each query, and uses a fine-tuned LLM with constrained decoding to compute P(t|q,E), enabling dynamic updates without retraining. Empirical results show that REIC outperforms traditional fine-tuning and prompting baselines in in-domain settings, with strong robustness to unseen intents and favorable online deployment impact. The approach offers industry-ready benefits by reducing training needs, improving routing accuracy, and enabling rapid adaptation to evolving intent taxonomies.
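The retrieve-then-score flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors stand in for dense embeddings, the cosine-based softmax stands in for the fine-tuned LLM, and the names `INDEX`, `retrieve`, and `intent_probabilities` are hypothetical. Reading t as the intent, q as the query, and E as the retrieved examples is also an assumption made here. The sketch only shows the idea of restricting the probability mass to intents that appear among the retrieved candidates.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, index, k):
    """Return the k index entries most similar to the query vector."""
    ranked = sorted(index, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
    return ranked[:k]


def intent_probabilities(query_vec, examples):
    """Toy stand-in for P(t|q,E): a softmax over candidate intents only.

    Assumption: each candidate intent is scored by its best-matching
    retrieved example. In REIC the scores come from a fine-tuned LLM.
    """
    scores = {}
    for e in examples:
        s = cosine(query_vec, e["vec"])
        scores[e["intent"]] = max(s, scores.get(e["intent"], float("-inf")))
    z = sum(math.exp(s) for s in scores.values())
    return {intent: math.exp(s) / z for intent, s in scores.items()}


# Hypothetical index of (query, intent) pairs with made-up embeddings.
INDEX = [
    {"text": "where is my package", "intent": "order/tracking", "vec": [1.0, 0.0, 0.0]},
    {"text": "package has not arrived", "intent": "order/tracking", "vec": [0.9, 0.1, 0.0]},
    {"text": "I want my money back", "intent": "payment/refund", "vec": [0.0, 1.0, 0.0]},
    {"text": "cannot log in", "intent": "account/login", "vec": [0.0, 0.0, 1.0]},
]

query_vec = [0.8, 0.2, 0.0]  # made-up embedding of a tracking-style query
examples = retrieve(query_vec, INDEX, k=3)
probs = intent_probabilities(query_vec, examples)
best_intent = max(probs, key=probs.get)
print(best_intent, probs)
```

In this sketch, supporting a new intent only requires appending an entry to `INDEX`; nothing is retrained, which mirrors the dynamic-update property claimed in the summary.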

Abstract

Accurate intent classification is critical for efficient routing in customer service, ensuring customers are connected with the most suitable agents while reducing handling times and operational costs. However, as companies expand their product lines, intent classification faces scalability challenges due to the increasing number of intents and variations in taxonomy across different verticals. In this paper, we introduce REIC, a Retrieval-augmented generation Enhanced Intent Classification approach, which addresses these challenges effectively. REIC leverages retrieval-augmented generation (RAG) to dynamically incorporate relevant knowledge, enabling precise classification without the need for frequent retraining. Through extensive experiments on real-world datasets, we demonstrate that REIC outperforms traditional fine-tuning, zero-shot, and few-shot methods in large-scale customer service settings. Our results highlight its effectiveness in both in-domain and out-of-domain scenarios, demonstrating its potential for real-world deployment in adaptive and large-scale intent classification systems.

Paper Structure

This paper contains 21 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: We present the heterogeneous intent structure with representative examples, illustrating the intent label hierarchy in each vertical.
  • Figure 2: The proposed REIC method, from customer query to routing intent, leveraging vector retrieval and probability calculation.
  • Figure 3: Constrained decoding for probability calculation.
  • Figure 4: The accuracy and latency when using different retrieval top-k values.