Teacher-Student Learning on Complexity in Intelligent Routing

Shu-Ting Pi; Michael Yang; Yuying Zhu; Qun Liu

TL;DR

To route customer contacts efficiently, the paper introduces a teacher-student framework: a teacher model labels complexity from post-contact transcripts, and a student model predicts complexity from pre-contact data to guide routing. The teacher combines three metrics, $L$ (length), $H$ (uncertainty/entropy), and $S$ (skillfulness), into a complexity score $Q = T^U_G(w L^N + H^N + S^N)$, where the superscript $N$ denotes normalization, $w=2$ is the weight on length, and the quantile transform $T^U_G$ maps the weighted sum to a uniform distribution on $[0,1]$. A student model trained on over 100 pre-contact features predicts high complexity before the interaction begins; adding entity embeddings improves recall from $0.05$ to $0.28$ and precision from $0.54$ to $0.56$. The paper also introduces Complexity AUC, derived from dual transformations of complexity distributions, as a statistical measure of routing effectiveness across groups. Experiments report reductions in transfers (~53%), multi-transfers (>95%), and handle time (~13%), demonstrating practical impact and supporting bottleneck analysis.
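The scoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes min-max scaling for the normalization, places the weight $w$ on length (as the Figure 2 caption describes), assumes that larger values of each metric mean higher complexity, and uses an empirical rank-based quantile transform. All function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def min_max_normalize(values):
    """Scale values linearly to [0, 1] (assumed form of the normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quantile_to_uniform(values):
    """Map each value to its empirical quantile in [0, 1], averaging ranks for ties."""
    n = len(values)
    if n == 1:
        return [0.5]
    ordered = sorted(values)
    result = []
    for v in values:
        first = bisect_left(ordered, v)       # first index holding v
        last = bisect_right(ordered, v) - 1   # last index holding v
        result.append(((first + last) / 2) / (n - 1))
    return result


def complexity_scores(lengths, entropies, skills, w=2.0):
    """Weighted sum of normalized metrics, then a quantile transform to [0, 1]."""
    L = min_max_normalize(lengths)
    H = min_max_normalize(entropies)
    S = min_max_normalize(skills)
    raw = [w * l + h + s for l, h, s in zip(L, H, S)]
    return quantile_to_uniform(raw)


# Illustrative (made-up) metric values for four contacts.
scores = complexity_scores(
    lengths=[10, 50, 30, 20],
    entropies=[0.2, 0.9, 0.5, 0.1],
    skills=[1, 4, 3, 2],
)
print(scores)
```

Because the final step depends only on ranks, the output is spread evenly over $[0,1]$ regardless of how the raw weighted sum is distributed; the weight $w$ matters only through its effect on the ordering of contacts.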

Abstract

Customer service is often the most time-consuming aspect for e-commerce websites, with each contact typically taking 10-15 minutes. Effectively routing customers to appropriate agents without transfers is therefore crucial for e-commerce success. To this end, we have developed a machine learning framework that predicts the complexity of customer contacts and routes them to appropriate agents accordingly. The framework consists of two parts. First, we train a teacher model to score the complexity of a contact based on the post-contact transcripts. Then, we use the teacher model as a data annotator to provide labels to train a student model that predicts the complexity based on pre-contact data only. Our experiments show that such a framework is successful and can significantly improve customer experience. We also propose a useful metric called complexity AUC that evaluates the effectiveness of customer service at a statistical level.

Paper Structure (16 sections, 4 equations, 5 figures)


Introduction
Teacher Model
Hypothesis of Complexity
The Dataset
Complexity Score
Validation
Student Model
Model Training
Experimental Results
Distributions of Complexity Score
Routing Metrics
Complexity AUC
Dual Transformation
Area Under Curve
Experimental Results
...and 1 more section

Figures (5)

Figure 1: (a) We demonstrate the KL divergence boosting function with 60 trees for two different examples. The function decays faster for example 1 than for example 2, resulting in a smaller integral or skillfulness. (b)-(d) The skillfulness, entropy, and sentence length (all normalized to 1) distributions of 500K contacts are shown. Higher complexity contacts are represented by the red regions, while lower complexity contacts are represented by the green regions.
Figure 2: (a)-(c) We present the distribution of the sum of the three complexity measures using various weights on length. (d) We convert the distribution from (c) into a uniform distribution using quantile transformation. (e) The probability of finding high/medium/low complexity at different scores is shown. Over 400 ground truth labels were generated by senior agents. The dashed lines represent true numbers, while the solid lines are curves fitted using polynomial functions. The blue line represents low complexity, the red line represents medium complexity, and the green line represents high complexity.
Figure 3: (a)-(c) We show the distribution of complexity scores for the background group, control group, and treatment group, respectively. (d)-(f) The dual transformation curve is displayed for the (background, control), (background, treatment), and (control, treatment) pairs used as the benchmark and target groups.
Figure 5.1: Below is a synthetic example conversation transcript created by a researcher and not sourced from real interactions. The highlighted sentence represents the primary issue raised by the customer. This transcript has been labeled with "Music Related Service" as a unique identifier based on the primary question.
Figure 5.2: The Complexity AUC is displayed for various product lines or services. The red dashed line represents an AUC of 0.5. Values above the dashed line indicate that customers are served less effectively than average.
