CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs
Jingzhe Shi, Jialuo Li, Qinwei Ma, Zaiwen Yang, Huan Ma, Lei Li
TL;DR
This work tackles safe, cost-conscious customer service with LLMs by integrating user profiles and system APIs through CHOPS, a classifier-executor-verifier framework. It introduces the CPHOS-dataset, consisting of a database, PDF-based guides, and QA pairs derived from real-world CPHOS interactions, to enable evaluation of LLM-based customer-service workflows. Experiments show that CHOPS achieves high accuracy (up to $>98\%$ on key metrics) at favorable cost compared to end-to-end LLM baselines, especially when using a 2-level classifier and a verifier; mixing LLM backbones (weaker classifiers and verifiers with a stronger executor) yields practical performance-cost trade-offs. The approach offers a scalable path to deploying LLM-driven customer service within existing systems, with safeguards against harmful operations and efficient use of resources, and points to broader applicability beyond the Olympiad domain through expanded datasets and tools.
Abstract
Businesses and software platforms are increasingly turning to Large Language Models (LLMs) such as GPT-3.5, GPT-4, GLM-3, and LLaMa-2 for chat assistance with file access or as reasoning agents for customer service. However, current LLM-based customer service models have limited integration with customer profiles and lack the operational capabilities necessary for effective service. Moreover, existing API integrations emphasize diversity over the precision and error avoidance essential in real-world customer service scenarios. To address these issues, we propose an LLM agent named CHOPS (CHat with custOmer Profile in existing System), designed to: (1) efficiently use existing databases or systems to access user information or to interact with these systems following existing guidelines; (2) provide accurate and reasonable responses or carry out required operations in the system while avoiding harmful operations; and (3) leverage a combination of small and large LLMs to achieve satisfactory performance at a reasonable inference cost. We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students. We have conducted extensive experiments to validate the performance of our proposed CHOPS architecture using the CPHOS-dataset, with the aim of demonstrating how LLMs can enhance or serve as alternatives to human customer service. Code for our proposed architecture and dataset can be found at https://github.com/JingzheShi/CHOPS.
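The classifier-executor-verifier flow named above can be pictured with a minimal sketch. This is not the authors' implementation: the `Pipeline` class, the route labels, the prompt formats, the `ACCEPT`/`REJECT` convention, and the toy stand-in functions are all hypothetical, chosen only to show how a cheap classifier and verifier might wrap a stronger executor and block harmful operations.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: an "LLM" is modeled as any function from prompt text to reply text.
LLM = Callable[[str], str]


@dataclass
class Pipeline:
    classifier: LLM       # could be a small, cheap model
    executor: LLM         # could be a larger, stronger model
    verifier: LLM         # could be a small, cheap model
    max_retries: int = 2  # hypothetical retry budget

    def answer(self, query: str, profile: dict) -> Optional[str]:
        # Step 1: the classifier picks a route (hypothetical labels).
        route = self.classifier(f"Route this query: {query}").strip()
        feedback = ""
        for _ in range(self.max_retries + 1):
            # Step 2: the executor drafts a reply or an operation.
            draft = self.executor(
                f"route={route}\nprofile={profile}\nquery={query}\n{feedback}"
            )
            # Step 3: the verifier accepts the draft or rejects it with a reason.
            verdict = self.verifier(f"query={query}\nproposal={draft}")
            if verdict.strip().upper().startswith("ACCEPT"):
                return draft
            feedback = f"previous attempt rejected: {verdict}"
        return None  # no accepted draft: escalate to a human agent


# Toy stand-ins for real models, for illustration only.
def toy_classifier(prompt: str) -> str:
    return "db_update" if "change" in prompt.lower() else "guide_lookup"


def toy_executor(prompt: str) -> str:
    if "route=db_update" in prompt:
        return "UPDATE users SET school='B' WHERE id=7"
    return "See the guide, section on registration."


def toy_verifier(prompt: str) -> str:
    proposal = prompt.split("proposal=", 1)[1].upper()
    if "DROP" in proposal or "DELETE" in proposal:
        return "REJECT: destructive operation"
    return "ACCEPT"
```

In this sketch the verifier is the only gate between the executor's proposal and the system, so a rejected draft never reaches the database; returning `None` after the retry budget is spent is one possible way to hand the conversation to a human.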
