TnT-LLM: Text Mining at Scale with Large Language Models

Mengting Wan; Tara Safavi; Sujay Kumar Jauhar; Yujin Kim; Scott Counts; Jennifer Neville; Siddharth Suri; Chirag Shah; Ryen W White; Longqi Yang; Reid Andersen; Georg Buscher; Dhruv Joshi; Nagu Rangan

TL;DR

TnT-LLM is a two-phase framework that uses LLMs to automate end-to-end label generation and assignment with minimal human effort for a given use case. It generates more accurate and relevant label taxonomies than state-of-the-art baselines.

Abstract

Transforming unstructured text into structured and meaningful forms, organized by useful category labels, is a fundamental step in text mining for downstream analysis and application. However, most existing methods for producing label taxonomies and building text-based label classifiers still rely heavily on domain expertise and manual curation, making the process expensive and time-consuming. This is particularly challenging when the label space is under-specified and large-scale data annotations are unavailable. In this paper, we address these challenges with Large Language Models (LLMs), whose prompt-based interface facilitates the induction and use of large-scale pseudo labels. We propose TnT-LLM, a two-phase framework that employs LLMs to automate the process of end-to-end label generation and assignment with minimal human effort for any given use-case. In the first phase, we introduce a zero-shot, multi-stage reasoning approach which enables LLMs to produce and refine a label taxonomy iteratively. In the second phase, LLMs are used as data labelers that yield training samples so that lightweight supervised classifiers can be reliably built, deployed, and served at scale. We apply TnT-LLM to the analysis of user intent and conversational domain for Bing Copilot (formerly Bing Chat), an open-domain chat-based search engine. Extensive experiments using both human and automatic evaluation metrics demonstrate that TnT-LLM generates more accurate and relevant label taxonomies when compared against state-of-the-art baselines, and achieves a favorable balance between accuracy and efficiency for classification at scale. We also share our practical experiences and insights on the challenges and opportunities of using LLMs for large-scale text mining in real-world applications.
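The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the functions `llm_update_taxonomy` and `llm_assign_label` are hypothetical stand-ins for prompted LLM calls (here replaced by a toy keyword lookup so the example runs offline), and the token-count classifier is an assumed placeholder for whatever lightweight supervised model is actually trained on the pseudo labels.

```python
from collections import Counter, defaultdict

# Toy keyword table standing in for LLM knowledge (assumption, not from the paper).
KEYWORDS = {"code": "programming", "python": "programming",
            "recipe": "cooking", "bake": "cooking"}


def llm_update_taxonomy(taxonomy, batch):
    """Hypothetical Phase 1 call: refine the taxonomy given one batch of texts."""
    updated = list(taxonomy)
    for text in batch:
        for token in text.lower().split():
            label = KEYWORDS.get(token)
            if label and label not in updated:
                updated.append(label)
    return updated


def llm_assign_label(text, taxonomy):
    """Hypothetical Phase 2 call: assign one taxonomy label to a text."""
    for token in text.lower().split():
        label = KEYWORDS.get(token)
        if label in taxonomy:
            return label
    return "other"


def generate_taxonomy(corpus, batch_size=2):
    """Phase 1: build the taxonomy iteratively, one batch at a time."""
    taxonomy = []
    for start in range(0, len(corpus), batch_size):
        taxonomy = llm_update_taxonomy(taxonomy, corpus[start:start + batch_size])
    return taxonomy


def train_classifier(texts, labels):
    """Phase 2: fit a lightweight model (per-label token counts) on pseudo labels."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Score each label by how often it has seen the text's tokens."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    return max(scores, key=scores.get)


corpus = ["fix my python code", "bake a cake recipe",
          "python loop error", "recipe for bread"]
taxonomy = generate_taxonomy(corpus)
pseudo_labels = [llm_assign_label(text, taxonomy) for text in corpus]
model = train_classifier(corpus, pseudo_labels)
print(taxonomy, predict(model, "python error in loop"))
```

The point of the split is that the expensive model is called only to build the taxonomy and label a training sample; the cheap classifier then serves predictions without further LLM calls.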

Paper Structure (29 sections, 1 equation, 13 figures, 6 tables)

Introduction
Related Work
Method
Phase 1: Taxonomy Generation
Phase 2: LLM-Augmented Text Classification
Evaluation Suite
Phase 1 Evaluation Strategies
Phase 2 Evaluation Strategies
Experiments
Data
Taxonomy Generation
Methods
Implementation Details
Results
LLM-Augmented Text Classification
...and 14 more sections

Figures (13)

Figure 1: An illustration of the existing human-in-the-loop and unsupervised text clustering approaches as well as the proposed LLM-powered end-to-end label taxonomy generation and text classification framework (TnT-LLM).
Figure 2: An illustration of the LLM-powered taxonomy generation phase (Phase 1).
Figure 3: An illustration of the LLM-augmented text classification phase (Phase 2).
Figure 4: Taxonomy evaluation results on BingChat-Phase1-S-Eng from human raters and the GPT-4 rater, where error bars indicate 95% confidence intervals.
Figure 5: Taxonomy evaluation results by language on multilingual conversations (BingChat-Phase1-L-Multi) from the GPT-4 rater.
...and 8 more figures
