QExplorer: Large Language Model Based Query Extraction for Toxic Content Exploration

Shaola Ren, Li Ke, Longtao Huang, Dehong Gao, Hui Xue

TL;DR

QExplorer tackles the challenge of discovering toxic content by extracting effective queries with a large language model. It leverages a two-stage alignment pipeline, consisting of instruction SFT and Direct Preference Optimization, augmented by feedback from a production search system. The authors construct long-context and context-clustered datasets from platform logs and demonstrate offline superiority to both humans and baselines, plus online gains in toxic-item detection. The work shows practical potential for scalable, data-driven safety tooling in real-world search systems.

Abstract

Automatically extracting effective queries is challenging in information retrieval, especially in toxic content exploration, as such content is likely to be disguised. With the recent achievements of generative Large Language Models (LLMs), we are able to leverage their capabilities to directly extract effective queries for exploring similar content. This study proposes QExplorer, a large language model based approach to Query Extraction for toxic content Exploration. QExplorer involves a 2-stage training process, instruction Supervised FineTuning (SFT) followed by preference alignment using Direct Preference Optimization (DPO), together with dataset construction that uses feedback from the search system. To verify the effectiveness of QExplorer, a series of offline and online experiments is conducted on our real-world system. The offline empirical results demonstrate that our automatic query extraction outperforms several LLMs and humans. The online deployment shows a significant increase in the detection of toxic items.
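
To make the preference-alignment stage more concrete, the following is a minimal sketch of the standard per-pair DPO objective, -log sigmoid(beta * [(log pi(y_w|x) - log pi_ref(y_w|x)) - (log pi(y_l|x) - log pi_ref(y_l|x))]). It is not the authors' implementation: the function name, the default beta value, and the log-probability numbers are assumptions chosen for illustration, and in QExplorer the chosen and rejected queries would come from the search-system feedback rather than being supplied by hand.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed token log-probability of a candidate query
    under the policy being trained or under the frozen reference model
    (here assumed to be the SFT model). beta=0.1 is an illustrative default.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # query over the rejected one, relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for two candidate queries of one item.
loss_untrained = dpo_pair_loss(-12.0, -12.0, -12.0, -12.0)
loss_aligned = dpo_pair_loss(-10.0, -15.0, -12.0, -12.0)
print(loss_untrained, loss_aligned)
```

When the policy and the reference model agree, the margin is zero and the loss equals log 2; the loss falls below that value as the policy shifts probability toward the preferred query and rises above it when the policy favors the rejected one.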

Paper Structure

This paper contains 24 sections, 6 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Framework of the toxic content detection system. The system is composed of strategies, classification models, a risk understanding module, and a search system. Some suspected toxic items missed by the strategies and classification models are reported by users of the trading platform. The auditors then analyze the content and perform multi-modal query searches. The dashed box indicates the auditor risk understanding module, which is the focus of this study.
  • Figure 2: Diagram of QExplorer. It consists of three steps: (1) instruction SFT, (2) preference data construction, and (3) preference alignment, which further trains the fine-tuned LLM from step (1) on the preference data from step (2).
  • Figure 3: Diagram of the online implementation. The LLM in the shaded part runs offline. Currently, this LLM can process only textual information, so the system still requires an auditor to perform risk understanding.
  • Figure 4: Online performance over time.