Combating Toxic Language: A Review of LLM-Based Strategies for Software Engineering

Hao Zhuo, Yicheng Yang, Kewen Peng

TL;DR

This review addresses toxicity in software engineering by synthesizing 2020–2024 work on toxicity detection and mitigation with a focus on LLM-enabled approaches. It contrasts SE-specific datasets and general toxicity resources, examines annotation and pre-processing pipelines, and analyzes both traditional and transformer-based detection methods, including span-based localization. An ablation study demonstrates that LLM-based rewriting can effectively mitigate toxicity while preserving meaning, though domain-specific supervised models still outperform zero-shot/few-shot LLM detectors. The paper also outlines challenges such as dataset bias, language limitations, and the need for interpretable, policy-conscious deployment, offering a phased roadmap for future research. Overall, the findings support domain-specific tuning, context-aware detection, and explainable, ethically governed toxicity mitigation in SE environments.

Abstract

Large Language Models (LLMs) have become integral to Software Engineering (SE) and are increasingly used in development workflows. However, their widespread adoption raises concerns about the presence and propagation of toxic language: harmful or offensive content that can foster exclusionary environments. This paper provides a comprehensive review of recent research (2020–2024) on toxicity detection and mitigation, covering both SE-specific and general-purpose datasets. We examine annotation and pre-processing techniques, assess detection methodologies, and evaluate mitigation strategies, particularly those leveraging LLMs. Additionally, we conduct an ablation study demonstrating the effectiveness of LLM-based rewriting for reducing toxicity. This review is limited to studies published within that timeframe and within the domain of toxicity in LLMs and SE; emerging methods or datasets beyond this period may therefore fall outside its scope. By synthesizing existing work and identifying open challenges, this review highlights key areas for future research to ensure the responsible deployment of LLMs in SE and beyond.

Paper Structure

This paper contains 33 sections, 6 figures, 14 tables.

Figures (6)

  • Figure 1: Overall workflow of LLM toxicity detection and mitigation. Number annotations in each step represent relevant core literature as indexed in the core-literature table.
  • Figure 2: Study identification and selection process applied in this literature review
  • Figure 3: Code review dataset (comparing methods before and after retraining on the SE dataset)
  • Figure 4: Gitter Ethereum dataset (comparing methods before and after retraining on the SE dataset)
  • Figure 5: Experimental workflow for toxicity detection and mitigation.
  • ...and 1 more figure