CODEOFCONDUCT at Multilingual Counterspeech Generation: A Context-Aware Model for Robust Counterspeech Generation in Low-Resource Languages

Michael Bennie, Bushi Xiao, Chryseis Xinyi Liu, Demi Zhang, Jian Meng, Alayo Tripp

TL;DR

This paper tackles automated counterspeech generation in multilingual, low-resource settings, addressing hate speech while preserving free expression. It introduces CODEOFCONDUCT, a three-stage framework that combines simulated annealing-based generation with JudgeLM-guided round-robin ranking to produce diverse, high-quality counterspeech across Basque, English, Italian, and Spanish. The approach achieves leading results in the MCG-COLING-2025 task, with Basque showing exceptional performance thanks to tailored vocabulary sampling and optimization, and offers insights into multilingual evaluation and resource-efficient generation. The work advances practical, context-aware counterspeech systems capable of operating with limited annotated data and in culturally diverse settings, contributing to safer online discourse.

Abstract

This paper introduces a context-aware model for robust counterspeech generation, which achieved significant success in the MCG-COLING-2025 shared task. Our approach particularly excelled in low-resource language settings. By leveraging a simulated annealing algorithm fine-tuned on multilingual datasets, the model generates factually accurate responses to hate speech. We demonstrate state-of-the-art performance across four languages (Basque, English, Italian, and Spanish), with our system ranking first for Basque, second for Italian, and third for both English and Spanish. Notably, our model swept all three top positions for Basque, highlighting its effectiveness in low-resource scenarios. The shared task is evaluated with both traditional metrics (BLEU, ROUGE, BERTScore, Novelty) and JudgeLM, an LLM-based judge. We present a detailed analysis of our results, including an empirical evaluation of model performance and comprehensive score distributions across evaluation metrics. This work contributes to the growing body of research on multilingual counterspeech generation, offering insights into developing robust models that can adapt to diverse linguistic and cultural contexts in the fight against online hate speech.

Paper Structure (17 sections, 2 equations, 7 figures, 2 tables, 3 algorithms)

Figures (7)

  • Figure 1: A histogram of scores $E(c)$ for every counterspeech generated by the simulated annealing algorithm.
  • Figure 2: Box-and-whisker plots comparing the original score of each counterspeech (CS) answer from stage 1 with its re-scored values from stage 2.
  • Figure 3: Chart depicting the JudgeLM scores for each Basque run. Bars drawn in yellow represent the results from the CODEOFCONDUCT submission.
  • Figure 4: Chart depicting the ROUGE-L scores for each Basque run. Bars drawn in yellow represent the results from the CODEOFCONDUCT submission.
  • Figure 5: Chart depicting the BLEU scores for each Basque run. Bars drawn in yellow represent the results from the CODEOFCONDUCT submission.
  • ...and 2 more figures