SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification

Elisei Rykov, Konstantin Zaytsev, Ivan Anisimov, Alexandr Voronin

TL;DR

The paper tackles multilingual text detoxification across nine languages by augmenting training data through English-to-target translations and rigorous filtering to preserve meaning and control toxicity. It fine-tunes multilingual seq2seq transformers (notably mT0-XL and Aya-101) and applies diverse beam search alongside ORPO alignment to refine outputs, achieving top automatic results and strong human performance, especially for Ukrainian, with a 3.7B-parameter model. The approach demonstrates effective data augmentation and alignment strategies to boost detoxification in low-resource languages and highlights directions for cross-language transfer and interpretability. Overall, the work offers a practical, scalable pipeline for multilingual detoxification with competitive results in both automated and human evaluations.

Abstract

This paper presents the SmurfCat team's solution for the Multilingual Text Detoxification task in the PAN-2024 competition. Using data augmentation through machine translation and a special filtering procedure, we collected an additional multilingual parallel dataset for text detoxification. Using the obtained data, we fine-tuned several multilingual sequence-to-sequence models, such as mT0 and Aya, on the text detoxification task. We then applied the ORPO alignment technique to the final model. Our final model has only 3.7 billion parameters and achieves state-of-the-art results for the Ukrainian language and near state-of-the-art results for other languages. In the competition, our team achieved first place in the automated evaluation with a score of 0.52 and second place in the final human evaluation with a score of 0.74.
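
The filtering of machine-translated pairs mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TranslatedPair` type, the `filter_translated_pairs` function, the three criteria, and all threshold values are hypothetical and chosen only for illustration. The `toxicity` and `similarity` scorers are passed in as plain callables, standing in for whatever classifier and sentence-similarity model a real pipeline would use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class TranslatedPair:
    """One English toxic sentence plus its translated toxic/neutral pair (hypothetical type)."""
    source_toxic: str        # original English toxic sentence
    translated_toxic: str    # machine translation of the toxic sentence
    translated_neutral: str  # machine translation of the detoxified sentence


def filter_translated_pairs(
    pairs: Iterable[TranslatedPair],
    toxicity: Callable[[str], float],         # assumed to return a score in [0, 1]
    similarity: Callable[[str, str], float],  # assumed to return a score in [0, 1]
    min_toxic: float = 0.5,       # assumed threshold, not taken from the paper
    max_neutral: float = 0.5,     # assumed threshold, not taken from the paper
    min_similarity: float = 0.7,  # assumed threshold, not taken from the paper
) -> List[TranslatedPair]:
    """Keep only pairs whose translation preserved both toxicity and meaning."""
    kept = []
    for pair in pairs:
        # The translated toxic side should still be toxic.
        if toxicity(pair.translated_toxic) < min_toxic:
            continue
        # The translated neutral side should not be toxic.
        if toxicity(pair.translated_neutral) > max_neutral:
            continue
        # The translation should stay close in meaning to its English source.
        if similarity(pair.source_toxic, pair.translated_toxic) < min_similarity:
            continue
        kept.append(pair)
    return kept
```

Passing the scorers as arguments keeps the sketch independent of any particular toxicity classifier or embedding model, so the same loop can be reused per target language with different thresholds.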

Paper Structure (8 sections, 3 figures, 5 tables)


Figures (3)

  • Figure 1: An overview of our approach. We used different datasets, fine-tuned the whole mT0-XL model and finally performed the ORPO alignment step.
  • Figure 2: Toxicity of translations
  • Figure 3: Similarity of translations