Table of Contents
Fetching ...

Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation

Veronica Valeros, Anna Širokova, Carlos Catania, Sebastian Garcia

TL;DR

This paper tackles the challenge of translating Russian-language cybercrime chatter into English for timely cybersecurity insights, arguing that human translation is costly and machine translation often misses jargon and context. It proposes a fine-tuning workflow for a cloud LLM (GPT-3.5-turbo-0125) on a small, ground-truth dataset derived from the NoName057(16) hacktivist channel, including a structured fine-tuning prompt and vocabulary augmentation. The study combines human evaluation and automatic metrics (BLEU, METEOR, TER) to compare the fine-tuned model against baselines, finding that the fine-tuned model is generally preferred by human translators and yields improvements in several metrics, while also achieving substantial cost reductions relative to human translation. These results suggest that targeted fine-tuning can enable faster, cheaper, and more accurate cybercrime translations, facilitating real-time intelligence workflows, though challenges remain with platform restrictions and biases; future work emphasizes open-model fine-tuning and sharing to advance community collaboration.

Abstract

Understanding cybercrime communications is paramount for cybersecurity defence. This often involves translating communications into English for processing, interpreting, and generating timely intelligence. The problem is that translation is hard. Human translation is slow, expensive, and scarce. Machine translation is inaccurate and biased. We propose using fine-tuned Large Language Models (LLM) to generate translations that can accurately capture the nuances of cybercrime language. We apply our technique to public chats from the NoName057(16) Russian-speaking hacktivist group. Our results show that our fine-tuned LLM model is better, faster, more accurate, and able to capture nuances of the language. Our method shows it is possible to achieve high-fidelity translations and significantly reduce costs by a factor ranging from 430 to 23,000 compared to a human translator.

Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation

TL;DR

This paper tackles the challenge of translating Russian-language cybercrime chatter into English for timely cybersecurity insights, arguing that human translation is costly and machine translation often misses jargon and context. It proposes a fine-tuning workflow for a cloud LLM (GPT-3.5-turbo-0125) on a small, ground-truth dataset derived from the NoName057(16) hacktivist channel, including a structured fine-tuning prompt and vocabulary augmentation. The study combines human evaluation and automatic metrics (BLEU, METEOR, TER) to compare the fine-tuned model against baselines, finding that the fine-tuned model is generally preferred by human translators and yields improvements in several metrics, while also achieving substantial cost reductions relative to human translation. These results suggest that targeted fine-tuning can enable faster, cheaper, and more accurate cybercrime translations, facilitating real-time intelligence workflows, though challenges remain with platform restrictions and biases; future work emphasizes open-model fine-tuning and sharing to advance community collaboration.

Abstract

Understanding cybercrime communications is paramount for cybersecurity defence. This often involves translating communications into English for processing, interpreting, and generating timely intelligence. The problem is that translation is hard. Human translation is slow, expensive, and scarce. Machine translation is inaccurate and biased. We propose using fine-tuned Large Language Models (LLM) to generate translations that can accurately capture the nuances of cybercrime language. We apply our technique to public chats from the NoName057(16) Russian-speaking hacktivist group. Our results show that our fine-tuned LLM model is better, faster, more accurate, and able to capture nuances of the language. Our method shows it is possible to achieve high-fidelity translations and significantly reduce costs by a factor ranging from 430 to 23,000 compared to a human translator.
Paper Structure (15 sections, 4 figures, 2 tables)

This paper contains 15 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Public Telegram channel of the NoName057(16) hacktivist group, in Russian, where they showcase and publicise their activity.
  • Figure 2: The hacktivist group messages were processed first to compare existing translation methods and select the best one, using the output and the expert input to produce ground truth and fine-tune the selected LLM model, and finally, generating the data needed for the human and automatic evaluation.
  • Figure 3: The dataset used for fine-tuning has a JSONL format, where each line contains a message with three keys. Each key represents a role: system, user, and assistant.
  • Figure 4: A hacktivist message ground truth (black, top), alongside the translations by the base LLM model (green, middle) and fine-tuned model (blue, bottom). The three metrics chose the base-model translation when, as can be seen, the best translation is generated by the fine-tuned model.