From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation

Amit Barman, Atanu Mandal, Sudip Kumar Naskar

TL;DR

The paper compares two Transformer-based strategies for English–Hindi legal translation within the JUST-NLP 2025 shared task: fine-tuning a pre-trained OPUS-MT model for the legal domain and training a Transformer from scratch on a legal corpus. It demonstrates that domain adaptation yields substantial gains, with the fine-tuned OPUS-MT achieving the highest scores across multiple metrics (e.g., SacreBLEU 46.03, BERTScore 91.19, COMET 73.72) and outpacing both the baseline and from-scratch approaches. The work highlights the importance of domain-specific adaptation for high-stakes legal translation and discusses limitations such as dataset scope and evaluation breadth. It also outlines future directions, including extending to additional Indian languages and employing parameter-efficient tuning to scale Legal MT in multilingual justice contexts.

Abstract

In multilingual nations like India, access to legal information is often hindered by language barriers, as much of the legal and judicial documentation remains in English. Legal Machine Translation (L-MT) offers a scalable solution to this challenge by enabling accurate and accessible translations of legal documents. This paper presents our work for the JUST-NLP 2025 Legal MT shared task, focusing on English-Hindi translation using Transformer-based approaches. We experiment with two complementary strategies: fine-tuning a pre-trained OPUS-MT model for domain-specific adaptation, and training a Transformer model from scratch on the provided legal corpus. Performance is evaluated using standard MT metrics, including SacreBLEU, chrF++, TER, ROUGE, BERTScore, METEOR, and COMET. Our fine-tuned OPUS-MT model achieves a SacreBLEU score of 46.03, significantly outperforming both the baseline and from-scratch models. The results highlight the effectiveness of domain adaptation in enhancing translation quality and demonstrate the potential of L-MT systems to improve access to justice and legal transparency in multilingual contexts.

Paper Structure

This paper contains 9 sections, 3 tables.