On Creating an English-Thai Code-switched Machine Translation in Medical Domain

Parinthapat Pengpun, Krittamate Tiankanon, Amrest Chinkamol, Jiramet Kinchagawat, Pitchaya Chairuengjitjaras, Pasit Supholkhan, Pubordee Aussavavirojekul, Chiraphat Boonnag, Kanyakorn Veerakanjana, Hirunkul Phimsiri, Boonthicha Sae-jia, Nattawach Sataudom, Piyalitt Ittichaiwong, Peerat Limkonchotiwat

TL;DR

This work tackles English–Thai medical machine translation with a code-switching (CS) strategy that preserves critical medical terms in English. It builds a large English–Thai CS benchmark via keyword masking, expands the training data with back-translation and COMET-based filtering, and fine-tunes NLLB models on the result. The evaluation combines automatic MT metrics with human judgments from medical doctors (MDs). It shows that MDs strongly prefer CS translations for factual accuracy even when fluency declines, and it exposes misalignments between standard metrics and medical translation quality. The study provides a publicly available dataset and open-source models, underscoring the practical importance of terminology preservation in real-world medical translation.

Abstract

Machine translation (MT) in the medical domain plays a pivotal role in enhancing healthcare quality and disseminating medical knowledge. Despite advancements in English-Thai MT technology, common MT approaches often underperform in the medical field because they cannot translate medical terminology precisely. Our research prioritizes not merely improving translation accuracy but also keeping medical terminology in English within the translated text through code-switched (CS) translation. We developed a method to produce CS medical translation data, fine-tuned a CS translation model on this data, and evaluated its performance against strong baselines such as Google Neural Machine Translation (NMT) and GPT-3.5/GPT-4. Our model demonstrated competitive performance on automatic metrics and was highly favored in human preference evaluations. Our evaluation results also show that medical professionals significantly prefer CS translations that accurately retain critical English terms, even if this slightly compromises fluency. Our code and test set are publicly available at https://github.com/preceptorai-org/NLLB_CS_EM_NLP2024.
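
The keyword-masking idea named in the TL;DR can be illustrated with a minimal sketch. This is not the authors' pipeline: the placeholder format `<KW0>`, the helper names `mask_keywords`, `unmask_keywords`, and `cs_translate`, and the `translate` callable are all hypothetical, and the sketch assumes the downstream MT system copies placeholder tokens through unchanged, which real systems do not always do.

```python
import re

def mask_keywords(text, keywords):
    """Replace each medical keyword with a placeholder token.

    Returns the masked text and a token -> surface-form mapping.
    Assumption: each keyword appears with a single casing in the text.
    """
    mapping = {}
    masked = text
    # Longest keywords first, so multi-word terms are masked before shorter ones.
    for i, kw in enumerate(sorted(keywords, key=len, reverse=True)):
        token = f"<KW{i}>"
        pattern = re.compile(r"\b" + re.escape(kw) + r"\b", flags=re.IGNORECASE)
        match = pattern.search(masked)
        if match:
            mapping[token] = match.group(0)  # keep the casing found in the source
            masked = pattern.sub(token, masked)
    return masked, mapping

def unmask_keywords(text, mapping):
    """Put the original English keywords back in place of the placeholders."""
    for token, surface in mapping.items():
        text = text.replace(token, surface)
    return text

def cs_translate(text, keywords, translate):
    """Mask keywords, translate the rest, then restore the English terms.

    `translate` is a hypothetical callable (str -> str) standing in for any MT system.
    """
    masked, mapping = mask_keywords(text, keywords)
    return unmask_keywords(translate(masked), mapping)

source = "Hyperinflation and air trapping were seen on the chest X-ray."
terms = ["hyperinflation", "air trapping"]
masked_text, token_map = mask_keywords(source, terms)
```

Here `masked_text` holds the sentence with both terms replaced by placeholders. Passing a real English-to-Thai translator as `translate` would yield Thai text with the English terms restored, provided the translator leaves the placeholders intact.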

Paper Structure

This paper contains 26 sections, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Example of how Google NMT alters the meaning of a sentence when translating from the source language (English) to the target language (Thai), compared with the translations that Medical Doctors (MDs) prefer. "Hyperinflation" (abnormal increase of lung volume) is translated as "hyperinflation" in the economic sense; "air trapping" (retention of air in the lungs distal to an obstruction) is translated as "air quarantine".
  • Figure 2: Example Questionnaire User Interface
  • Figure 3: Real samples where our internal MDs and external MDs both report a preference for NLLB-1 CS translation over Google NMT. Red sections indicate medical keywords that Google NMT does not translate precisely. Orange sections indicate medical keywords that Google NMT translates precisely, but retaining them in English is still preferred. Blue sections indicate medical keywords that are retained in English and convey their meaning precisely.
  • Figure 4: Instruction text for human annotators
  • Figure 5: Plots of the factual score of each model that passes a factual accuracy score of 3, against machine evaluation metrics. Masked models are labeled in blue and models without masking are labeled in orange.
  • ...and 3 more figures