Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation using Chunk-wise Monotonic Translation

Kosuke Doi, Yuka Ko, Mana Makinae, Katsuhito Sudoh, Satoshi Nakamura

TL;DR

The paper addresses the challenge of preserving word order in English–Japanese simultaneous interpretation (SI) by analyzing chunk-wise monotonic translation (CMT) using the NAIST English-to-Japanese Chunk-wise Monotonic Translation Evaluation Dataset. It provides qualitative and quantitative analyses of CMT sentences, including an annotation scheme and comparisons with offline and SI data, to identify the grammatical structures that make monotonic translation difficult. The authors evaluate speech translation (ST) and simultaneous speech translation (simulST) models against CMT, SI, and offline references, finding that SI-based test sets may underestimate model performance and that CMT references favor simulST models trained with SI data, though COMET results depend on the reference. These findings offer guidance for segmentation and decoding policy design in SI systems and highlight the importance of choosing evaluation references aligned with human interpretation strategies.

Abstract

This paper analyzes the features of monotonic translations, which follow the word order of the source language, in simultaneous interpreting (SI). Word order differences are one of the biggest challenges in SI, especially for language pairs with significant structural differences like English and Japanese. We analyzed the characteristics of chunk-wise monotonic translation (CMT) sentences using the NAIST English-to-Japanese Chunk-wise Monotonic Translation Evaluation Dataset and identified some grammatical structures that make monotonic translation difficult in English-Japanese SI. We further investigated the features of CMT sentences by evaluating the output from the existing speech translation (ST) and simultaneous speech translation (simulST) models on the NAIST English-to-Japanese Chunk-wise Monotonic Translation Evaluation Dataset as well as on existing test sets. The results indicate the possibility that the existing SI-based test set underestimates the model performance. The results also suggest that using CMT sentences as references gives higher scores to simulST models than ST models, and that using an offline-based test set to evaluate the simulST models underestimates the model performance.

Paper Structure (19 sections, 1 figure, 8 tables)

This paper contains 19 sections, 1 figure, and 8 tables.

Figures (1)

  • Figure 1: Annotation examples. Repeat tags were assigned even if the strings did not match exactly, as long as they referred to the same entity or had the same meaning.