Exploring the Capabilities of ChatGPT in Ancient Chinese Translation and Person Name Recognition

Shijing Si, Siqing Zhou, Le Tang, Xiaoqing Cheng, Yugui Zhang

TL;DR

This work systematically evaluates ChatGPT on ancient Chinese tasks using Shi Shuo Xin Yu, focusing on ancient-to-modern translation and person-name recognition. It demonstrates that ChatGPT's translation quality peaks with three consecutive sentences but remains limited overall, while named-entity recognition benefits from prompting and demonstrations, outperforming a Jieba baseline yet still leaving room for improvement. The study provides reproducible data handling, prompts, and code to facilitate further research, and discusses causes for limited performance along with avenues for enhancement through richer ancient-Chinese corpora and domain knowledge. Overall, the paper contributes empirical benchmarks, methodological insights, and practical guidance for applying LLMs to ancient Chinese processing and cultural heritage tasks.

Abstract

ChatGPT's proficiency in handling modern standard languages suggests potential for its use in understanding ancient Chinese. This paper explores ChatGPT's capabilities on ancient Chinese via two tasks: translating ancient Chinese to modern Chinese and recognizing ancient Chinese names. A comparison of ChatGPT's output with human translations serves to evaluate its comprehension of ancient Chinese. The findings indicate that (1) ChatGPT's proficiency in ancient Chinese has yet to reach a satisfactory level, and (2) ChatGPT performs best on ancient-to-modern translation when fed three context sentences. To help reproduce our work, we display the Python code snippets used in this study.
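
As a rough illustration of the "three context sentences" setting, the sketch below groups consecutive sentences into chunks before they are sent for translation. It is not the authors' code: the function names `group_consecutive` and `build_translation_prompt`, the non-overlapping windowing, and the prompt wording are all assumptions made for this example, and the call to ChatGPT itself is left out.

```python
from typing import List


def group_consecutive(sentences: List[str], k: int) -> List[str]:
    """Join every k consecutive sentences into one translation unit.

    Assumption: windows do not overlap, and the last chunk may be shorter
    than k. The paper may have grouped sentences differently.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return ["".join(sentences[i:i + k]) for i in range(0, len(sentences), k)]


def build_translation_prompt(chunk: str) -> str:
    """Wrap a chunk of ancient Chinese text in a translation instruction.

    The instruction wording is hypothetical, not taken from the paper.
    """
    return "Translate the following ancient Chinese text into modern Chinese:\n" + chunk


if __name__ == "__main__":
    # Placeholder sentences standing in for lines of Shi Shuo Xin Yu.
    sentences = ["句一。", "句二。", "句三。", "句四。"]
    for chunk in group_consecutive(sentences, 3):
        print(build_translation_prompt(chunk))
```

Varying `k` over 1, 3, 5, and 8 would reproduce the shape of the context-size comparison described in the paper, under the assumptions stated above.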

Paper Structure (15 sections, 4 figures, 5 tables)

This paper contains 15 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: The code snippet for ancient-to-modern Chinese translation. The input of the chat function is the ancient Chinese sentence, and the output is its corresponding translation.
  • Figure 2: The code snippet for person name recognition from ancient Chinese text. The input of the function is the ancient Chinese sentence, and the output is the recognized person names with analysis.
  • Figure 3: The BERTScore values of ancient-to-modern Chinese translation when ChatGPT is fed with 1, 3, 5, and 8 consecutive sentences.
  • Figure 4: The performance of ChatGPT on person name recognition versus the number of examples in the prompt.
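
The person-name experiments summarized in Figures 2 and 4 can be pictured with a minimal sketch: a prompt that carries a variable number of demonstrations, and a set-based score for the names returned. Everything here is an assumption for illustration, including the helper names `build_ner_prompt` and `prf1`, the prompt wording, and the choice of precision, recall, and F1 over sets of names; the source does not state which metric or prompt template the authors used.

```python
from typing import List, Set, Tuple

# A demonstration pairs a sentence with the person names it contains.
Demo = Tuple[str, List[str]]


def build_ner_prompt(sentence: str, demos: List[Demo]) -> str:
    """Build a few-shot prompt; len(demos) is the number of examples."""
    lines = ["Extract all person names from the ancient Chinese sentence."]
    for text, names in demos:
        lines.append(f"Sentence: {text}")
        lines.append("Names: " + ", ".join(names))
    # The query sentence comes last, with the answer left open.
    lines.append(f"Sentence: {sentence}")
    lines.append("Names:")
    return "\n".join(lines)


def prf1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over sets of names (exact string match)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder demonstrations; real ones would come from annotated text.
    demos = [("示例句一", ["甲"]), ("示例句二", ["乙", "丙"])]
    print(build_ner_prompt("待识别的句子", demos))
    print(prf1({"甲", "乙"}, {"甲", "丙"}))
```

Passing demonstration lists of increasing length to `build_ner_prompt` and scoring each run with `prf1` would give a curve of the kind Figure 4 describes, though the exact setup in the paper may differ.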