Restoration-Guided Kuzushiji Character Recognition Framework under Seal Interference

Rui-Yang Ju, Kohei Yamashita, Hirotaka Kameko, Shinsuke Mori

TL;DR

This work addresses Kuzushiji character recognition when red seals overlap characters, a condition that degrades recognition accuracy. It introduces RG-KCR, a three-stage framework: Stage 1 detects characters, Stage 2 applies a training-free restoration step to remove seal artifacts, and Stage 3 classifies individual characters with Metom, a ViT-based classifier. The recognized modern Japanese characters are then overlaid on the restored document for readable output. Key contributions include a training-free, color-based seal removal method, a 1,000-image detection dataset with synthetic seals, and an ablation study showing that restoration raises Metom's Top-1 accuracy from 93.45% to 95.33% at an overhead of 0.51 s per image. Detection with YOLOv12-medium reaches an AP50 of 97.0, a precision of 98.0%, and a recall of 93.3%. The framework also provides an interactive visualization pipeline, and the code is released for reproducibility.
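To make the idea of training-free, color-based seal removal more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a seal pixel can be identified purely by red dominance in RGB space; the function names, the thresholds `min_red` and `margin`, and the white background fill are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0-255
Image = List[List[Pixel]]     # rows of pixels


def is_seal_red(pixel: Pixel, min_red: int = 150, margin: int = 60) -> bool:
    """Return True when a pixel is strongly red-dominant (assumed to be seal ink).

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    r, g, b = pixel
    return r >= min_red and (r - max(g, b)) >= margin


def suppress_red_seal(image: Image, background: Pixel = (255, 255, 255)) -> Image:
    """Replace red-dominant pixels with a background color; keep all others."""
    return [
        [background if is_seal_red(px) else px for px in row]
        for row in image
    ]


if __name__ == "__main__":
    page = [
        [(250, 245, 235), (200, 40, 45)],  # paper, seal ink
        [(30, 30, 30), (90, 35, 35)],      # dark ink, dark ink under a seal
    ]
    print(suppress_red_seal(page))
```

Dark strokes have a low red value and are therefore left untouched, which is what lets a rule like this keep the characters while clearing the seal. A real document would likely need a more careful color model for faded seals and aged paper.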

Abstract

Kuzushiji was one of the most popular writing styles in pre-modern Japan and was widely used in both personal letters and official documents. However, due to its highly cursive forms and extensive glyph variations, most modern Japanese readers cannot directly interpret Kuzushiji characters. Therefore, recent research has focused on developing automated Kuzushiji character recognition methods, which have achieved satisfactory performance on relatively clean Kuzushiji document images. However, existing methods struggle to maintain recognition accuracy under seal interference (e.g., when seals overlap characters), despite the frequent occurrence of seals in pre-modern Japanese documents. To address this challenge, we propose a three-stage restoration-guided Kuzushiji character recognition (RG-KCR) framework specifically designed to mitigate seal interference. We construct datasets for evaluating Kuzushiji character detection (Stage 1) and classification (Stage 3). Experimental results show that the YOLOv12-medium model achieves a precision of 98.0% and a recall of 93.3% on the constructed test set. We quantitatively evaluate the restoration performance of Stage 2 using PSNR and SSIM. In addition, we conduct an ablation study to demonstrate that Stage 2 improves the Top-1 accuracy of Metom, a Vision Transformer (ViT)-based Kuzushiji classifier employed in Stage 3, from 93.45% to 95.33%. The implementation code of this work is available at https://ruiyangju.github.io/RG-KCR.
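As a reference for how a restoration metric such as PSNR is typically computed, here is a minimal sketch using the standard definition, PSNR = 10 · log10(MAX² / MSE). It assumes 8-bit grayscale images stored as nested lists. The function name and the `max_value` parameter are illustrative and are not taken from the released code.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[Sequence[float]],
    restored: Sequence[Sequence[float]],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images."""
    ref_flat = [v for row in reference for v in row]
    res_flat = [v for row in restored for v in row]
    if not ref_flat or len(ref_flat) != len(res_flat):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, res_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [[0, 0], [0, 0]]
    noisy = [[255, 255], [255, 255]]
    print(psnr(clean, clean))
    print(psnr(clean, noisy))
```

Higher values indicate that the restored image is closer to the reference. SSIM is usually computed with an existing library implementation, so it is not sketched here.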

Paper Structure (25 sections, 2 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Example recognition results under seal interference, where seals overlap with Kuzushiji characters. The left column shows the input document region containing the Kuzushiji characters "尚書堂梓", and the right column shows the corresponding recognition outputs produced by three systems: Fuminoha [toppan2023fuminoha], NDLkotenOCR-Lite [toru2024development], and Metom [imajuku2024metom]. The top row uses the raw document as input, while the bottom row reports the recognition results after applying our document restoration method.
  • Figure 2: The pipeline of the proposed RG-KCR framework, consisting of three stages: Kuzushiji character detection (Stage 1), Kuzushiji document restoration (Stage 2), and Kuzushiji character classification (Stage 3).
  • Figure 3: Qualitative comparison of line-level and character-level Kuzushiji detection. Red bounding boxes show line-level detections produced by NDLkotenOCR-Lite [toru2024development], while green bounding boxes show character-level detections produced by the YOLOv12-medium [tian2025yolov12] detector adopted in this work.
  • Figure 4: Examples of low-confidence bounding boxes produced by the detection model. Stains in pre-modern documents can cause false positives, where background artifacts are mistakenly detected as Kuzushiji characters with confidence scores $<0.01$.
  • Figure 5: Examples of incomplete annotations in the raw dataset. Red boxes indicate our corrected annotations, and green boxes indicate the original annotations.
  • ...and 4 more figures
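The low-confidence false positives described in the Figure 4 caption suggest a simple post-processing step: discard detections below a confidence threshold. The sketch below illustrates that idea only. The `Detection` type, the `filter_detections` function, and the use of 0.01 as a cutoff are assumptions made here; the source states only that such false positives have confidence scores below 0.01.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """A hypothetical character-level bounding box with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def filter_detections(
    detections: List[Detection], min_confidence: float = 0.01
) -> List[Detection]:
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d.confidence >= min_confidence]


if __name__ == "__main__":
    boxes = [
        Detection(10, 10, 40, 40, 0.92),   # likely a real character
        Detection(55, 12, 60, 18, 0.004),  # likely a stain
    ]
    print(filter_detections(boxes))
```

Raising `min_confidence` trades recall for precision, so the threshold would normally be tuned on a validation set.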