Restoration-Guided Kuzushiji Character Recognition Framework under Seal Interference
Rui-Yang Ju, Kohei Yamashita, Hirotaka Kameko, Shinsuke Mori
TL;DR
This work tackles Kuzushiji character recognition when red seals overlap characters, a condition that degrades accuracy. It introduces RG-KCR, a three-stage framework: Stage 1 detects characters, Stage 2 applies a training-free restoration step to remove seal artifacts, and Stage 3 classifies individual characters with Metom, a ViT-based model. The recognized modern Japanese characters are then overlaid onto the restored documents for readable output. Key contributions include a training-free, color-based seal removal method; a 1,000-image detection dataset with synthetic seals; and an ablation study showing that restoration raises Metom's Top-1 accuracy from 93.45% to 95.33%, at a modest overhead of 0.51 s per image. Detection with YOLOv12-medium is also strong (AP50 of 97.0, precision of 98.0%, recall of 93.3%). The approach yields practical improvements for reading pre-modern documents under seal interference and provides an interactive visualization pipeline, with code released for reproducibility.
Abstract
Kuzushiji was one of the most popular writing styles in pre-modern Japan and was widely used in both personal letters and official documents. However, due to its highly cursive forms and extensive glyph variations, most modern Japanese readers cannot directly interpret Kuzushiji characters. Recent research has therefore focused on developing automated Kuzushiji character recognition methods, which have achieved satisfactory performance on relatively clean Kuzushiji document images. Yet existing methods struggle to maintain recognition accuracy under seal interference (e.g., when seals overlap characters), even though seals occur frequently in pre-modern Japanese documents. To address this challenge, we propose a three-stage restoration-guided Kuzushiji character recognition (RG-KCR) framework specifically designed to mitigate seal interference. We construct datasets for evaluating Kuzushiji character detection (Stage 1) and classification (Stage 3). Experimental results show that the YOLOv12-medium model achieves a precision of 98.0% and a recall of 93.3% on the constructed test set. We quantitatively evaluate the restoration performance of Stage 2 using PSNR and SSIM. In addition, we conduct an ablation study to demonstrate that Stage 2 improves the Top-1 accuracy of Metom, a Vision Transformer (ViT)-based Kuzushiji classifier employed in Stage 3, from 93.45% to 95.33%. The implementation code of this work is available at https://ruiyangju.github.io/RG-KCR.
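
The abstract names PSNR as one of the two metrics used to evaluate the Stage 2 restoration. As a rough illustration of what that metric measures, the sketch below computes the standard PSNR definition, 10 · log10(MAX² / MSE), between a clean reference image and a restored one. It is not taken from the released RG-KCR code: the function name `psnr`, the nested-list grayscale image representation, and the 8-bit peak value of 255 are all assumptions made for this example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB between two grayscale images.

    Both images are nested lists of pixel intensities with identical shape
    (an assumption of this sketch; real pipelines typically use arrays).
    """
    if len(reference) != len(restored) or any(
        len(ref_row) != len(res_row)
        for ref_row, res_row in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")

    mse = squared_error / pixel_count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 example: one pixel differs by 10 intensity levels.
clean_patch = [[0, 255], [255, 0]]
restored_patch = [[0, 255], [255, 10]]
score = psnr(clean_patch, restored_patch)
```

For the hypothetical patch above, the mean squared error is 25, so `score` comes out at roughly 34.15 dB. Higher values indicate that the restored image is closer to the seal-free reference.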
