Mitigating Long-tail Distribution in Oracle Bone Inscriptions: Dataset, Model, and Benchmark

Jinhao Li, Zijian Chen, Runze Jiang, Tingzhu Chen, Changbo Wang, Guangtao Zhai

TL;DR

This work tackles long-tail bias in oracle bone inscription recognition by constructing Oracle-P15K, a structure-aligned dataset with 14,542 images and expert-crafted glyphs. It then introduces OBIDiff, a diffusion-based generator with glyph and style encoders and a CLIP-based style representation to transfer rubbing textures while preserving glyph structure. The dataset and model are shown to improve OBI generation quality, enhance downstream recognition and denoising performance, and provide a realistic benchmark for four noise types. The results, along with user preference studies, support the viability of structure-aligned data and controllable synthesis for cultural heritage restoration and misinformation mitigation.

Abstract

Oracle bone inscription (OBI) recognition plays a significant role in understanding the history and culture of ancient China. However, existing OBI datasets suffer from a long-tail distribution problem, leading to biased performance of OBI recognition models across majority and minority classes. With recent advances in generative models, synthesis-based data augmentation has become a promising way to expand the sample size of minority classes. Unfortunately, current OBI datasets lack large-scale structure-aligned image pairs for training generative models. To address these problems, we first present Oracle-P15K, a structure-aligned OBI dataset for OBI generation and denoising, consisting of 14,542 images infused with domain knowledge from OBI experts. Second, we propose a diffusion-based pseudo-OBI generator, called OBIDiff, to achieve realistic and controllable OBI generation. Given a clean glyph image and a target rubbing-style image, it transfers the noise style of the rubbing to the glyph image. Extensive experiments on downstream OBI tasks and user preference studies show the effectiveness of the proposed Oracle-P15K dataset and demonstrate that OBIDiff preserves inherent glyph structures while transferring authentic rubbing styles.
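To make the long-tail problem concrete, the sketch below measures class imbalance in a labelled set and computes how many synthetic samples each minority class would need to reach a chosen floor. It is an illustration only, not the authors' pipeline: the function names `imbalance_ratio` and `plan_augmentation`, the toy labels, and the `target_per_class` threshold are all assumptions introduced here.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio between the largest and smallest class sizes."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def plan_augmentation(labels, target_per_class):
    """Map each under-represented class to the number of synthetic
    samples it needs to reach target_per_class (hypothetical threshold)."""
    counts = Counter(labels)
    return {
        cls: target_per_class - n
        for cls, n in counts.items()
        if n < target_per_class
    }


# Toy long-tailed label list: one head class, two tail classes.
labels = ["A"] * 50 + ["B"] * 8 + ["C"] * 2

ratio = imbalance_ratio(labels)
plan = plan_augmentation(labels, target_per_class=20)

print(ratio)  # 25.0
print(plan)   # {'B': 12, 'C': 18}
```

In a setting like the one the abstract describes, the per-class deficits in `plan` would be filled by a generator that combines a clean glyph image with a rubbing-style image; how the paper actually selects classes and sample counts is not stated in this summary.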

Paper Structure

This paper contains 24 sections, 3 equations, 14 figures, and 5 tables.

Figures (14)

  • Figure 1: Construction pipeline of our Oracle-P15K. Step-1: We focus on four types of noise in OBI rubbings and take them as targets for source content sampling. Step-2: OBI experts are invited to create glyph images manually. Step-3: We conduct a post-quality examination to ensure the reliability and alignment of OBI image pairs.
  • Figure 2: Comparison of the OBI generation results on the Oracle-P15K dataset. The red, green, and blue boxes denote the glyph, style, and generated images, respectively. Three evaluation settings are considered. (1) Few-shot: the characters in the training and validation sets are the same. (2) Zero-shot: the characters in the training and test sets do not overlap. (3) Personalized: the glyph and style images are not aligned.
  • Figure 3: Architecture of our OBIDiff.
  • Figure 4: Feature distribution comparisons among generated images from baseline methods and our OBIDiff.
  • Figure 5: Recognition accuracy of augmented rare OBI images w.r.t. style image at different scales.
  • ...and 9 more figures