ShiftedBronzes: Benchmarking and Analysis of Domain Fine-Grained Classification in Open-World Settings

Rixin Zhou, Honglin Pang, Qian Zhang, Ruihua Qi, Xi Yang, Chuntao Li

TL;DR

This paper tackles open-world fine-grained classification in archaeology by introducing ShiftedBronzes, a benchmark for bronze ware dating that captures distribution shifts between in-distribution data (Ding/Gui) and seven OOD data types, including transferred data that simulates realistic domain challenges. It evaluates six FGVC methods for dating and eighteen OOD detection methods across post-hoc, VLM-based, and generation-based families, showing that pre-trained Vision-Language Model approaches generally outperform the others, with ID-like prompting in VLMs yielding robust results. The work shows that domain-specific knowledge and the handling of subtle, domain-relevant distribution shifts are crucial for effective OOD detection and dating in this field, and it analyzes how different OOD strategies respond to specialized data. ShiftedBronzes is intended as a resource for archaeology-centric FGVC and domain-aware OOD detection research; the dataset and code are to be released later.

Abstract

In real-world applications across specialized domains, addressing complex out-of-distribution (OOD) challenges is a common and significant concern. In this study, we concentrate on the task of fine-grained bronze ware dating, a critical aspect of the study of ancient Chinese history, and develop a benchmark dataset named ShiftedBronzes. By extensively expanding the bronze Ding dataset, ShiftedBronzes incorporates two types of bronze ware data and seven types of OOD data, which exhibit distribution shifts commonly encountered in bronze ware dating scenarios. We conduct benchmarking experiments on ShiftedBronzes and five commonly used general OOD datasets, employing a variety of widely adopted post-hoc, pre-trained Vision Large Model (VLM)-based, and generation-based OOD detection methods. Through analysis of the experimental results, we validate previous conclusions regarding post-hoc, VLM-based, and generation-based methods, while also highlighting their distinct behaviors on specialized datasets. These findings underscore the unique challenges of applying general OOD detection methods to domain-specific tasks such as bronze ware dating. We hope that the ShiftedBronzes benchmark provides valuable insights into both the field of bronze ware dating and the development of OOD detection methods. The dataset and associated code will be available later.

Paper Structure

This paper contains 27 sections, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Examples of our proposed dataset and a general OOD dataset (OpenImage-O wang2022vim). Our dataset includes 2 types of ID data (a-b) and 7 types of OOD data (c-i). (a) and (b) are typical images of bronze Ding and Gui from four dynasties (Shang, Western Zhou, Spring and Autumn, Warring States), which together form the ID data for the bronze ware dating task. (c-f) are sketch images and rubbing images of Ding and Gui. (g) and (h) are images generated by applying a zero-shot material transfer model to bronze ware and container images. (i) are container images sourced from the ImageNet-21K ridnik2021imagenetk dataset. (j) are examples from OpenImage-O. In terms of bronze ware dating scenarios, the distribution shift between the ID data and the OOD data increases from left to right within the green region.
  • Figure 2: (a) Statistics of expert-annotated knowledge within the bronze ware dataset. (b) Statistics of categories within the container data. (c) Feature visualization of the different image types within ShiftedBronzes. Features are extracted using ResNet-50 (left) and ViT-B-16 (right) models, both pre-trained on the ImageNet-1K dataset, followed by dimensionality reduction via t-SNE.
  • Figure 3: The detailed process for collecting container data.
  • Figure 4: The generation process of transferred container and transferred bronze data using the zero-shot material transfer model ZeST cheng2025zest.
  • Figure 5: The comparison of FPR@95 performance of 18 OOD detection methods on the OOD data in ShiftedBronzes and five general OOD datasets. The black dashed line represents the average performance across all OOD data. The green dashed line indicates the average performance on the hard OOD data (container, transferred bronze, transferred container, sketch and rubbing). The red dashed line indicates the average performance on the easy OOD data (Species, ImageNet-O, iNaturalist, Texture, OpenImage-O).
  • ...and 2 more figures
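
The FPR@95 metric reported in Figure 5 is commonly defined as the false positive rate on OOD samples at the score threshold that still accepts 95% of ID samples. The sketch below illustrates that common definition only; it is not taken from the paper's code. The function name `fpr_at_95_tpr` and the convention that a higher score means "more in-distribution" are assumptions made for illustration.

```python
import math


def fpr_at_95_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted as ID at the threshold that keeps
    at least `tpr` of the ID samples.

    Assumption: a higher score means the detector considers the sample
    more likely to be in-distribution.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")

    # Rank ID scores from most to least confident.
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of top-ranked ID samples that reaches the target TPR.
    k = math.ceil(tpr * len(ranked))
    threshold = ranked[k - 1]

    # OOD samples scoring at or above the threshold are false positives.
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    id_scores = [0.91, 0.88, 0.95, 0.73, 0.84, 0.97, 0.79, 0.90, 0.86, 0.93]
    ood_scores = [0.35, 0.81, 0.42, 0.76, 0.15]
    print(fpr_at_95_tpr(id_scores, ood_scores))
```

Lower values are better: a detector that separates ID from OOD scores perfectly reaches 0, while one whose OOD scores all exceed the ID threshold reaches 1.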