Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model

Rui Xing, Runmin Cong, Yingying Wu, Can Wang, Zhongming Tang, Fen Wang, Hao Wu, Sam Kwong

TL;DR

The paper addresses the lack of standardized data and robust methods for ancient plant seed classification by introducing the Ancient Plant Seed Image Classification (APS) dataset and a dedicated network, APSNet. APSNet combines a Size Perception and Embedding (SPE) module to encode seed size as a discriminative prior with an Asynchronous Decoupled Decoding (ADD) architecture to separately leverage channel and spatial information, including a supervised contrastive objective. Empirical results show APSNet achieving 90.2–90.5% accuracy on APS, outperforming 28 baselines across long-tail and fine-grained settings, and qualitative analyses (UMAP, Grad-CAM) corroborate improved separability and localization of seed features. This work provides a first benchmark and a scalable framework to propel archaeobotanical research through data-driven seed classification and size-aware feature learning.
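The summary above mentions a supervised contrastive objective but does not state its form. As a rough illustration only, the sketch below implements the widely used generic supervised contrastive (SupCon) loss in NumPy. It is not taken from the paper: the function name `supervised_contrastive_loss`, the `temperature` default, and the batch layout are assumptions, and APSNet's actual loss may differ.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic SupCon loss over one batch (illustrative, not APSNet's code).

    features: (N, D) array of non-zero embeddings.
    labels:   (N,) array of integer class ids.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = len(labels)

    # Positives are other samples in the batch that share the anchor's label.
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        # No anchor has a positive pair, so the loss is defined as zero here.
        return 0.0

    # L2-normalise so dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # Log-softmax over all other samples (self excluded), computed stably.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Mean log-probability of the positives for each anchor that has any.
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos]
    mean_log_prob_pos = pos_log_prob / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())
```

With this formulation, embeddings that cluster by class give a lower loss than the same embeddings paired with labels that cut across the clusters, which is the separability effect the UMAP analysis is said to show.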

Abstract

Understanding the dietary preferences of ancient societies and their evolution across periods and regions is crucial for revealing human-environment interactions. Seeds, as important archaeological artifacts, represent a fundamental subject of archaeobotanical research. However, traditional studies rely heavily on expert knowledge, which is often time-consuming and inefficient. Intelligent analysis methods have made progress in various fields of archaeology, but there remains a research gap in data and methods in archaeobotany, especially in the classification task of ancient plant seeds. To address this, we construct the first Ancient Plant Seed Image Classification (APS) dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In addition, we design a framework specifically for the ancient plant seed classification task (APSNet), which introduces the scale feature (size) of seeds based on learning fine-grained information to guide the network in discovering key "evidence" for sufficient classification. Specifically, we design a Size Perception and Embedding (SPE) module in the encoder part to explicitly extract size information for the purpose of complementing fine-grained information. We propose an Asynchronous Decoupled Decoding (ADD) architecture based on traditional progressive learning to decode features from both channel and spatial perspectives, enabling efficient learning of discriminative features. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%. This demonstrates that our work provides an effective tool for large-scale, systematic archaeological research.

Paper Structure

This paper contains 29 sections, 11 equations, 11 figures, 9 tables.

Figures (11)

  • Figure 1: Spatial distribution of ancient plant seeds. The figure shows the spatial distribution of the collected categories, covering most regions of ancient China. Stars indicate the locations of archaeological sites, and the seed colors correspond to the temporal distribution shown in Figure 2.
  • Figure 2: Temporal distribution of ancient plant seeds. The figure shows the temporal distribution of the ruins to which the different seeds belong, spanning from 5400 BCE to 220 CE. BCE denotes “Before Common Era” and CE denotes “Common Era”.
  • Figure 3: Comparison of the size and damage of ancient plant seeds and the condition of modern seeds. We divide the seeds into four sizes, where each has three columns of seed images. The first column shows the true seed size under 1.6$\times$ magnification, the second column shows seeds with pronounced differences caused by uncontrollable factors, and the third column shows the corresponding modern seeds.
  • Figure 4: Visualization of the statistical distribution of the APS dataset. The x-axis represents the class and the y-axis represents the number of samples.
  • Figure 5: Visualization of feature distribution of the dataset. The x- and y-axes represent spatial positions.
  • ...and 6 more figures