Enhancing Human Pose Estimation in Ancient Vase Paintings via Perceptually-grounded Style Transfer Learning

Prathmesh Madhu; Angel Villar-Corrales; Ronak Kosti; Torsten Bendschus; Corinna Reinhardt; Peter Bell; Andreas Maier; Vincent Christlein

TL;DR

This work tackles the poor cross-domain generalisation of human pose estimation to ancient Greek vase paintings. It introduces a two-stage approach: first, a perceptually grounded, AdaIN-based style transfer creates the Styled-COCO-Persons (SCP) dataset, which mimics the style of vase paintings; second, the model is fine-tuned on ClassArch (CA), a small dataset of vase paintings with pose annotations. Training on SCP improves pose estimation on unlabelled data by more than 6% in both mAP and mAR, and fine-tuning on CA yields further gains. Ablations indicate that the model learns generic domain styles and benefits from the perceptual loss. The authors also demonstrate pose-based image retrieval in art collections, highlighting the practical value for cultural heritage analytics with minimal labelling.

Abstract

Human pose estimation (HPE) is central to understanding the visual narration and body movements of characters depicted in artwork collections, such as Greek vase paintings. Unfortunately, existing HPE methods do not generalise well across domains, resulting in poorly recognised poses. We therefore propose a two-step approach. (1) We adapt a dataset of natural images with known person and pose annotations to the style of Greek vase paintings by means of image style transfer, introducing a perceptually-grounded style transfer training to enforce perceptual consistency, and then fine-tune the base model on this newly created dataset. We show that style-transfer learning significantly improves state-of-the-art (SOTA) performance on unlabelled data by more than 6% mean average precision (mAP) as well as mean average recall (mAR). (2) To improve these already strong results further, we created a small dataset (ClassArch) consisting of ancient Greek vase paintings from the 6th to 5th century BCE with person and pose annotations. We show that fine-tuning a style-transferred model on this data improves performance further. In a thorough ablation study, we give a targeted analysis of the influence of style intensities, revealing that the model learns generic domain styles. Additionally, we provide a pose-based image retrieval to demonstrate the effectiveness of our method.
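
The TL;DR names AdaIN (adaptive instance normalisation) as the basis of the style transfer, and the abstract refers to varying "style intensities". The following is a minimal sketch of the standard AdaIN operation on feature maps, not the authors' implementation. The function names `adain` and `stylize_features`, the `(C, H, W)` array layout, and the blending weight `alpha` used to illustrate a style intensity are assumptions made for this example.

```python
import numpy as np


def adain(content_feat, style_feat, eps=1e-5):
    """Standard AdaIN on feature maps of shape (C, H, W).

    Each content channel is normalised to zero mean and unit standard
    deviation, then rescaled to the mean and standard deviation of the
    corresponding style channel. Spatial sizes of the two inputs may differ.
    """
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps  # avoid /0
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return s_std * (content_feat - c_mean) / c_std + s_mean


def stylize_features(content_feat, style_feat, alpha=1.0):
    """Blend content features with their AdaIN-stylised version.

    alpha is a hypothetical style-intensity knob for this sketch:
    0.0 keeps the content features, 1.0 applies the full style statistics.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    stylised = adain(content_feat, style_feat)
    return (1.0 - alpha) * content_feat + alpha * stylised


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    content = rng.normal(size=(4, 8, 8))            # stand-in for photo features
    style = 3.0 * rng.normal(size=(4, 6, 6)) + 2.0  # stand-in for vase features
    half = stylize_features(content, style, alpha=0.5)
    print(half.shape)
```

In a complete pipeline the inputs would be encoder feature maps of a natural image and a vase painting, and the blended features would be passed through a trained decoder to produce the stylised image. Those components, and the perceptual loss used during training, are outside the scope of this sketch.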
