AR Overlay: Training Image Pose Estimation on Curved Surface in a Synthetic Way

Sining Huang, Yukun Song, Yixiao Kang, Chang Yu

TL;DR

This paper proposes a pipeline that detects several logo images simultaneously and requires only the original images as input, unlocking more effects in downstream fields such as Augmented Reality (AR).

Abstract

Pose estimation of 3D objects is one of the most essential tasks in spatial computing. Rigid transformations of arbitrary 3D objects are relatively hard to detect because varying environments introduce factors such as insufficient lighting or occlusion, whereas objects with pre-defined shapes are often easier to track thanks to their geometric constraints. Curved images, which have flexible dimensions but a confined shape, are a common target in 3D tracking. Traditionally, proprietary algorithms require specific curvature measurements in addition to the original flattened image to enable pose estimation for a single image target. In this paper, we propose a pipeline that detects several logo images simultaneously and requires only the original images as input, unlocking more effects in downstream fields such as Augmented Reality (AR).

Paper Structure

This paper contains 15 sections, 1 equation, 7 figures, 2 tables.

Figures (7)

  • Figure 1: AR Effects shown by ZapWorks’ single-image pose-estimation algorithm
  • Figure 2: Collection of 20 different target images (some are removed due to anonymization)
  • Figure 3: Sample of Cylinder Object Across Varied Backgrounds
  • Figure 4: Synthesized dataset consisting of paired images and corresponding data examples
  • Figure 5: Our pipeline for curved-image pose estimation. The pipeline contains a fine-tuned YOLOv8 network [yolo_2016], a self-trained CNN, a feature-matching algorithm using SIFT [sift], and a PnP [zhang_flexible_2000] algorithm for pose computation
  • ...and 2 more figures
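
A PnP solver works on 3D-2D point correspondences, so features matched on the flat target image must first be placed on the curved surface. The following is a minimal sketch of that lifting step, assuming the surface is a cylinder (as in Figure 3) whose radius and label arc are already known. The function name, its parameters, and the axis convention are illustrative assumptions, not the paper's implementation.

```python
import math


def flat_to_cylinder(u, v, width_px, height_px, radius, arc_angle, label_height):
    """Lift pixel (u, v) of a flat label image onto a cylinder surface.

    Assumptions (hypothetical, for illustration only):
    - the cylinder axis is the Y axis and the label is centred on the +Z side;
    - arc_angle (radians) is the angle the label spans around the cylinder;
    - label_height is the physical height of the label, in the units of radius.
    Returns the 3D point (x, y, z) in the cylinder's own coordinate frame.
    """
    # Horizontal pixel position becomes an angle around the axis,
    # with the image centre mapped to theta = 0.
    theta = (u / width_px - 0.5) * arc_angle
    x = radius * math.sin(theta)
    z = radius * math.cos(theta)
    # Vertical pixel position becomes height along the axis (image v grows downward).
    y = (0.5 - v / height_px) * label_height
    return (x, y, z)


# Usage: the centre of a 400x200 px label on a cylinder of radius 3.
centre = flat_to_cylinder(200, 100, 400, 200, radius=3.0,
                          arc_angle=math.pi, label_height=10.0)
```

The resulting 3D points, paired with the 2D locations of the same features in the camera frame, are the kind of input a PnP solver expects. Every lifted point lies exactly one radius away from the cylinder axis, which gives a quick sanity check.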