DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions

Rafael Berral-Soler, Rafael Muñoz-Salinas, Rafael Medina-Carnicer, Manuel J. Marín-Jiménez

TL;DR

DeepArUco++ is a deep learning-based framework that leverages the robustness of Convolutional Neural Networks to perform marker detection and decoding in challenging lighting conditions. It outperforms other state-of-the-art methods in such tasks and remains competitive even when tested on the datasets used to develop those methods.

Abstract

Fiducial markers are a computer vision tool used for object pose estimation and detection. These markers are highly useful in fields such as industry, medicine and logistics. However, optimal lighting conditions are not always available, and other factors such as blur or sensor noise can affect image quality. Classical computer vision techniques that precisely locate and decode fiducial markers often fail under difficult illumination conditions (e.g. extreme variations of lighting within the same frame). Hence, we propose DeepArUco++, a deep learning-based framework that leverages the robustness of Convolutional Neural Networks to perform marker detection and decoding in challenging lighting conditions. The framework is based on a pipeline using a different Neural Network model at each step, namely marker detection, corner refinement and marker decoding. Additionally, we propose a simple method for generating synthetic data for training the different models that compose the proposed pipeline, and we present a second, real-life dataset of ArUco markers in challenging lighting conditions used to evaluate our system. The developed method outperforms other state-of-the-art methods in such tasks and remains competitive even when tested on the datasets used to develop those methods. Code available on GitHub: https://github.com/AVAuco/deeparuco/

Paper Structure

This paper contains 26 sections, 3 equations, 19 figures, 5 tables.

Figures (19)

  • Figure 1: Main: DeepArUco++ framework. The marker detector receives as input a color image and returns a series of bounding boxes, which are used to obtain crops from the input image. Then, the corner regressor model is applied over each crop to refine the position of the corners. The detected markers are then rectified with the refined corners and used as input for the marker decoder. Finally, the marker is assigned the ID with the least Hamming distance w.r.t. the decoded bits. (Best viewed in digital format)
  • Figure 2: Some ArUco marker examples. The type of fiducial marker used in this work. The bits can be read from left to right, top to bottom (black means 0, white means 1), discarding the outer white and black border (the two outermost "rings") to produce a sequence of $36$ bits ($6 \times 6$), which is compared against the ArUco dictionary to obtain the encoded ID. (Best viewed in digital format)
  • Figure 3: Flying-ArUco v2 dataset. Using images from the COCO dataset as background, challenging training samples are created by overlapping markers with varying poses, sizes, and positions in the image. Top: detection dataset. Bottom: refinement/decoding crops. (Best viewed in digital format)
  • Figure 4: Shadow-ArUco dataset. A video containing complex lighting patterns is projected on the corner of a room, over whose walls multiple patterns have been attached; the complex lighting makes marker detection difficult for classical methods. The scene is recorded from multiple fixed camera positions. (Best viewed in digital format)
  • Figure 5: DeepTag ArUco dataset. A single ArUco marker is photographed in a fixed environment at varying distances and orientations w.r.t. the camera. (Best viewed in digital format)
  • ...and 14 more figures
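
The captions of Figures 1 and 2 describe the last step of the pipeline: the payload bits are read row by row after discarding the two outer rings, and the marker is assigned the ID whose dictionary entry has the least Hamming distance to the decoded bits. The following is a minimal sketch of that matching step only, not the authors' implementation. It assumes the marker has already been rectified and reduced to a $10 \times 10$ grid of binary cells; `toy_dictionary`, `render_cells`, `read_bits` and `nearest_id` are hypothetical names introduced for illustration, and the four codes are made up rather than taken from the real ArUco dictionary. Handling of marker rotation is omitted.

```python
import numpy as np

def read_bits(cells: np.ndarray, border: int = 2) -> np.ndarray:
    """Drop the `border` outermost rings and flatten the payload
    left to right, top to bottom (0 = black, 1 = white)."""
    inner = cells[border:-border, border:-border]
    return inner.reshape(-1).astype(np.uint8)

def nearest_id(bits: np.ndarray, dictionary: np.ndarray) -> tuple[int, int]:
    """Return (index, distance) of the dictionary row with the least
    Hamming distance to `bits`."""
    distances = np.count_nonzero(dictionary != bits, axis=1)
    best = int(np.argmin(distances))
    return best, int(distances[best])

def render_cells(code: np.ndarray) -> np.ndarray:
    """Build a 10x10 cell grid: white outer ring, black inner ring,
    then the 6x6 payload (illustrative helper, not part of the paper)."""
    cells = np.ones((10, 10), dtype=np.uint8)   # white outer ring
    cells[1:-1, 1:-1] = 0                       # black inner ring
    cells[2:-2, 2:-2] = code.reshape(6, 6)      # 36 payload bits
    return cells

# Assumption: a made-up 4-entry dictionary of 36-bit codes whose
# pairwise Hamming distances are all at least 18.
toy_dictionary = np.zeros((4, 36), dtype=np.uint8)
toy_dictionary[1, :18] = 1      # first half white
toy_dictionary[2, 1::2] = 1     # alternating cells
toy_dictionary[3, :] = 1        # all white

cells = render_cells(toy_dictionary[2])
bits = read_bits(cells)

# Simulate two decoding errors, as might occur under poor lighting.
noisy = bits.copy()
noisy[[0, 17]] ^= 1

marker_id, distance = nearest_id(noisy, toy_dictionary)
print(marker_id, distance)  # 2 2
```

Because the toy codes are far apart, two flipped bits still resolve to the correct entry. A practical decoder would presumably also reject matches whose distance exceeds some threshold, which this sketch leaves out.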