Faster Bounding Box Annotation for Object Detection in Indoor Scenes

Bishwo Adhikari, Jukka Peltomäki, Jussi Puura, Heikki Huttunen

TL;DR

A two-stage approach for rapid bounding box annotation of object detection datasets, together with a new indoor dataset on which several state-of-the-art deep-learning-based object detectors are trained and compared in terms of speed and accuracy.

Abstract

This paper proposes an approach for rapid bounding box annotation of object detection datasets. The procedure consists of two stages: in the first, a part of the dataset is annotated manually; in the second, a model trained on the first-stage annotations proposes annotations for the remaining samples. We experimentally study which first/second stage split minimizes the total workload. In addition, we introduce a new fully labeled object detection dataset collected from indoor scenes. Compared to other indoor datasets, our collection has more class categories, different backgrounds and lighting conditions, occlusion, and high intra-class differences. We train deep-learning-based object detectors with a number of state-of-the-art models and compare them in terms of speed and accuracy. The fully annotated dataset is released freely to the research community.
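
The trade-off behind the first/second stage split can be illustrated with a toy cost model. The sketch below is not the paper's formulation: the function names, the unit cost per manual annotation, and the decaying correction-cost curve are all hypothetical assumptions, chosen only to show why an intermediate split can need less work than annotating everything by hand.

```python
def total_workload(n_images, manual_fraction, manual_cost, correction_cost_fn):
    """Total effort: manual annotation of stage one plus correction of stage two."""
    n_manual = round(n_images * manual_fraction)
    n_proposed = n_images - n_manual
    return (n_manual * manual_cost
            + n_proposed * correction_cost_fn(manual_fraction))


def toy_correction_cost(manual_fraction, manual_cost=1.0, floor=0.2):
    """Hypothetical per-image cost of correcting a model proposal.

    Assumption (not from the paper): with no training data, correcting a
    proposal costs as much as annotating from scratch, and the cost decays
    toward floor * manual_cost as more data is annotated manually.
    """
    return manual_cost * (floor + (1.0 - floor) * (1.0 - manual_fraction) ** 4)


def best_split(n_images, fractions, manual_cost=1.0):
    """Return the candidate manual fraction with the lowest total workload."""
    return min(
        fractions,
        key=lambda f: total_workload(n_images, f, manual_cost, toy_correction_cost),
    )


if __name__ == "__main__":
    candidates = [i / 10 for i in range(11)]
    for f in candidates:
        work = total_workload(100, f, 1.0, toy_correction_cost)
        print(f"manual fraction {f:.1f}: workload {work:.1f}")
    print("best split under the toy model:", best_split(100, candidates))
```

Under these assumptions, both extremes (annotating nothing or everything manually) cost the same as full manual annotation, while intermediate splits cost less. The actual optimum depends on measured annotation and correction times, which the paper determines experimentally.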

Paper Structure

This paper contains 7 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Examples of object instances from the TUT indoor dataset.
  • Figure 2: Examples of object instances from a single class label (Chair). Intra-class differences between object instances are high in the TUT indoor dataset.
  • Figure 3: Distribution of the class categories in the TUT indoor dataset. There are altogether 4595 instances from 7 categories.
  • Figure 4: The workflow of our semi-automatic bounding box image annotation method. The dataset is split into two parts. The first part is manually annotated and used to train a model, which is then used to predict labels on the rest of the dataset. After manually correcting the predicted labels, the final fully annotated dataset is combined from the manually annotated and corrected subsets.
  • Figure 5: Workload required for different train-test splits (left). Proportion of annotation time needed to annotate the full dataset when different portions of manually annotated data are used to fine-tune the object detection model (right).