Indoor Obstacle Discovery on Reflective Ground via Monocular Camera

Feng Xue; Yicong Chang; Tianxi Wang; Yu Zhou; Anlong Ming

TL;DR

This work tackles indoor obstacle discovery on reflective floors using a monocular camera. It introduces a pre-calibration based ground detection scheme that estimates the ground plane robustly despite reflections, and a ground-pixel parallax cue that, combined with appearance features, discriminates real obstacles from their reflections. An appearance-geometry fusion model (AGFM) re-scores region proposals by jointly using geometric parallax and appearance cues, and a weight-decayed map yields more complete obstacle segmentation. The authors also contribute the Obstacle on Reflective Ground (ORG) dataset, which supports pixel-level and instance-level evaluation across challenging reflective scenes. The reported results show fewer misdetections caused by reflections and robustness to motion blur and motion noise, which matters for safe indoor navigation with inexpensive monocular cameras.

Abstract

Visual obstacle discovery is a key step towards autonomous navigation of indoor mobile robots. Existing solutions work well in many scenes, but reflective ground is an exception. In this case, the reflections on the floor resemble the real world, which confuses obstacle discovery and causes navigation to fail. We argue that the key to this problem lies in obtaining features that discriminate between reflections and obstacles. Note that an obstacle and its reflection are separated by the ground plane in 3D space. Building on this observation, we first introduce a pre-calibration based ground detection scheme that uses robot motion to predict the ground plane. Because robot motion is immune to reflection, this scheme avoids the ground detection failures that reflections cause. Given the detected ground, we design a ground-pixel parallax to describe the location of a pixel relative to the ground. Based on this, a unified appearance-geometry feature representation is proposed to describe objects inside rectangular boxes. Finally, within the segmenting-by-detection framework, an appearance-geometry fusion regressor is designed to use the proposed feature to discover obstacles. It also prevents our model from concentrating too much on parts of obstacles instead of whole obstacles. For evaluation, we introduce a new dataset for Obstacle on Reflective Ground (ORG), which comprises 15 scenes with various ground reflections, with more than 200 image sequences and 3400 RGB images in total. The pixel-wise annotations of ground and obstacles enable a comparison between our method and other methods. By reducing the misdetection of reflections, the proposed approach outperforms the others. The source code and the dataset will be available at https://github.com/XuefengBUPT/IndoorObstacleDiscovery-RG.
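The idea that the ground plane separates obstacles from reflections can be illustrated with the classic plane-plus-parallax decomposition. The sketch below is not the authors' implementation and may differ from the paper's exact definition of ground-pixel parallax. It assumes a known intrinsic matrix K, a relative camera motion X2 = R X1 + t (for example derived from odometry), and a ground plane n . X = d expressed in the first camera frame. All function names and numeric values are hypothetical.

```python
import numpy as np

def ground_homography(K, R, t, n, d):
    """Plane-induced homography mapping frame-1 pixels to frame-2 pixels.

    Assumed conventions: X2 = R @ X1 + t, and ground points satisfy
    n . X1 = d in the first camera frame.
    """
    return K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)

def signed_parallax(x1, x2, H, K, t):
    """Solve x2 ~ H x1 + rho * e for rho, where e = K t.

    rho is 0 for points on the plane. With n pointing from the camera
    towards the ground, rho > 0 on the camera side of the plane and
    rho < 0 on the far side. Undefined if x2 coincides with the epipole.
    """
    p1 = np.array([x1[0], x1[1], 1.0])
    p2 = np.array([x2[0], x2[1], 1.0])
    warped = H @ p1              # where the pixel would land if it lay on the ground
    e = K @ t                    # epipole direction in homogeneous pixels
    a = np.cross(p2, warped)
    b = np.cross(p2, e)
    return -float(a @ b) / float(b @ b)

def project(K, X):
    """Pinhole projection of a 3D point in camera coordinates."""
    x = K @ X
    return x[:2] / x[2]

# Hypothetical setup: camera 1 m above the floor, y axis pointing down,
# robot driving 0.2 m straight ahead between the two frames.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, -0.2])
n = np.array([0.0, 1.0, 0.0])
d = 1.0
H = ground_homography(K, R, t, n, d)

def parallax_of(X1):
    """Project a synthetic 3D point into both frames and measure its parallax."""
    X2 = R @ X1 + t
    return signed_parallax(project(K, X1), project(K, X2), H, K, t)

rho_ground = parallax_of(np.array([0.5, 1.0, 4.0]))  # on the floor
rho_above = parallax_of(np.array([0.5, 0.7, 4.0]))   # 0.3 m above the floor
rho_below = parallax_of(np.array([0.5, 1.3, 4.0]))   # mirrored 0.3 m below
```

In this synthetic setup the point on the floor has zero parallax, while the point above the floor and its mirrored counterpart below the floor receive parallax values of opposite sign. That sign is the kind of geometric cue that appearance alone cannot provide.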

Paper Structure (35 sections, 14 equations, 21 figures, 8 tables)

Introduction
Related Work
The Method Segmenting Obstacle by Detection
Occlusion Edge and Region Proposal
Homography of Planar Surface
Method
Overview
Ground Plane Detection via Pre-calibration
Ground-Pixel Parallax of Occlusion Edge Point
Appearance-Geometry Feature Representation
Appearance-Geometry Fusion Model
Model Structure
Training Data
Prediction
Weight-decayed Scheme for Obstacle-occupied Map
...and 20 more sections

Figures (21)

Figure 1: Results of BiSeNet, FCN, Xue et al. (ICRA), and our method on an example scene from the proposed dataset. True positives are marked in green, false positives in red, and false negatives in blue. Yellow boxes mark the misclassified pixels. Magenta circles indicate the reflection.
Figure 2: The result obtained by the previous method (ICRA). (a) The occlusion edge map. (b) The region proposals with high scores (marked with red boxes). (c) The obstacle-occupied probability map constructed from these high-score boxes. (d) The final obstacle masks (obtained by thresholding at 0.49). In the grayscale images (a) and (c), each pixel value ranges from 0 to 1; darker pixels are closer to 1.
Figure 3: The pipeline of our method. The inputs are consecutive RGB images and the robot pose corresponding to the two images. I-IV are the byproducts. V is the output, an obstacle-occupied probability map. In I and V, each pixel value ranges from 0 to 1; darker pixels are closer to 1. In III, the points are divided into two types by thresholding: red points are above the ground and green points are below it.
Figure 4: (a) The case in which the observed point is above the ground. (b) The case in which the observed point is below the ground. In each case, the points are shown zoomed in on the right for a clearer view.
Figure 5: Example images and pixel-level annotations from the proposed dataset. The floor is marked in yellow and obstacles in green. The images are zoomed in to show the obstacles clearly. The right-side image shows our platform, a Kobuki robot that provides the odometer data.
...and 16 more figures
