RBF Weighted Hyper-Involution for RGB-D Object Detection

Mehfuz A Rahman; Khushal Das; Jiju Poovvancheri; Neil London; Dong Chen

TL;DR

A real-time two-stream RGB-D object detection model that introduces two components: a radial basis function (RBF) weighted, depth-based hyper-involution that adapts to spatial interaction patterns in raw depth maps, and an up-sampling based trainable fusion layer that combines the extracted depth and color image features without obstructing information transfer between them.

Abstract

The vast majority of augmented reality devices come equipped with depth and color cameras. Despite the advantages of having both sensors, extracting photometric and depth features simultaneously in real time remains challenging because of inherent differences between depth and color images. Furthermore, standard convolution operations are insufficient for extracting information directly from raw depth images, leading to inefficient intermediate representations. To address these issues, we propose a real-time two-stream RGB-D object detection model. Our model introduces two new components: a radial basis function (RBF) weighted, depth-based hyper-involution that adapts to spatial interaction patterns in raw depth maps, and an up-sampling based trainable fusion layer that combines the extracted depth and color image features without obstructing information transfer between them. Experimental results demonstrate that the proposed approach achieves the strongest performance among existing RGB-D 2D object detection methods on NYU Depth V2, while remaining competitive on the SUN RGB-D benchmark.
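To make the idea of RBF weighting on depth concrete, the following is a minimal sketch, not the paper's formulation. It assumes a Gaussian RBF of the depth difference between each neighbour and the centre pixel, and it uses a fixed `kernel` as a stand-in for the filter that the model's hyper-network would generate. The function names and the bandwidth `sigma` are hypothetical.

```python
import math


def rbf_depth_weights(depth_patch, sigma=0.5):
    """Gaussian RBF weights for a K x K depth patch, relative to its centre pixel.

    depth_patch: list of K rows, each a list of K depth values (K odd).
    sigma: hypothetical bandwidth; larger values tolerate bigger depth gaps.
    """
    k = len(depth_patch)
    centre = depth_patch[k // 2][k // 2]
    return [
        [math.exp(-((d - centre) ** 2) / (2.0 * sigma ** 2)) for d in row]
        for row in depth_patch
    ]


def depth_weighted_response(feature_patch, kernel, depth_patch, sigma=0.5):
    """Sum of feature * kernel * RBF weight over one K x K neighbourhood.

    kernel is an assumed stand-in for a generated filter; here it is just
    a fixed K x K list of numbers supplied by the caller.
    """
    weights = rbf_depth_weights(depth_patch, sigma)
    return sum(
        f * w_k * w_d
        for f_row, k_row, d_row in zip(feature_patch, kernel, weights)
        for f, w_k, w_d in zip(f_row, k_row, d_row)
    )


# A 3 x 3 patch whose right column lies 2 depth units behind the centre.
depth = [[1.0, 1.0, 3.0] for _ in range(3)]
ones = [[1.0, 1.0, 1.0] for _ in range(3)]

weights = rbf_depth_weights(depth)
response = depth_weighted_response(ones, ones, depth)
```

In this toy patch, neighbours at the same depth as the centre keep full weight, while the far column is suppressed almost entirely, so the response is dominated by pixels that are likely to belong to the same surface.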

Paper Structure (21 sections, 4 equations, 15 figures, 9 tables)

Introduction
Related Work
Detection based on HHA Formats
Detection using Raw Depth Maps
Alternatives to Standard Convolution
The Model
Two Streams Architecture
Depth Aware Hyper-involution
Fusion Stage
Implementation Details
Experiments
Performance on Benchmark Datasets
Outdoor RGB-D Dataset
Performance on Outdoor RGB-D Dataset
Feature Maps Analysis
...and 6 more sections

Figures (15)

Figure 1: A few instances where the usefulness of depth for object detection is visible. Image courtesy: Silberman:ECCV12, polseno_2020, ranftl2021vision, rankuzz.com_2020, fun_dog_tiger
Figure 2: The proposed two-stream, single-stage detection architecture for real-time applications.
Figure 3: Depth Aware Hyper-involution: Depth similarity creates a depth-aware filter, with a hyper-network generating weights for each image region.
Figure 4: Difference between pixels of RGB image and its corresponding depth map.
Figure 5: The filter generation hyper-network learns filter weights for each RGB pixel individually.
...and 10 more figures
