Relation Networks for Object Detection

Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei

TL;DR

This work introduces an object relation module that jointly reasons over a set of object proposals by combining appearance similarities with a translation-invariant geometric weight to model inter-object relations. Integrated into region-based detectors, it enhances instance recognition and replaces heuristic NMS with a learnable duplicate removal network, enabling end-to-end training. Extensive ablations on COCO demonstrate consistent gains across backbones and detectors, with the geometric weight and multiple relation components providing notable improvements. The approach offers a lightweight, plug-in building block that advances end-to-end object detection by exploiting object–object relations without requiring extra supervision.

Abstract

Although it has long been believed that modeling relations between objects would help object recognition, there has been no evidence that the idea works in the deep learning era. All state-of-the-art object detection systems still rely on recognizing object instances individually, without exploiting their relations during learning. This work proposes an object relation module. It processes a set of objects simultaneously through interaction between their appearance features and geometry, thus allowing their relations to be modeled. It is lightweight and in-place. It does not require additional supervision and is easy to embed in existing networks. It is shown to be effective at improving the object recognition and duplicate removal steps in the modern object detection pipeline. It verifies the efficacy of modeling object relations in CNN-based detection. It gives rise to the first fully end-to-end object detector.
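To make the interaction between appearance and geometry concrete, here is a minimal NumPy sketch of a single relation head. It assumes an attention-style formulation in which a non-negative geometric weight multiplies the exponentiated appearance similarity before normalization. The function names (`box_geometry`, `relation_head`), the dimensions, and the random weights are illustrative assumptions, and the geometry features are projected linearly here rather than through any higher-dimensional embedding, so this is a simplified illustration and not the authors' implementation.

```python
import numpy as np

def box_geometry(boxes):
    """Pairwise relative geometry in log space.

    boxes: (N, 4) array of (x_center, y_center, width, height).
    Returns (N, N, 4); entry [n, m] describes box m relative to box n.
    Only coordinate differences and size ratios are used, so the result
    does not change when every box is shifted by the same offset.
    """
    x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    eps = 1e-3  # avoids log(0) for a box paired with itself
    dx = np.log(np.maximum(np.abs(x[None, :] - x[:, None]) / w[:, None], eps))
    dy = np.log(np.maximum(np.abs(y[None, :] - y[:, None]) / h[:, None], eps))
    dw = np.log(w[None, :] / w[:, None])
    dh = np.log(h[None, :] / h[:, None])
    return np.stack([dx, dy, dw, dh], axis=-1)

def relation_head(f_app, boxes, W_q, W_k, W_v, w_g):
    """One relation head (hypothetical parameter names).

    f_app: (N, d) appearance features; boxes: (N, 4).
    Returns the (N, d_out) relation features and the (N, N) relation weights.
    """
    # Geometric weight: linear projection + ReLU (assumed simplification).
    w_geo = np.maximum(box_geometry(boxes) @ w_g, 0.0)           # (N, N)

    # Appearance weight: scaled dot product between projected features.
    q, k = f_app @ W_q, f_app @ W_k                               # (N, d_k)
    w_app = q @ k.T / np.sqrt(q.shape[1])                         # (N, N)

    # Geometry-gated softmax over the other objects m for each object n.
    w_app = w_app - w_app.max(axis=1, keepdims=True)              # stability
    unnorm = w_geo * np.exp(w_app)
    denom = unnorm.sum(axis=1, keepdims=True)
    weights = np.divide(unnorm, denom, out=np.zeros_like(unnorm),
                        where=denom > 0)

    # Relation feature: weighted sum of projected appearance features.
    return weights @ (f_app @ W_v), weights

# Toy usage with random inputs (all values are made up for illustration).
rng = np.random.default_rng(0)
N, d, d_k = 5, 16, 8
f_app = rng.normal(size=(N, d))
boxes = np.column_stack([
    rng.uniform(0, 100, N), rng.uniform(0, 100, N),   # centers
    rng.uniform(10, 50, N), rng.uniform(10, 50, N),   # widths, heights
])
W_q = 0.1 * rng.normal(size=(d, d_k))
W_k = 0.1 * rng.normal(size=(d, d_k))
W_v = 0.1 * rng.normal(size=(d, d))
w_g = rng.normal(size=4)

relation, weights = relation_head(f_app, boxes, W_q, W_k, W_v, w_g)
enhanced = f_app + relation  # in-place style: output keeps the input shape
```

Because the output has the same shape as the input features, a module of this kind can be dropped between existing layers without changing the rest of the network, which is what "in-place" suggests in the abstract.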

Paper Structure

This paper contains 14 sections, 8 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Current state-of-the-art object detectors are based on a four-step pipeline. Our object relation module (illustrated as red dashed boxes) can be conveniently adopted to improve both instance recognition and duplicate removal steps, resulting in an end-to-end object detector.
  • Figure 2: Left: object relation module as in Eq. (\ref{eq.object_relation_module_ensemble}); Right: relation feature computation as in Eq. (\ref{eq.object_relation_module}).
  • Figure 3: Illustration of enhanced 2fc head (a) and duplicate classification network (b) by object relation modules.
  • Figure 4: Representative examples with high relation weights in Eq. (\ref{eq.object_relation_weight}). The reference object $n$ is blue. The other objects contributing a high weight (shown on the top-left) are yellow.