DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

Yi-Xin Huang; Hou-I Liu; Hong-Han Shuai; Wen-Huang Cheng

TL;DR

This work presents DQ-DETR, a simple yet effective model for tiny object detection that consists of three components: a categorical counting module, counting-guided feature enhancement, and dynamic query selection.

Abstract

Although previous DETR-like methods perform well in generic object detection, tiny object detection remains challenging for them because the positional information of object queries is not customized for tiny objects, whose scale is far smaller than that of general objects. In addition, the fixed number of queries used by DETR-like methods makes them unsuitable for aerial datasets, which contain only tiny objects and in which the number of instances is imbalanced across images. We therefore present a simple yet effective model, named DQ-DETR, which consists of three components that address these problems: a categorical counting module, counting-guided feature enhancement, and dynamic query selection. DQ-DETR uses the prediction and density maps from the categorical counting module to dynamically adjust the number of object queries and to improve the positional information of the queries. DQ-DETR outperforms previous CNN-based and DETR-like methods, achieving a state-of-the-art mAP of 30.2% on the AI-TOD-V2 dataset, which consists mostly of tiny objects. Our code will be available at https://github.com/hoiliu-0801/DQ-DETR.
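
As a rough illustration of the dynamic query idea described in the abstract, the sketch below maps a predicted count level to a query budget and keeps the top-scoring feature positions as queries. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of plain Python lists, and the budget values in `QUERIES_PER_LEVEL` are hypothetical and chosen only for illustration; the four count levels follow the Figure 1 caption.

```python
from typing import List, Sequence

# Hypothetical query budgets, one per count level (fewest to most instances).
# Placeholder values for illustration; they are not stated in the text above.
QUERIES_PER_LEVEL = (300, 500, 900, 1500)


def predict_count_level(logits: Sequence[float]) -> int:
    """Return the index of the largest logit (argmax over the count levels)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def select_queries(
    scores: Sequence[float],
    level: int,
    budgets: Sequence[int] = QUERIES_PER_LEVEL,
) -> List[int]:
    """Return indices of the top-K scoring positions, with K set by the count level."""
    # Never request more queries than there are candidate positions.
    k = min(budgets[level], len(scores))
    # Sort candidate positions by score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    level = predict_count_level([0.1, 2.0, 0.3, 0.0])
    chosen = select_queries([0.1, 0.9, 0.5, 0.7, 0.3], level, budgets=(2, 3, 4, 5))
    print(level, chosen)
```

In this sketch, an image predicted to be sparse receives a small query budget and a dense one receives a larger budget, which is the behavior a fixed query count cannot provide.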

Paper Structure (32 sections, 8 equations, 2 figures, 11 tables, 1 algorithm)

Introduction
Related work
Tiny Object Detection
DETR-like Methods
Method
Overview
Unflattening of Encoder's Feature Map
Categorical Counting Module
Density Extractor.
Counting Number Classification.
Counting-Guided Feature Enhancement Module (CGFE)
Spatial cross-attention map.
Channel attention map.
Dynamic Query Selection
Number of queries.
...and 17 more sections

Figures (2)

Figure 1: The overall architecture of our method. (a) Categorical Counting Module, which classifies the number of instances in images into 4 levels. (b) Counting-Guided Feature Enhancement, which refines the encoder's visual feature with a density map. (c) Dynamic Query Selection, which dynamically adjusts the number of queries and enhances the content and position of queries.
Figure B1: Visualization of detection results and feature maps. The green, red, and blue boxes represent TP, FP, and FN, respectively.
