RTHDet: Rotate Table Area and Head Detection in images

Wenxing Hu, Minglei Tong

TL;DR

This work tackles the problem of detecting rotated tables and localizing their head and tail, introducing the TRR360D dataset and the R360 AP metric to evaluate both region and semantic localization. The proposed RTHDet extends a fast RTMDet-S baseline with a new $D_{r360}$ angle definition and an Angle Loss (AL) branch, enabling 360° rotation feature learning and accurate head-tail localization. Through transfer learning and adaptive boundary rotation augmentation, RTHDet achieves a substantial improvement in AP50($T<90$) from $23.7\%$ to $88.7\%$, validating its effectiveness for rotated table recognition tasks and integration into MMRotate. The results demonstrate practical potential for downstream OCR and table recognition pipelines, while highlighting remaining challenges for perspective distortions and arbitrary quadrilateral tables.

Abstract

Traditional models focus on horizontal table detection but struggle when tables are rotated, limiting progress in table recognition. This paper introduces a new task: detecting table regions and localizing head-tail parts in rotation scenarios. We propose corresponding datasets, evaluation metrics, and methods. Our novel method, 'Adaptively Bounded Rotation,' addresses dataset scarcity in detecting rotated tables and their head-tail parts. We produced 'TRR360D,' a dataset incorporating semantic information of table head and tail, based on 'ICDAR2019MTD.' A new metric, 'R360 AP,' measures precision in detecting rotated regions and localizing head-tail parts. Our baseline, the high-speed and accurate 'RTMDet-S,' was chosen after extensive review and testing. We introduce 'RTHDet,' which enhances the baseline with an 'r360' rotated rectangle angle representation and an 'Angle Loss' branch, improving head-tail localization. By applying transfer learning and adaptive boundary rotation augmentation, RTHDet's AP50 (T<90) improved from 23.7% to 88.7% compared to the baseline. This demonstrates RTHDet's effectiveness in detecting rotated table regions and accurately localizing head and tail parts. RTHDet is integrated into the widely used open-source MMRotate toolkit: https://github.com/open-mmlab/mmrotate/tree/dev-1.x/projects/RR360.

Paper Structure (23 sections, 17 equations, 17 figures, 3 tables, 2 algorithms)

Figures (17)

  • Figure 1: Table Detection Task: (a) Detection of Table Horizontal Regions (b) Detection of Table Rotation Regions (c) Detection of Table Head and Tail
  • Figure 2: $D_{oc}$: the old OpenCV definition. When $OpenCV < 4.5.1$, $angle \in [-90^\circ, 0^\circ)$, $\theta \in [-\frac{\pi}{2}, 0)$, and the angle between the width and the positive x-axis is a positive acute angle. Under this definition, the width and height edges swap as the angle changes. The definition comes from the cv2.minAreaRect function in OpenCV, which returns a value in the range $[-90^\circ, 0^\circ)$.
  • Figure 3: $D_{oc}'$: the new OpenCV definition. When $OpenCV \geq 4.5.1$, $angle \in (0^\circ, 90^\circ]$, $\theta \in (0, \frac{\pi}{2}]$, and the angle between the width and the positive x-axis is a positive acute angle. The definition comes from the cv2.minAreaRect function in OpenCV, which returns a value in the range $(0^\circ, 90^\circ]$.
  • Figure 4: $D_{le90}$: the Long Edge 90 definition, with $angle \in [-90^\circ, 90^\circ]$, $\theta \in [-\frac{\pi}{2}, \frac{\pi}{2}]$, and $width > height$.
  • Figure 5: $D_{le135}$: the Long Edge 135 definition, with $angle \in [-45^\circ, 135^\circ]$, $\theta \in [-\frac{\pi}{4}, \frac{3\pi}{4}]$, and $width > height$.
  • ...and 12 more figures
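
The long-edge definitions in the captions above can be illustrated with a short sketch. It is not the authors' code and not the MMRotate API: the function name `to_long_edge` and its `start` parameter are hypothetical. It also assumes a half-open angle interval `[start, start + 180)` so that each rectangle has exactly one representation, whereas the captions state closed intervals. Angles are in degrees.

```python
def to_long_edge(width, height, angle_deg, start=-90.0):
    """Re-express a rotated rectangle so that width is the long edge.

    Hypothetical helper for illustration only.
    start=-90.0 mimics the le90 range, start=-45.0 the le135 range.
    The returned angle lies in [start, start + 180) (an assumption;
    the captions give closed intervals).
    """
    # If the labelled width is the short edge, swap the edges and
    # rotate the reference direction by 90 degrees to compensate.
    if width < height:
        width, height = height, width
        angle_deg += 90.0
    # A rectangle is unchanged by a 180 degree turn, so wrap the
    # angle into a single 180-degree-wide window starting at `start`.
    angle_deg = (angle_deg - start) % 180.0 + start
    return width, height, angle_deg


# Usage: a 2 x 5 box whose short edge points at 30 degrees.
print(to_long_edge(2, 5, 30.0))               # le90-style range
print(to_long_edge(2, 5, 30.0, start=-45.0))  # le135-style range
```

Because both ranges span only 180°, a table and the same table turned upside down map to the same box. This is the ambiguity that a 360° angle representation with head-tail localization is meant to remove.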