BPDO: Boundary Points Dynamic Optimization for Arbitrary Shape Scene Text Detection

Jinzhi Zheng, Libo Zhang, Yanjun Wu, Chen Zhao

TL;DR

BPDO tackles arbitrary-shaped scene text detection by introducing a two-stage boundary-point approach: first generate boundary-point proposals via a text-aware module, then iteratively refine them with a Dynamic Optimization Module that leverages deformable attention and neighborhood information. The method fuses multi-scale features through channel and spatial attention to produce priors (distance, direction, and classification maps) guiding boundary point placement. A multi-objective loss combining $L_{cls}$, $L_{dis}$, $L_{dir}$, and $L_{pm}$ guides training, with epoch-dependent weighting. Experiments on MSRA-TD500, CTW1500, and Total-Text show BPDO achieving state-of-the-art or competitive results, particularly surpassing prior methods on MSRA-TD500, demonstrating the effectiveness of boundary-point dynamic optimization for irregular text shapes.
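As a rough illustration of how such a multi-objective loss could be assembled, the sketch below sums the three prior-map terms and scales the point-matching term by an epoch-dependent factor. The summary states only that the weighting depends on the epoch, so the linear ramp, the unit weights on $L_{cls}$, $L_{dis}$, and $L_{dir}$, and the function names `epoch_weight` and `total_loss` are all assumptions made for illustration, not the paper's actual schedule.

```python
def epoch_weight(epoch: int, max_epoch: int) -> float:
    """Hypothetical schedule: ramps linearly from 0 to 1 over training.

    The source only says the weighting is epoch-dependent; this ramp is a
    placeholder, not the schedule used by BPDO.
    """
    if max_epoch <= 0:
        raise ValueError("max_epoch must be positive")
    return min(max(epoch / max_epoch, 0.0), 1.0)


def total_loss(l_cls: float, l_dis: float, l_dir: float, l_pm: float,
               epoch: int, max_epoch: int) -> float:
    """Combine the four loss terms named in the summary.

    Assumption: the prior-map losses (classification, distance, direction)
    get unit weights, and only the point-matching loss is rescaled per epoch.
    """
    prior_loss = l_cls + l_dis + l_dir
    return prior_loss + epoch_weight(epoch, max_epoch) * l_pm


if __name__ == "__main__":
    # Early in training the point-matching term contributes little;
    # by the final epoch it contributes fully.
    for epoch in (0, 5, 10):
        print(epoch, total_loss(1.0, 2.0, 3.0, 4.0, epoch, max_epoch=10))
```

Under this assumed ramp, the priors dominate early training and the boundary-point term gains influence as the prior maps become reliable. Any other monotone schedule could be substituted in `epoch_weight` without changing the rest of the sketch.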

Abstract

Arbitrary shape scene text detection is of great importance in scene understanding tasks. Due to the complexity and diversity of text in natural scenes, existing scene text detection algorithms have limited accuracy on arbitrary shape text. In this paper, we propose a novel arbitrary shape scene text detector based on boundary points dynamic optimization (BPDO). The proposed model consists of a text aware module (TAM) and a boundary point dynamic optimization module (DOM). Specifically, the segmentation-based text aware module extracts prior information about the text region and uses it to obtain boundary points describing the central region of the text. Then, based on the idea of deformable attention, the dynamic optimization module gradually refines the position of each boundary point using information from its adjacent region. Experiments on the CTW1500, Total-Text, and MSRA-TD500 datasets show that the proposed model performs better than or comparably to state-of-the-art algorithms, demonstrating its effectiveness.
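The coarse-to-fine idea in the abstract (start from boundary points around the central text region, then move each point step by step using information from its neighborhood) can be illustrated with a toy example. The sketch below is not the paper's DOM: it replaces the learned deformable attention with a hand-written rule that pushes each point outward from the center while the next position still lies inside a binary text mask. The mask, the helper names, and the step rule are all assumptions chosen for illustration.

```python
import math


def make_text_mask(height, width, top, left, bottom, right):
    """Toy stand-in for a classification map: 1.0 inside a text box, else 0.0."""
    return [[1.0 if top <= y < bottom and left <= x < right else 0.0
             for x in range(width)] for y in range(height)]


def sample(mask, x, y):
    """Nearest-pixel lookup; positions outside the map count as background."""
    xi, yi = int(round(x)), int(round(y))
    if 0 <= yi < len(mask) and 0 <= xi < len(mask[0]):
        return mask[yi][xi]
    return 0.0


def init_points(cx, cy, radius, n):
    """Coarse proposal: n points on a small circle inside the text region."""
    return [(cx + radius * math.cos(2 * math.pi * k / n),
             cy + radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def refine(mask, points, center, step=1.0, iterations=30):
    """Iteratively move each point outward while its next position is text.

    Hand-written rule (assumption): a learned model would predict the
    offset from features sampled around each point instead.
    """
    cx, cy = center
    pts = list(points)
    for _ in range(iterations):
        updated = []
        for x, y in pts:
            dx, dy = x - cx, y - cy
            norm = math.hypot(dx, dy) or 1.0
            nx, ny = x + step * dx / norm, y + step * dy / norm
            # Accept the move only if the candidate position is still text.
            updated.append((nx, ny) if sample(mask, nx, ny) >= 0.5 else (x, y))
        pts = updated
    return pts


if __name__ == "__main__":
    mask = make_text_mask(40, 80, top=10, left=10, bottom=30, right=70)
    start = init_points(40.0, 20.0, radius=3.0, n=8)
    for p in refine(mask, start, center=(40.0, 20.0)):
        print(round(p[0], 2), round(p[1], 2))
```

Each point stops at the last position that is still inside the text mask, so the final polygon expands from the central region toward the text boundary. In a learned setting the fixed outward step would be replaced by offsets predicted from the sampled neighborhood features.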

Paper Structure

This paper contains 11 sections, 4 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Comparison of BPDO with other segmentation-based scene text algorithms. (a) Segmentation-based algorithm for scene text detection. (b) Segmentation-based scene text algorithm with boundary point optimization. (c) BPDO. Each boundary point is progressively optimized toward the text region using neighborhood information.
  • Figure 2: The overall architecture of Boundary Points Dynamic Optimization. $L_{dis}$ and $L_{cls}$ denote the distance map and the segmentation map. $L_{dirx}$ and $L_{diry}$ denote the direction maps in the x and y directions, respectively.
  • Figure 3: The detailed structure of the text aware module.
  • Figure 4: Visual results of our algorithm on four datasets. The green irregular polygons are the detected text contours.