MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection

Chengpei Xu, Wenjing Jia, Ruomei Wang, Xiaonan Luo, Xiangjian He

TL;DR

This paper addresses bottom-up arbitrary-shape scene text detection by tackling false segment detections and missing links between segments. It introduces MorphText, which embeds two trainable deep morphological modules, DMOP and DMCL, into an end-to-end framework to regularize text segments and connect them. DMOP performs a learned morphological opening to suppress false detections, while DMCL applies a learned closing to bridge gaps and shape text segments along their principal orientation. Across CTW1500, Total-Text, MSRA-TD500, and ICDAR2017, MorphText achieves state-of-the-art performance, demonstrating the effectiveness and robustness of integrating deep morphology with CNN-based segment proposals to reduce post-processing and improve linkage without heavy GCN-based reasoning.

Abstract

Bottom-up text detection methods play an important role in arbitrary-shape scene text detection, but two restrictions prevent them from achieving their full potential: 1) the accumulation of false text segment detections, which affects subsequent processing, and 2) the difficulty of building reliable connections between text segments. Targeting these two problems, we propose a novel approach, named "MorphText", to capture the regularity of texts by embedding deep morphology for arbitrary-shape text detection. To this end, two deep morphological modules are designed to regularize text segments and determine the linkage between them. First, a Deep Morphological Opening (DMOP) module is constructed to remove false text segment detections generated in the feature extraction process. Then, a Deep Morphological Closing (DMCL) module is proposed to allow text instances of various shapes to stretch their morphology along their most significant orientation while deriving their connections. Extensive experiments conducted on four challenging benchmark datasets (CTW1500, Total-Text, MSRA-TD500 and ICDAR2017) demonstrate that our proposed MorphText outperforms both top-down and bottom-up state-of-the-art arbitrary-shape scene text detection approaches.
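
To make the roles of opening and closing concrete, the following is a minimal sketch of the classical, fixed-kernel operations on a one-dimensional binary mask. It is not the paper's DMOP or DMCL, which learn their structuring elements inside a network; the function names, the window width of 3, and the toy mask are all assumptions chosen for illustration. The isolated 1 stands in for a false segment detection and the single 0 between the two runs stands in for a missing link.

```python
# Illustrative only: classical binary morphology with a flat, fixed window.
# MorphText's DMOP/DMCL learn their structuring elements; this sketch does not.

def erode(mask, k):
    """Keep a position only if every in-range neighbour in the window is 1."""
    r, n = k // 2, len(mask)
    # Windows are clipped at the borders (no padding is added).
    return [int(all(mask[j] for j in range(max(0, i - r), min(n, i + r + 1))))
            for i in range(n)]

def dilate(mask, k):
    """Set a position if any in-range neighbour in the window is 1."""
    r, n = k // 2, len(mask)
    return [int(any(mask[j] for j in range(max(0, i - r), min(n, i + r + 1))))
            for i in range(n)]

def opening(mask, k):
    """Erosion then dilation: removes foreground runs narrower than k."""
    return dilate(erode(mask, k), k)

def closing(mask, k):
    """Dilation then erosion: fills background gaps narrower than k."""
    return erode(dilate(mask, k), k)

def count_runs(mask):
    """Number of maximal runs of consecutive 1s."""
    return sum(1 for i, v in enumerate(mask) if v and (i == 0 or not mask[i - 1]))

# Hypothetical toy mask: a spurious 1 at index 1, then two runs split by a gap.
mask = [0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0]

opened = opening(mask, 3)    # the spurious 1 is suppressed
closed = closing(opened, 3)  # the one-position gap is bridged

print(count_runs(mask), count_runs(opened), count_runs(closed))
```

In this toy case the opening leaves the two genuine runs intact while dropping the isolated detection, and the closing then merges the two runs into one. The window is flat and symmetric here, whereas the abstract describes DMCL as stretching text along its most significant orientation.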

Paper Structure (23 sections, 11 equations, 10 figures, 7 tables)

Figures (10)

  • Figure 1: Both the GCN-based method [11] and the top-down method [9] fail (as shown in (a) and (b)) when the text instance is separated by heavy occlusion. With our morphology regularization, such separated text segments can still be connected into a single text instance (as shown in (c)).
  • Figure 2: Our proposed MorphText approach effectively addresses two key issues that limit the performance of bottom-up methods. The pink boxes indicate false detection areas accumulated from earlier processing, and the green boxes indicate disconnected areas.
  • Figure 3: The overall structure of our network, where "1/4,64", "1/8,128",... and "1/32,512" indicate the resize ratio and the channel number.
  • Figure 4: Visualization of the structuring elements learned by DMOP.
  • Figure 5: Visualization of the structuring elements learned by DMCL.
  • ...and 5 more figures