SEMv3: A Fast and Robust Approach to Table Separation Line Detection

Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

TL;DR

SEMv3 targets table structure recognition on deformed and wireless tables by combining a Keypoint Offset Regression (KOR) split module with a grid-based merge module. The split stage regresses offsets from predefined keypoint proposals to locate row and column separation lines, while the merge stage applies merge actions to grid cells to assemble the final table structure, all trained end-to-end. On ICDAR-2019 cTDaR Historical, WTW, and iFLYTAB, SEMv3 achieves state-of-the-art results and shows clear speed advantages over prior split-based methods, especially on large or irregular tables. The approach reduces reliance on heavy post-processing, is more robust to deformed and wireless tables, and stays efficient through concise representations and operations, which makes it practical for real-world document analysis.

Abstract

Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parsing table structure, in which table separation line detection is crucial. However, wireless and deformed tables make this detection difficult. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines. During the split stage, we introduce a Keypoint Offset Regression (KOR) module, which detects table separation lines by directly regressing the offset of each line relative to its keypoint proposals. In the merge stage, we define a series of merge actions to efficiently describe the table structure based on table grids. Extensive ablation studies demonstrate that the proposed KOR module detects table separation lines quickly and accurately. Furthermore, on public datasets (e.g., WTW, ICDAR-2019 cTDaR Historical, and iFLYTAB), SEMv3 achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/Chunchunwumu/SEMv3.
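To make the split-stage idea concrete, here is a minimal Python sketch of how a separation line could be assembled from keypoint proposals and regressed offsets. It is an illustration under stated assumptions, not the authors' implementation: the function names, the evenly spaced proposals, and the purely vertical offsets for a row line are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def keypoint_proposals(width: float, num_points: int) -> List[float]:
    """Evenly spaced x-coordinates across the table width (assumed layout)."""
    if num_points < 2:
        raise ValueError("need at least two keypoint proposals")
    step = width / (num_points - 1)
    return [i * step for i in range(num_points)]


def row_line_from_offsets(
    anchor_y: float, proposal_xs: List[float], offsets: List[float]
) -> List[Point]:
    """Build one row separation line as a polyline.

    Each proposal starts at (x, anchor_y); the regressed offset shifts it
    vertically so the polyline can follow a curved or tilted separator.
    """
    if len(proposal_xs) != len(offsets):
        raise ValueError("one offset is required per keypoint proposal")
    return [(x, anchor_y + dy) for x, dy in zip(proposal_xs, offsets)]


# Hypothetical offsets, standing in for the output of a regression head.
xs = keypoint_proposals(width=100.0, num_points=5)
line = row_line_from_offsets(anchor_y=40.0, proposal_xs=xs,
                             offsets=[0.0, 1.5, 3.0, 1.5, 0.0])
print(line)
```

A column separation line would follow the same pattern with the roles of x and y swapped. Representing each line as a handful of offsets, rather than a dense mask, is one way to read the abstract's claim that the line is regressed directly.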

Paper Structure (22 sections, 9 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: The three types of table separation detection methods.
  • Figure 2: An overview of our approach, SEMv3, which follows the split-and-merge paradigm. In the split stage, we propose the keypoint offset regression (KOR) module to locate the table separation lines. The KOR module detects each separation line by regressing its offset relative to its keypoint proposals. "FE" stands for the feature enhancement module. In the merge stage, we use merge actions to describe the table structure based on table grids.
  • Figure 3: Definition of merge action based on grids. (a) The grid-based table structure. The starting grids are painted orange. The black dashed lines represent the borders within the grid that need to be merged. (b) The merge action map.
  • Figure 4: Qualitative table structure recognition results of our approach. (a1-a3) are from WTW. (b1) and (b2) are from ICDAR-2019 cTDaR Historical. (c1-c4) are from the wired subset of iFLYTAB. (c5) is from the wireless subset of iFLYTAB.
  • Figure 5: Qualitative split results of different split modules. (a) is the prediction of table grids from KOR; (b) is the prediction of table grids from IS. The red dashed box indicates low-quality results.
  • ...and 1 more figures
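
The grid-based merge step described in the Figure 3 caption can be illustrated with a short Python sketch. The action encoding below (`NEW`, `LEFT`, `UP`) is a hypothetical one chosen for this example, since the summary does not specify the paper's actual action set. The sketch decodes a per-grid action map into cell ids with a small union-find.

```python
from typing import Dict, List

# Hypothetical action codes, not taken from the paper.
NEW, LEFT, UP = 0, 1, 2  # start a cell / merge with left grid / merge with upper grid


def merge_grids(actions: List[List[int]]) -> List[List[int]]:
    """Decode a merge action map into a cell id for every grid."""
    rows = len(actions)
    cols = len(actions[0]) if rows else 0
    parent = list(range(rows * cols))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            if actions[r][c] == LEFT and c > 0:
                union(idx, idx - 1)
            elif actions[r][c] == UP and r > 0:
                union(idx, idx - cols)

    # Relabel roots as consecutive cell ids in reading order.
    labels: Dict[int, int] = {}
    return [
        [labels.setdefault(find(r * cols + c), len(labels)) for c in range(cols)]
        for r in range(rows)
    ]


# A 2x3 grid: the first two grids of row 0 form one cell,
# and the last column forms one cell spanning both rows.
cells = merge_grids([[NEW, LEFT, NEW],
                     [NEW, NEW, UP]])
print(cells)  # [[0, 0, 1], [2, 3, 1]]
```

Grids that share an id belong to the same table cell, so spanning cells fall out of the map without separate post-processing.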