Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach

Huanyu Liu, Jianfeng Cai, Tingjia Zhang, Hongsheng Li, Siyuan Wang, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang

TL;DR

Flowmind2digital converts hand-drawn flowminds into editable digital diagrams by combining a Mask-RCNN-based object-and-keypoint detector with a post-processing pipeline that places shapes, connectors, and text into PPT/Visio. The hdFlowmind dataset, with 1,776 images and 27,804 annotations, covers a broad range of real-world variability for training, and the approach achieves 87.3% accuracy, surpassing prior methods by 11.9%. Key innovations include two keypoints per connector, heatmap-based keypoint detection, Canopy+K-means automatic typesetting to align and resize elements, and OCR-guided text decoding within ROIs. The work demonstrates practical viability through software docking (PPTX/Visio APIs) and releases a lightweight Visio-Python toolkit, with potential impact on rapid digitization of brainstorming sketches across education and industry.
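
To make the Canopy+K-means typesetting idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the alignment step works on one coordinate at a time (for example, the x-centers of detected shapes), uses a simplified single-threshold Canopy pass to choose how many alignment lines exist, and then refines those lines with plain K-means. The function names (`canopy_centers`, `kmeans_1d`, `snap_to_lines`) and the threshold `t2` are hypothetical.

```python
from typing import List


def canopy_centers(values: List[float], t2: float) -> List[float]:
    """Simplified Canopy pass (assumption: one threshold instead of T1/T2).

    Repeatedly take the smallest remaining value as a seed and discard
    every value within t2 of it. The seeds give both the number of
    clusters and their initial centers.
    """
    remaining = sorted(values)
    centers = []
    while remaining:
        seed = remaining[0]
        centers.append(seed)
        remaining = [v for v in remaining if abs(v - seed) > t2]
    return centers


def kmeans_1d(values: List[float], centers: List[float], iters: int = 20) -> List[float]:
    """Plain 1-D K-means (Lloyd's iterations) starting from the given centers."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


def snap_to_lines(values: List[float], t2: float) -> List[float]:
    """Replace each coordinate with the center of its cluster (alignment line)."""
    centers = kmeans_1d(values, canopy_centers(values, t2))
    return [min(centers, key=lambda c: abs(v - c)) for v in values]


# Hypothetical x-centers of six detected shapes forming two rough columns.
x_centers = [100.0, 104.0, 98.0, 300.0, 305.0, 297.0]
aligned = snap_to_lines(x_centers, t2=20.0)
```

In this sketch the first three shapes snap to one shared x-coordinate and the last three to another. Running the same routine on y-centers, widths, or heights would give row alignment and uniform sizing under the same assumptions.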

Abstract

Flowcharts and mind maps, collectively known as flowmind, are vital in daily activities, with hand-drawn versions facilitating real-time collaboration. However, there's a growing need to digitize them for efficient processing. Automated conversion methods are essential to overcome manual conversion challenges. Existing sketch recognition methods face limitations in practical situations, being field-specific and lacking digital conversion steps. Our paper introduces the Flowmind2digital method and hdFlowmind dataset to address these challenges. Flowmind2digital, utilizing neural networks and keypoint detection, achieves a record 87.3% accuracy on our dataset, surpassing previous methods by 11.9%. The hdFlowmind dataset, comprising 1,776 annotated flowminds across 22 scenarios, outperforms existing datasets. Additionally, our experiments emphasize the importance of simple graphics, enhancing accuracy by 9.3%.

Paper Structure (19 sections, 1 equation, 22 figures, 10 tables)

Figures (22)

  • Figure 1: Flowminds used in Various Scenarios
  • Figure 2: Visio Interface
  • Figure 3: Example of a flowmind with various highlighted recognition challenges
  • Figure 4: Various Backgrounds in Flowminds
  • Figure 5: Recognition Challenges Resulting from Different Methods of Digitizing Flowminds
  • ...and 17 more figures