Towards Making Flowchart Images Machine Interpretable
Shreya Shukla, Prajwal Gatti, Yogesh Kumar, Vikash Yadav, Anand Mishra
TL;DR
Flowcharts in educational and technical documents are not readily machine-interpretable, which hinders automatic code generation from them. The paper introduces FloCo-T5, a transformer-based framework that first converts a flowchart image into a textual sequence encoding via shape detection and OCR, then pre-trains CodeT5 on logic-preserving augmented code samples with a masked token objective before fine-tuning for Python code generation. The FloCo dataset comprises 11,884 flowchart–Python code pairs, enabling large-scale evaluation and benchmarking of the flowchart-to-code (Flow2Code) task. FloCo-T5 outperforms Vanilla Transformer, BART, PLBART, and CodeT5 on the BLEU, CodeBLEU, and Exact Match metrics, with the modified string encoding giving the best performance. Results on hand-drawn flowcharts are promising, albeit lower, which suggests some robustness to non-digitized inputs and points to future work on longer code sequences and broader diagram types. Overall, this work provides a scalable benchmark and a strong methodological foundation for automatically translating flowcharts into executable code, with potential impact on education and software development tooling.
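To make the first stage concrete, the following is a minimal sketch of how detected shapes and OCR text could be flattened into a single string for a sequence-to-sequence model. The `Node` and `Edge` types, the `encode_flowchart` function, and the output format are all hypothetical and chosen for illustration; the paper's actual string encodings are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A detected flowchart shape (hypothetical type for this sketch)."""
    node_id: int
    shape: str  # e.g. "oval", "diamond", "rectangle"
    text: str   # OCR output for the text inside the shape


@dataclass(frozen=True)
class Edge:
    """A detected arrow between two shapes (hypothetical type)."""
    src: int
    dst: int
    label: str = ""  # e.g. "Yes" / "No" on decision branches


def encode_flowchart(nodes, edges):
    """Flatten a flowchart graph into one string, one clause per arrow.

    The format SHAPE(text) -> SHAPE(text) is an assumption made for
    this sketch, not the encoding used by FloCo-T5.
    """
    by_id = {n.node_id: n for n in nodes}
    parts = []
    for e in edges:
        s, d = by_id[e.src], by_id[e.dst]
        arrow = f"-[{e.label}]->" if e.label else "->"
        parts.append(
            f"{s.shape.upper()}({s.text}) {arrow} {d.shape.upper()}({d.text})"
        )
    return " ; ".join(parts)


nodes = [
    Node(0, "oval", "start"),
    Node(1, "diamond", "x > 0"),
    Node(2, "oval", "end"),
]
edges = [Edge(0, 1), Edge(1, 2, "Yes")]
encoded = encode_flowchart(nodes, edges)
```

A string of this kind can then be tokenized and passed to the encoder like any other text input, which is what lets a pre-trained code model be reused for the task.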
Abstract
Computer programming textbooks and software documentation often contain flowcharts to illustrate the flow of an algorithm or procedure. Modern OCR engines often tag these flowcharts as graphics and ignore them in further processing. In this paper, we work towards making flowchart images machine-interpretable by converting them to executable Python code. To this end, inspired by the recent success of natural-language-to-code generation, we present a novel transformer-based framework, namely FloCo-T5. Our model is well-suited for this task, as it can effectively learn the semantics, structure, and patterns of programming languages, which it leverages to generate syntactically correct code. We also use a task-specific pre-training objective to pre-train FloCo-T5 on a large number of logic-preserving augmented code samples. Further, to perform a rigorous study of this problem, we introduce the FloCo dataset, which contains 11,884 flowchart images and their corresponding Python code. Our experiments show promising results, and FloCo-T5 clearly outperforms related competitive baselines on code generation metrics. We make our dataset and implementation publicly available.
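As an illustration of what a logic-preserving augmentation can look like, the sketch below renames variables in a Python program using the standard `ast` module, so the augmented sample differs textually but computes the same result. Variable renaming is assumed here as one plausible augmentation; the abstract does not list the specific augmentations used, and the names `RenameVariables` and `rename_variables` are hypothetical. The sketch requires Python 3.9 or later for `ast.unparse` and does not handle keyword-argument call sites or attribute names.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename identifiers according to a mapping (illustrative only)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes, e.g. `total` in `total += i`.
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

    def visit_arg(self, node):
        # Function parameters, e.g. `n` in `def f(n):`.
        if node.arg in self.mapping:
            node.arg = self.mapping[node.arg]
        return node


def rename_variables(source, mapping):
    """Return `source` with variables renamed; program logic is unchanged."""
    tree = RenameVariables(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


original = (
    "def f(n):\n"
    "    total = 0\n"
    "    for i in range(n):\n"
    "        total += i\n"
    "    return total\n"
)
augmented = rename_variables(original, {"total": "acc", "n": "limit"})
```

Because the renamed program behaves identically, pairs such as `original` and `augmented` enlarge the pre-training corpus without changing what the code does.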
