Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram

Ming-Liang Zhang, Zhong-Zhi Li, Fei Yin, Liang Lin, Cheng-Lin Liu

TL;DR

This work tackles geometry problem solving under multi-modal inputs by introducing PGPSNet-v2, a neural-symbolic solver that fuses geometry diagrams, textual problems, and parsed diagram clauses through structural-semantic pre-training, generates an interpretable solution program with a self-limited decoder, and uses a multi-level theorem verifier to filter out solutions that violate geometric principles. The authors release PGPS9K, a large-scale dataset with fine-grained diagram annotations, textual clauses, solution programs, a theorem knowledge base, and a program executor to support GPS research. Experimental results on Geometry3K and PGPS9K show that PGPSNet-v2 surpasses existing symbolic and neural solvers, and ablations show that fusion, reasoning, and verification each contribute to reliable, explainable geometry problem solving. The work advances GPS by providing a concrete neural-symbolic framework, a richly annotated dataset, and a verifier-based approach that emphasizes correctness and interpretability in geometric reasoning.

Abstract

Geometry problem solving (GPS) requires multi-modal understanding, multi-hop reasoning and the application of theorem knowledge. In this paper, we propose a neural-symbolic model for plane geometry problem solving (PGPS), named PGPSNet-v2, with three key steps: modal fusion, reasoning and knowledge verification. In modal fusion, we use textual clauses to express the fine-grained structural and semantic content of the geometry diagram, and fuse the diagram with the textual problem efficiently through structural-semantic pre-training. For reasoning, we design an explainable solution program to describe the geometric reasoning process, and employ a self-limited decoder to generate the solution program autoregressively. To reduce solution errors, we propose a multi-level theorem verifier that eliminates solutions that do not match geometric principles, alleviating the hallucination of the neural model. We also construct a large-scale geometry problem dataset called PGPS9K, containing fine-grained annotations of textual clauses, solution programs and the knowledge tuples involved. Extensive experiments on the Geometry3K and PGPS9K datasets show that our PGPSNet-v2 solver outperforms existing symbolic and neural solvers in GPS performance while maintaining good explainability and reliability, and that each solver component (fusion, reasoning, verification) is effective.
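
The generate-then-verify idea described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the `Candidate` type, the `TOY_KB` rule table, the rule names `angle_sum` and `pythagoras`, and the single-level check in `verify` are all hypothetical stand-ins, assumed here only to show how candidate solution programs that use unknown rules or the wrong number of operands might be discarded before execution.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical program step: (rule name, numeric operands). Not the paper's format.
Step = Tuple[str, Tuple[float, ...]]


@dataclass
class Candidate:
    steps: List[Step]
    score: float  # assumed decoder confidence for this candidate program


# Toy knowledge base (assumption): rule name -> (expected arity, function).
TOY_KB: Dict[str, Tuple[int, Callable[..., float]]] = {
    # Third angle of a triangle, given the other two in degrees.
    "angle_sum": (2, lambda a, b: 180.0 - a - b),
    # Hypotenuse of a right triangle, given the two legs.
    "pythagoras": (2, lambda a, b: math.hypot(a, b)),
}


def verify(candidate: Candidate, kb: Dict[str, Tuple[int, Callable[..., float]]]) -> bool:
    """Accept a program only if every step names a known rule with the right arity."""
    return all(op in kb and len(args) == kb[op][0] for op, args in candidate.steps)


def execute(candidate: Candidate, kb: Dict[str, Tuple[int, Callable[..., float]]]) -> Optional[float]:
    """Run the steps in order and return the value of the last one."""
    result: Optional[float] = None
    for op, args in candidate.steps:
        result = kb[op][1](*args)
    return result


def solve(candidates: List[Candidate], kb: Dict[str, Tuple[int, Callable[..., float]]]) -> Optional[float]:
    """Return the answer of the highest-scoring candidate that passes verification."""
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if verify(cand, kb):
            return execute(cand, kb)
    return None  # every candidate was rejected


candidates = [
    Candidate(steps=[("made_up_rule", (3.0, 4.0))], score=0.9),  # rejected: unknown rule
    Candidate(steps=[("pythagoras", (3.0, 4.0))], score=0.6),    # accepted
]
answer = solve(candidates, TOY_KB)
```

In this sketch the highest-scoring candidate is rejected because it invokes a rule outside the knowledge base, so the answer comes from the lower-scoring but verifiable program. The verifier described in the paper is multi-level; this toy version checks only rule names and operand counts.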

Paper Structure

This paper contains 31 sections, 4 equations, 10 figures, 7 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Framework comparison of existing geometric solvers. (a) symbolic solvers; (b) neural solvers; (c) our PGPSNet-v2 solver.
  • Figure 2: Some examples of geometry problems in our PGPS9K dataset.
  • Figure 3: Annotation design of solution program and its interpretability.
  • Figure 4: Pipeline of PGPSNet-v2 solver.
  • Figure 5: Schematic flowchart of structural-semantic pre-training. [M] denotes the masked token. The semantic tags [G], [N], [ARG], [P] and [AGD] mark general, variable, argument, point and angle-ID tokens, respectively. The section tags [S], [C] and [T] mark structure, condition and target tokens, respectively.
  • ...and 5 more figures