Table of Contents
Fetching ...

AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning

Bowen Ping, Minnan Luo, Zhuohang Dang, Chenxi Wang, Chengyou Jia

TL;DR

AutoGPS integrates a Multimodal Problem Formalizer and a Deductive Symbolic Reasoner to transform geometry problems into a formal language and solve them via hypergraph-based, syllogistic reasoning. The framework achieves state-of-the-art results on Geometry3K and PGPS9K while delivering $99\%$ stepwise coherence in human evaluations, addressing reliability and interpretability gaps of prior neural and symbolic methods. By enforcing a minimal, traceable reasoning subgraph, AutoGPS provides concise, human-readable proofs and robust performance under noisy inputs. This neuro-symbolic approach highlights a promising direction for formal, verifiable reasoning in multimodal mathematical problem solving.

Abstract

Geometry problem solving presents distinctive challenges in artificial intelligence, requiring exceptional multimodal comprehension and rigorous mathematical reasoning capabilities. Existing approaches typically fall into two categories: neural-based and symbolic-based methods, both of which exhibit limitations in reliability and interpretability. To address this challenge, we propose AutoGPS, a neuro-symbolic collaborative framework that solves geometry problems with concise, reliable, and human-interpretable reasoning processes. Specifically, AutoGPS employs a Multimodal Problem Formalizer (MPF) and a Deductive Symbolic Reasoner (DSR). The MPF utilizes neural cross-modal comprehension to translate geometry problems into structured formal language representations, with feedback from DSR collaboratively. The DSR takes the formalization as input and formulates geometry problem solving as a hypergraph expansion task, executing mathematically rigorous and reliable derivation to produce minimal and human-readable stepwise solutions. Extensive experimental evaluations demonstrate that AutoGPS achieves state-of-the-art performance on benchmark datasets. Furthermore, human stepwise-reasoning evaluation confirms AutoGPS's impressive reliability and interpretability, with 99\% stepwise logical coherence. The project homepage is at https://jayce-ping.github.io/AutoGPS-homepage.

AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning

TL;DR

AutoGPS integrates a Multimodal Problem Formalizer and a Deductive Symbolic Reasoner to transform geometry problems into a formal language and solve them via hypergraph-based, syllogistic reasoning. The framework achieves state-of-the-art results on Geometry3K and PGPS9K while delivering stepwise coherence in human evaluations, addressing reliability and interpretability gaps of prior neural and symbolic methods. By enforcing a minimal, traceable reasoning subgraph, AutoGPS provides concise, human-readable proofs and robust performance under noisy inputs. This neuro-symbolic approach highlights a promising direction for formal, verifiable reasoning in multimodal mathematical problem solving.

Abstract

Geometry problem solving presents distinctive challenges in artificial intelligence, requiring exceptional multimodal comprehension and rigorous mathematical reasoning capabilities. Existing approaches typically fall into two categories: neural-based and symbolic-based methods, both of which exhibit limitations in reliability and interpretability. To address this challenge, we propose AutoGPS, a neuro-symbolic collaborative framework that solves geometry problems with concise, reliable, and human-interpretable reasoning processes. Specifically, AutoGPS employs a Multimodal Problem Formalizer (MPF) and a Deductive Symbolic Reasoner (DSR). The MPF utilizes neural cross-modal comprehension to translate geometry problems into structured formal language representations, with feedback from DSR collaboratively. The DSR takes the formalization as input and formulates geometry problem solving as a hypergraph expansion task, executing mathematically rigorous and reliable derivation to produce minimal and human-readable stepwise solutions. Extensive experimental evaluations demonstrate that AutoGPS achieves state-of-the-art performance on benchmark datasets. Furthermore, human stepwise-reasoning evaluation confirms AutoGPS's impressive reliability and interpretability, with 99\% stepwise logical coherence. The project homepage is at https://jayce-ping.github.io/AutoGPS-homepage.

Paper Structure

This paper contains 27 sections, 5 equations, 15 figures, 6 tables, 1 algorithm.

Figures (15)

  • Figure 1: Failure cases of current methods. Left: Qwen2.5-VL-32B-Instruct exhibits hallucination-induced errors during reasoning, producing an incorrect conclusion. The symbolic method (Inter-GPS) also fails and lacks traceable steps for error diagnosis. Right: Inadequate cross-modal understanding leads to incomplete formalization, further obstructing symbolic solving. Blue/red annotations indicate correct/erroneous reasoning or answers.
  • Figure 2: Performance comparison among existing methods.
  • Figure 3: Overview of the proposed AutoGPS framework. Top: The Multimodal Problem Formalizer processes geometry problem image-text pairs through annotation and parsing to obtain pre-formalization results, which are subsequently completed via cross-modal alignment and then refined according to the feedback from the Deductive Symbolic Reasoner. Bottom: The Deductive Symbolic Reasoner first validates the self-consistency of formalization and complete implicit relations, then formulates the problem-solving process as a hypergraph expansion task. It iteratively applies two complementary strategies to expand the hypergraph to derive the answer, and ultimately backtracks through the minimal reasoning subgraph to construct a human-readable stepwise solution.
  • Figure 4: Reasoning reliability evaluation.
  • Figure 5: Formalization quality comparison.
  • ...and 10 more figures