Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

Lynn Pickering, Tereso Del Rio Almajano, Matthew England, Kelly Cohen

TL;DR

The paper addresses the choice of variable ordering for Cylindrical Algebraic Decomposition (CAD), a decision with a major impact on performance. It applies SHAP, an explainable AI tool, to an existing ML pipeline that predicts CAD orderings from 81 algorithmically derived features, balancing the data to improve interpretability. Through local and global SHAP analyses, the authors identify features that align with established heuristics, as well as novel features from which heuristics of a size and complexity similar to human-designed ones can be built. By merging and ranking these features across four models, they construct new greedy heuristics that outperform prior state-of-the-art methods on three-variable problems, while remaining interpretable and avoiding AI dependencies in deployed software. The work demonstrates a practical pathway for using AI-derived insights to guide CAD algorithm design and suggests that XAI-driven heuristic development may apply more broadly in symbolic computation.

Abstract

In recent years there has been increased use of machine learning (ML) techniques within mathematics, including symbolic computation where it may be applied safely to optimise or select algorithms. This paper explores whether using explainable AI (XAI) techniques on such ML models can offer new insight for symbolic computation, inspiring new implementations within computer algebra systems that do not directly call upon AI tools. We present a case study on the use of ML to select the variable ordering for cylindrical algebraic decomposition. It has already been demonstrated that ML can make the choice well, but here we show how the SHAP tool for explainability can be used to inform new heuristics of a size and complexity similar to those human-designed heuristics currently commonly used in symbolic computation.

Paper Structure (36 sections, 3 equations, 7 figures, 10 tables)

Figures (7)

  • Figure 1: CADs sign-invariant for the set of polynomials $\{x^5+5 x^4+5 x^3-5 x^2-6x-2y\}$. Using the ordering $x\succ y$, we obtain a CAD with 57 cells (18 shaded areas, 27 curve segments and 12 points). Using the ordering $y\succ x$ generates only 3 cells (2 areas and one curve).
  • Figure 2: SHAP waterfall plot reproduced from an online tutorial (https://medium.com/dataman-in-ai/the-shap-with-more-elegant-charts-bc3e73fa1c0c) where it appears as the second image in Section 3.2. The plot explains the prediction by an XGBoost model for whether a particular wine has a rating in the top half of the quality scale, using the "Red Wine Quality" Kaggle dataset (https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009).
  • Figure 3: An explanation of a decision made by the MLP model on an example CAD problem instance, for the selected output ordering, ordering 5: $x_3 \succ x_1 \succ x_2$
  • Figure 4: The top five features for each model, aggregated over the 100 test points. Features are ordered by average impact on model output magnitude and colored by output class.
  • Figure 5: Plot of feature scores from Table \ref{tab: top_features_overall}
  • ...and 2 more figures
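
The aggregation described in the Figure 4 caption, together with the merging of rankings across models mentioned in the TL;DR, can be illustrated with a minimal sketch. This is not the authors' code. The feature names and SHAP values below are hypothetical, the per-class breakdown is omitted, and the merging rule (normalise each model's scores to sum to one, then add them) is an assumption chosen for illustration, since raw SHAP magnitudes from different models are not necessarily on a comparable scale.

```python
from collections import defaultdict


def mean_abs_impact(shap_rows):
    """Average |SHAP value| per feature over a list of test points.

    shap_rows: list of dicts mapping feature name -> SHAP value,
    one dict per test point (hypothetical data layout).
    """
    totals = defaultdict(float)
    for row in shap_rows:
        for feature, value in row.items():
            totals[feature] += abs(value)
    n = len(shap_rows)
    return {feature: total / n for feature, total in totals.items()}


def top_features(shap_rows, k=5):
    """Return the k features with the largest mean |SHAP value|."""
    scores = mean_abs_impact(shap_rows)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


def merge_rankings(rows_by_model, k=5):
    """Merge feature scores across models (assumed rule, see lead-in).

    Each model's mean |SHAP| scores are normalised to sum to 1,
    then summed feature by feature across models.
    """
    merged = defaultdict(float)
    for rows in rows_by_model.values():
        scores = mean_abs_impact(rows)
        total = sum(scores.values())
        for feature, score in scores.items():
            merged[feature] += score / total
    return sorted(merged, key=lambda f: (-merged[f], f))[:k]


# Hypothetical SHAP values for two models and three made-up features.
rows_by_model = {
    "model_a": [
        {"max_deg_x1": 0.4, "sum_deg_x2": -0.1, "n_terms": 0.0},
        {"max_deg_x1": -0.2, "sum_deg_x2": 0.3, "n_terms": 0.1},
    ],
    "model_b": [
        {"max_deg_x1": 0.1, "sum_deg_x2": 0.1, "n_terms": 0.8},
    ],
}

per_model_top = {name: top_features(rows, k=2) for name, rows in rows_by_model.items()}
overall = merge_rankings(rows_by_model, k=3)
```

In practice the per-point SHAP values would come from an explainer run on each trained model. The lists here only stand in for that output so that the ranking step can be read in isolation.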