EDC: Equation Discovery for Classification

Guus Toussaint, Arno Knobbe

TL;DR

Equation Discovery for Classification (EDC) introduces an interpretable framework that learns an analytic boundary function $f(x)$, with $T = \text{true} \iff f(x) \ge 0$, by composing summands drawn from a configurable grammar. The approach combines beam-search-guided equation discovery with adaptive optimisation (SGD for differentiable constants and a hill climber otherwise), achieving competitive binary classification performance while maintaining interpretability. Across artificial and UCI datasets, EDC demonstrates strong structure recovery, robustness to noise, and favourable trade-offs between accuracy and model transparency, albeit with longer runtimes than many black-box methods. The work advances explainable AI by showing that a single, human-readable equation can rival state-of-the-art classifiers on several tasks and can be tailored via domain-specific grammar extensions.

Abstract

Equation Discovery (ED) techniques have shown considerable success in regression tasks, where they are used to discover concise and interpretable models (Symbolic Regression). In this paper, we propose a new ED-based binary classification framework. Our proposed method, EDC, finds analytical functions of manageable size that specify the location and shape of the decision boundary. In extensive experiments on artificial and real-life data, we demonstrate that EDC is able to discover both the structure of the target equation and the values of its parameters, outperforming the current state-of-the-art ED-based classification methods and achieving performance comparable to the state of the art in binary classification. We suggest a grammar of modest complexity that appears to work well on the tested datasets, but argue that the exact grammar, and thus the complexity of the models, is configurable, and that domain-specific expressions in particular can be included in the pattern language where required. The presented grammar consists of a series of summands (additive terms) that include linear, quadratic and exponential terms, as well as products of two features (producing hyperbolic curves ideal for capturing XOR-like dependencies). The experiments demonstrate that this grammar allows fairly flexible decision boundaries while not being so rich as to cause overfitting.
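
To make the summand-based grammar concrete, the following minimal Python sketch builds a boundary function as a sum of terms and classifies a point as positive when $f(x) \ge 0$. The helper names (`linear`, `quadratic`, `exponential`, `product`, `boundary`, `classify`) and the parametrisation of each term (one multiplicative constant per summand, `c * exp(x_i)` for the exponential term) are assumptions made for illustration; they are not taken from the paper, and the sketch does not cover the search or the optimisation of constants.

```python
import math

# Each helper returns one summand: a function mapping a feature vector x to a number.
# The exact parametrisation of each term is an illustrative assumption.
def linear(i, c):
    return lambda x: c * x[i]

def quadratic(i, c):
    return lambda x: c * x[i] ** 2

def exponential(i, c):
    return lambda x: c * math.exp(x[i])

def product(i, j, c):
    return lambda x: c * x[i] * x[j]

def boundary(summands, bias=0.0):
    """Compose summands into a single boundary function f(x) = bias + sum of terms."""
    def f(x):
        return bias + sum(term(x) for term in summands)
    return f

def classify(f, x):
    """Predict the positive class when f(x) >= 0."""
    return f(x) >= 0

# A single product term gives an XOR-like boundary: positive when both features share a sign.
xor_like = boundary([product(0, 1, 1.0)])

# Two quadratic terms and a bias give a circular boundary: positive inside the unit circle.
circle = boundary([quadratic(0, -1.0), quadratic(1, -1.0)], bias=1.0)

print(classify(xor_like, (1.0, 1.0)), classify(xor_like, (1.0, -1.0)))
print(classify(circle, (0.0, 0.0)), classify(circle, (2.0, 0.0)))
```

In this sketch a candidate equation is just a list of summands plus a bias, which keeps the model readable as one expression.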

Paper Structure

This paper contains 15 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Target decision boundary (dashed red line), and decision boundary found by EDC (solid blue line). Noise is added to the data, and as a result, the target achieves a lower AUC ($0.952$) compared to the EDC algorithm ($0.978$).
  • Figure 2: Target decision boundary (dashed red line), and decision boundary found by EDC (solid blue line). The target equation is sampled from a richer grammar than available to EDC. The target achieves a lower AUC ($0.959$) compared to EDC ($0.967$).
  • Figure 3: The proposed decision boundary for the Gaussian clusters artificial dataset. Note that the discovered equation indicated above the figure is produced after translating the equation back to the non-normalised space. This introduces two additional constants for each feature $x_i$.
  • Figure 4: Critical distance plot of the ranks for the different classifiers for the UCI datasets. The top bar shows the critical distance ($CD$), which in our setup equals $3.49$. EDC outperforms AMAXSC, M4GP, and the decision tree, although not statistically significantly. Similarly, MLP, RF, SVM, and LDA perform better than EDC, but not statistically significantly.
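
The Figure 3 caption notes that translating an equation back to the non-normalised space introduces two additional constants per feature. The sketch below illustrates why for the simplest case, a purely linear equation. It assumes z-score normalisation, $z_i = (x_i - \mu_i) / \sigma_i$, which is not confirmed by the text above; the function name `denormalise_linear` and the numbers used are hypothetical.

```python
def denormalise_linear(weights, bias, means, stds):
    """Rewrite f(z) = bias + sum(w_i * z_i), with z_i = (x_i - mean_i) / std_i,
    as an equivalent equation over the raw features x_i.

    Assumes z-score normalisation (an assumption, not stated in the source).
    Each feature contributes two constants (mean_i and std_i) to the rewritten form.
    """
    raw_weights = [w / s for w, s in zip(weights, stds)]
    raw_bias = bias - sum(w * m / s for w, m, s in zip(weights, means, stds))
    return raw_weights, raw_bias

# Hypothetical equation in normalised space: f(z) = 0.5 + 2*z_0 - 1*z_1
weights, bias = [2.0, -1.0], 0.5
means, stds = [10.0, 4.0], [2.0, 0.5]

raw_weights, raw_bias = denormalise_linear(weights, bias, means, stds)

# Check that both forms agree on a raw data point.
x = [12.0, 5.0]
z = [(xi - m) / s for xi, m, s in zip(x, means, stds)]
f_normalised = bias + sum(w * zi for w, zi in zip(weights, z))
f_raw = raw_bias + sum(w * xi for w, xi in zip(raw_weights, x))
print(raw_weights, raw_bias, f_normalised, f_raw)
```

For non-linear summands such as quadratic or product terms the same substitution applies, but the constants no longer fold into a single coefficient per term, so the back-translated equation looks longer than the one found in normalised space.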