Algebraic Machine Learning: Learning as computing an algebraic decomposition of a task

Fernando Martin-Maroto, Nabil Abderrahaman, David Mendez, Gonzalo G. de Polavieja

TL;DR

AML presents an algebraic foundation for learning by encoding tasks as axioms in a semilattice and deriving a freest model via Full Crossing; learning progresses by extracting discriminative atoms and forming generalizing subsets with Sparse Crossing guided by trace invariants. The approach demonstrates data-driven learning without architecture search, matching MNIST-family benchmarks and enabling formal problem solving such as Hamiltonian cycles without search. The key contributions include atomized semilattices, the Full Crossing/freest model framework, trace-based duals, and a scalable Sparse Crossing algorithm that yields compact, interpretable rule-atoms with potential for hybrid algebraic-statistical systems. Overall, AML offers a transparent, additive, and potentially explainable pathway to learning that scales through algebraic decomposition rather than gradient-based optimization.

Abstract

Statistics and Optimization are foundational to modern Machine Learning. Here, we propose an alternative foundation based on Abstract Algebra, with mathematics that facilitates the analysis of learning. In this approach, the goal of the task and the data are encoded as axioms of an algebra, and a model is obtained where only these axioms and their logical consequences hold. Although this is not a generalizing model, we show that selecting specific subsets of its breakdown into algebraic atoms obtained via subdirect decomposition gives a model that generalizes. We validate this new learning principle on standard datasets such as MNIST, FashionMNIST, CIFAR-10, and medical images, achieving performance comparable to optimized multilayer perceptrons. Beyond data-driven tasks, the new learning principle extends to formal problems, such as finding Hamiltonian cycles from their specifications and without relying on search. This algebraic foundation offers a fresh perspective on machine intelligence, featuring direct learning from training data without the need for a validation dataset, scaling through model additivity, and asymptotic convergence to the underlying rule in the data.
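
To make the idea of a model "broken down into atoms" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes an atom can be represented by the set of constants in its upper segment, that a term is a join (here, a set) of constants, and that an order statement $s \leq t$ holds in a model when every atom below $s$ is also below $t$. The names `Atom`, `atoms_below` and `leq` are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Assumption: an atom is identified by the constants in its upper segment.
Atom = FrozenSet[str]


def atoms_below(model: Iterable[Atom], term: Iterable[str]) -> List[Atom]:
    """Atoms lying below a term, i.e. sharing at least one constant with it."""
    constants = frozenset(term)
    return [phi for phi in model if phi & constants]


def leq(model: Iterable[Atom], s: Iterable[str], t: Iterable[str]) -> bool:
    """s <= t holds in the model if every atom below s is also below t."""
    t_constants = frozenset(t)
    return all(bool(phi & t_constants) for phi in atoms_below(model, s))


# Three constants and no axioms: one atom per constant (hand-picked).
free_model = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]

# Hand-picked atoms for a model in which the single axiom a <= b holds:
# the only atom below a is also below b.
model_with_axiom = [frozenset({"a", "b"}), frozenset({"b"}), frozenset({"c"})]

print(leq(free_model, {"a"}, {"a", "b"}))   # a <= a ⊙ b holds in any semilattice
print(leq(free_model, {"a"}, {"b"}))        # not implied without axioms
print(leq(model_with_axiom, {"a"}, {"b"}))  # the axiom holds
print(leq(model_with_axiom, {"b"}, {"a"}))  # the converse is not forced
```

In this representation, checking whether a statement holds reduces to set operations on atoms, which is one reason an explicit atomic decomposition is convenient to work with.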

Paper Structure

This paper contains 29 sections, 39 theorems, 148 equations, 11 figures, 4 tables, 8 algorithms.

Key Result

Theorem 1

Let $\,t,s \in F_C(\emptyset)$ be two terms that represent two regular elements $\nu_M(t)$ and $\nu_M(s)$ of an atomized model $M$ over a finite set of constants $C$. Let $\phi$ be an atom, $c$ a constant in $C$ and let $a$ be a regular element of $M$:

Figures (11)

  • Figure 1: Schematic representation of the Algebraic Machine Learning pipeline. The process begins with axiomatization, where the problem, defined by data, goals, and prior knowledge, is encoded as a set of axioms. Then, we apply the Full Crossing procedure to obtain a specific model of the axioms, the freest model, a model in which the only true statements are the axioms and their logical consequences. Furthermore, the model is given explicitly as a subdirect product, expressed as basic atomic components (the atoms). Generalizing models are obtained by selecting certain subsets of atoms that collectively satisfy the axioms. In practical implementations, computing all atoms of the freest model is unnecessary; instead, a sparse variant of the Full Crossing procedure is used to directly calculate generalizing subsets of atoms.
  • Figure 2: Freest models using images with only the first column in black. (a) $16$ of the possible $3{,}375$ images with only the first column in black. Training examples are of the form $p < T_i$ with $T_i$ the term representing an image. (b) Top: Number of non-redundant atoms of the model obtained after a number of full-crossings. Middle: Number of non-redundant atoms of a given atom size for the model obtained after a number of full-crossings. Bottom: Same as Middle but represented by several curves, each for a different atom size. (c) Atoms of the final model. (d) Example of large atoms that are part of the models at intermediate numbers of full-crossings.
  • Figure 3: Evolution of atoms during learning. Starting with an initial model $N_0$, the crossing of duples $r_0, r_1$ and $r_2$ produces a sequence of four models $N_0,N_1,N_2, N_3$. An atom $\phi$ in the final model $N_3$ can be tracked to an atom in each of the models $N_2, N_1$ and $N_0$, forming at least one "inward chain" of four atoms $\lambda_i \in N_i$ with $\lambda_3 = \phi$. Along the chain, each atom either grows, i.e. the number of constants in the upper segment of $\lambda_{i}$ is larger than the number of constants in the upper segment of $\lambda_{i - 1}$ (red nodes), or stays the same, $\lambda_{i} = \lambda_{i - 1}$ (green nodes). Some atoms, marked with a red cross, are redundant and can be discarded. The blue line indicates an inward chain from a final atom to an initial atom. In this chain, there is one atom growth, $g(\phi_{4}^{(3)})=1$, and the final atom has been successful twice since the last growth, $h(\phi_{4}^{(3)})=2$.
  • Figure 4: Sparse Crossing of hand-written digits (MNIST dataset). (a) Frequency of the number of misses for a query of whether a test image is a $7$, for test examples of digit $7$ (green) and for the other digits (red). (b) Test accuracy increases during training. (c) Distribution of atom sizes, where atom size is the number of constants in the upper segment of the atom.
  • Figure 5: Hamiltonian cycles obtained using AML for different graphs. Sparse Crossing obtains Hamiltonian cycles in randomly generated graphs of variable edge density (first two graphs of the bottom row), modified Flower Snarks (SNm_124, bottom row, right), graph G7 of the FHCP set FHCP1001 (top row, left), Sheehan graphs of various sizes (top row, center) and generalized Petersen graphs (GPN_122, top row, right).
  • ...and 6 more figures

Theorems & Definitions (96)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8
  • Definition 9
  • Definition 10
  • ...and 86 more