Algebraic Machine Learning: Learning as computing an algebraic decomposition of a task
Fernando Martin-Maroto, Nabil Abderrahaman, David Mendez, Gonzalo G. de Polavieja
TL;DR
Algebraic Machine Learning (AML) gives learning an algebraic foundation: a task is encoded as axioms in a semilattice, and Full Crossing derives the freest model of those axioms. Learning then proceeds by extracting discriminative atoms and selecting generalizing subsets of them with Sparse Crossing, guided by trace invariants. The approach learns from data without architecture search, reaches performance comparable to optimized multilayer perceptrons on MNIST, FashionMNIST, CIFAR-10, and medical images, and solves formal problems such as finding Hamiltonian cycles from their specifications without search. The key contributions are atomized semilattices, the Full Crossing/freest model framework, trace-based duals, and a scalable Sparse Crossing algorithm that yields compact, interpretable rule-atoms with potential for hybrid algebraic-statistical systems. Overall, AML offers a transparent, additive, and potentially explainable route to learning that scales through algebraic decomposition rather than gradient-based optimization.
Abstract
Statistics and Optimization are foundational to modern Machine Learning. Here, we propose an alternative foundation based on Abstract Algebra, with mathematics that facilitates the analysis of learning. In this approach, the goal of the task and the data are encoded as axioms of an algebra, and a model is obtained where only these axioms and their logical consequences hold. Although this is not a generalizing model, we show that selecting specific subsets of its breakdown into algebraic atoms obtained via subdirect decomposition gives a model that generalizes. We validate this new learning principle on standard datasets such as MNIST, FashionMNIST, CIFAR-10, and medical images, achieving performance comparable to optimized multilayer perceptrons. Beyond data-driven tasks, the new learning principle extends to formal problems, such as finding Hamiltonian cycles from their specifications and without relying on search. This algebraic foundation offers a fresh perspective on machine intelligence, featuring direct learning from training data without the need for a validation dataset, scaling through model additivity, and asymptotic convergence to the underlying rule in the data.
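To make "axioms of an algebra" and "algebraic atoms" concrete, the following is a minimal sketch of how an order relation in a semilattice can be read off a set of atoms. It is an illustration under assumptions that the abstract does not state: terms are modeled as sets of constants (the join is set union), an atom is modeled as a set of constants and lies below a term when the two share a constant, and `a <= b` holds when every atom below `a` is also below `b`. The helper names (`term`, `atom_below`, `leq`, `freest_atoms`, `satisfies`) are hypothetical, and the code implements neither Full Crossing nor Sparse Crossing.

```python
def term(*constants):
    """A term is the idempotent join of constants, modeled as a frozenset."""
    return frozenset(constants)


def atom_below(atom, t):
    """Assumed convention: an atom is below a term if they share a constant."""
    return not atom.isdisjoint(t)


def leq(a, b, atoms):
    """a <= b holds in the model if every atom below a is also below b."""
    return all(atom_below(phi, b) for phi in atoms if atom_below(phi, a))


def freest_atoms(constants):
    """With no axioms at all, one atom per constant: a <= b iff a is a subset of b."""
    return [frozenset({c}) for c in constants]


def satisfies(atoms, positive, negative):
    """Check a model against axioms given as (left, right) pairs.

    positive: pairs that must satisfy left <= right.
    negative: pairs that must NOT satisfy left <= right.
    """
    holds = all(leq(a, b, atoms) for a, b in positive)
    violated = any(leq(a, b, atoms) for a, b in negative)
    return holds and not violated


if __name__ == "__main__":
    free = freest_atoms(["x", "y", "z"])
    print(leq(term("x"), term("x", "y"), free))  # x is part of the join x v y
    print(leq(term("x"), term("y"), free))       # no axiom relates x and y

    # Hand-chosen atoms that make the axiom x <= y hold without forcing y <= x.
    atoms = [frozenset({"x", "y"}), frozenset({"y"}), frozenset({"z"})]
    print(satisfies(atoms,
                    positive=[(term("x"), term("y"))],
                    negative=[(term("y"), term("x"))]))
```

In this toy setting a model is nothing more than a list of atoms, so two atom lists that each satisfy the positive axioms can be concatenated and still satisfy them, which gives one informal reading of the "model additivity" mentioned in the abstract.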
