The Quantum Version Of Classification Decision Tree Constructing Algorithm C5.0
Kamil Khadiev, Ilnaz Mannapov, Liliya Safina
TL;DR
The paper addresses the computational bottleneck of constructing C5.0 decision trees on large datasets. It first introduces a classical improvement that uses a Tree Map (a self-balancing binary search tree) to store only the nonzero counts, which reduces the per-attribute work and gives a total running time of $O(hd N \log N)$. It then presents a quantum version, QC5.0, which uses amplitude amplification and the Dürr-Høyer minimum search to accelerate the selection of the best split attribute, attaining a running time of $O\big(h\sqrt{d}N \log N \log d\big)$; because the quantum subroutines are probabilistic, the result is correct with bounded error probability. Compared with the improved classical algorithm, this is an almost quadratic speed-up in the number of attributes $d$. The work informs both the classical optimization of tree building and the integration of quantum subroutines into machine-learning model construction.
Abstract
In this paper, we focus on the complexity of the C5.0 algorithm for constructing a decision tree classifier, which is a model for the classification problem in machine learning. In the classical case, the decision tree is constructed in $O(hd(NM+N \log N))$ running time, where $M$ is the number of classes, $N$ is the size of the training data set, $d$ is the number of attributes of each element, and $h$ is the tree height. First, we improve the classical version; the running time of the new version is $O(h\cdot d\cdot N\log N)$. Second, we suggest a quantum version of this algorithm, which uses quantum subroutines such as amplitude amplification and the Dürr-Høyer minimum search algorithm, both of which are based on Grover's algorithm. The running time of the quantum algorithm is $O\big(h\cdot \sqrt{d}\log d \cdot N \log N\big)$, which is better than the complexity of the classical algorithm.
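To make the size of the improvement concrete, the stated bounds can be divided by one another. This is only an asymptotic comparison that ignores the constants hidden in the $O$-notation; the symbols $T_{\mathrm{C5.0}}$, $T_{\mathrm{cl}}$ and $T_{\mathrm{q}}$ are labels introduced here for the original, improved classical, and quantum bounds, and they do not appear in the abstract itself.

$$
T_{\mathrm{C5.0}} = h\,d\,(NM + N\log N), \qquad
T_{\mathrm{cl}} = h\,d\,N\log N, \qquad
T_{\mathrm{q}} = h\,\sqrt{d}\,\log d \cdot N\log N
$$

$$
\frac{T_{\mathrm{C5.0}}}{T_{\mathrm{cl}}} = \frac{M + \log N}{\log N} = 1 + \frac{M}{\log N},
\qquad
\frac{T_{\mathrm{cl}}}{T_{\mathrm{q}}} = \frac{d}{\sqrt{d}\,\log d} = \frac{\sqrt{d}}{\log d}
$$

The first ratio shows that the classical improvement matters most when the number of classes $M$ is large relative to $\log N$. The second shows that the quantum gain depends only on $d$: for example, taking base-2 logarithms as an assumption, $d = 1024$ gives $\sqrt{d}/\log d = 32/10 = 3.2$, and the ratio grows without bound as $d$ increases.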
