Classification via Two-Way Comparisons
Marek Chrobak, Neal E. Young
TL;DR
This work addresses the problem of computing a minimum-cost two-way-comparison decision tree (2WDT) for a weighted, ordered query set $Q$ partitioned into classes. Each query is classified using equality and less-than tests, and the objective is to minimize $\sum_{q\in Q} w(q)\cdot \mathrm{depth}(q)$. The authors introduce a laminar-decision-tree (LDT) framework, prove an imbalance theorem, bound the path structure via a generalized rotation, and show that some optimal tree is admissible. They then give a dynamic-programming algorithm that computes a minimum-cost 2WDT in $O(n^3 m)$ time, where $n=|Q|$ and $m=\sum_{c\in\mathcal{C}}|c|$. The algorithm extends to the setting where each query has multiple allowed classes and to other inequality tests (e.g., $\le$). This is the first polynomial-time algorithm for minimum-cost 2WDTs, with applications to efficient dispatch and classification trees. The work also handles ties among weights through tie-breaking perturbations, and the LDT framework applies to laminar tests beyond the two-test setting.
Abstract
Given a weighted, ordered query set $Q$ and a partition of $Q$ into classes, we study the problem of computing a minimum-cost decision tree that, given any query $q$ in $Q$, uses equality tests and less-than comparisons to determine the class to which $q$ belongs. Such a tree can be much smaller than a lookup table, and much faster and smaller than a conventional search tree. We give the first polynomial-time algorithm for the problem. The algorithm extends naturally to the setting where each query has multiple allowed classes.
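To make the cost objective concrete, the following minimal Python sketch evaluates the weighted depth $\sum_{q\in Q} w(q)\cdot \mathrm{depth}(q)$ of a given tree on a toy instance. It is not the authors' algorithm: it only scores a tree that is supplied by hand. The names `Leaf`, `Test`, `classify`, and `cost`, the toy weights, and the convention that depth counts the tests performed on the path to a leaf are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass
class Leaf:
    label: str  # class reported for every query that reaches this leaf


@dataclass
class Test:
    op: str        # "==" for an equality test, "<" for a less-than test
    key: int       # the key the query is compared against
    yes: "Node"    # subtree followed when the test succeeds
    no: "Node"     # subtree followed when the test fails


Node = Union[Leaf, Test]


def classify(tree: Node, q: int) -> Tuple[str, int]:
    """Return (class label, number of tests performed) for query q."""
    depth = 0
    node = tree
    while isinstance(node, Test):
        ok = (q == node.key) if node.op == "==" else (q < node.key)
        node = node.yes if ok else node.no
        depth += 1
    return node.label, depth


def cost(tree: Node, weights: Dict[int, int], classes: Dict[int, str]) -> int:
    """Weighted depth of the tree; raises if any query is misclassified."""
    total = 0
    for q, w in weights.items():
        label, depth = classify(tree, q)
        if label != classes[q]:
            raise ValueError(f"query {q} classified as {label}, expected {classes[q]}")
        total += w * depth
    return total


# Toy instance (hypothetical): query 3 is heavy and forms its own class.
weights = {1: 1, 2: 1, 3: 10, 4: 1, 5: 1}
classes = {1: "A", 2: "A", 3: "B", 4: "A", 5: "A"}

# A single equality test separates class B from class A.
eq_tree = Test("==", 3, Leaf("B"), Leaf("A"))

# Using only less-than tests, query 3 must be isolated from both sides.
lt_tree = Test("<", 3, Leaf("A"), Test("<", 4, Leaf("B"), Leaf("A")))

print(cost(eq_tree, weights, classes))
print(cost(lt_tree, weights, classes))
```

On this toy instance the tree with one equality test is cheaper than the tree restricted to less-than tests, which illustrates why allowing both kinds of test can pay off.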
