Minimising changes to audit when updating decision trees
Anj Simmons, Scott Barnett, Anupam Chaudhuri, Sankhya Singh, Shangeetha Sivasothy
TL;DR
Interpretable decision trees are updated on new data while minimising the number of changes that must be audited. The Keep-Regrow algorithm greedily chooses, at each split, between keeping the existing condition and regrowing that subtree, optimising the joint loss $L(r_{t-1}, r_t, D_t) = f(r_t, D_t) + \alpha c(r_t) + \beta \Delta(r_{t-1}, r_t)$. Pruning is integrated after regrowth, with Preserve Split and Terminate options to control tree complexity and auditability. Evaluations on six UCI datasets show that Keep-Regrow achieves a favourable trade-off between accuracy, tree size, and audit similarity, with practical defaults of $\alpha=5$ and $\beta=1$. The work positions itself among incremental decision trees and interpretability research, and outlines future work in human-in-the-loop testing and lifecycle integration.
Abstract
Interpretable models are important, but what happens when the model is updated on new training data? We propose an algorithm for updating a decision tree while minimising the number of changes to the tree that a human would need to audit. We achieve this via a greedy approach that incorporates the number of changes to the tree as part of the objective function. We compare our algorithm to existing methods and show that it sits in a sweet spot between final accuracy and number of changes to audit.
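As a rough illustration of how the number of changes enters the objective, the sketch below scores two hypothetical candidates for one split using the joint loss quoted in the TL;DR. It is a minimal sketch and not the authors' implementation: the `Candidate` class, the function names, and the example numbers are invented for illustration, and it assumes that $f$ counts misclassified samples, $c$ counts leaves, and $\Delta$ counts changed nodes, none of which this section specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One option for a split (hypothetical structure, for illustration only)."""
    name: str     # "keep" or "regrow"
    errors: int   # assumed f(r_t, D_t): misclassified samples on the new data
    size: int     # assumed c(r_t): number of leaves in the resulting subtree
    changes: int  # assumed Delta(r_{t-1}, r_t): changed nodes a human must audit


def joint_loss(cand: Candidate, alpha: float = 5.0, beta: float = 1.0) -> float:
    """L = f + alpha * c + beta * Delta, with the defaults quoted in the TL;DR."""
    return cand.errors + alpha * cand.size + beta * cand.changes


def choose(candidates, alpha: float = 5.0, beta: float = 1.0) -> Candidate:
    """Greedy step: pick the candidate with the lowest joint loss."""
    return min(candidates, key=lambda c: joint_loss(c, alpha, beta))


# Keeping the old split changes nothing; regrowing fits better but alters 3 nodes.
keep = Candidate("keep", errors=30, size=4, changes=0)
regrow = Candidate("regrow", errors=20, size=5, changes=3)

print(choose([keep, regrow]).name)            # regrow (48.0 vs 50.0)
print(choose([keep, regrow], beta=5.0).name)  # keep   (50.0 vs 60.0)
```

With these made-up numbers, raising $\beta$ from 1 to 5 flips the decision from regrowing to keeping, which is the kind of accuracy-versus-audit trade-off the objective is meant to expose.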
