Minimising changes to audit when updating decision trees

Anj Simmons, Scott Barnett, Anupam Chaudhuri, Sankhya Singh, Shangeetha Sivasothy

TL;DR

The paper updates an interpretable decision tree on new data while minimising the changes a human must audit. At each split, the Keep-Regrow algorithm greedily chooses between keeping the existing condition and regrowing that subtree, optimising the joint loss $L(r_{t-1}, r_t, D_t) = f(r_t, D_t) + \alpha c(r_t) + \beta \Delta(r_{t-1}, r_t)$. Pruning is integrated after regrowth, with Preserve Split and Terminate options to control tree complexity and auditability. Evaluations on six UCI datasets show that Keep-Regrow achieves a favourable trade-off between accuracy, tree size, and audit similarity, with practical defaults $\alpha=5$ and $\beta=1$. The work positions itself among incremental decision trees and interpretability research, and outlines future work in human-in-the-loop testing and lifecycle integration.
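To make the joint loss concrete, the following is a minimal Python sketch of how a keep-versus-regrow decision at a single split could be scored. It is an illustration under stated assumptions, not the authors' implementation: the `Candidate` class, the `joint_loss` and `choose` helpers, and the numeric values are all hypothetical, and the three terms are assumed to be a misclassification count for $f$, a leaf count for $c$, and a count of changed nodes for $\Delta$. Only the form of the loss and the defaults $\alpha=5$, $\beta=1$ come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One option at a split (hypothetical container, not from the paper)."""
    name: str          # "keep" or "regrow"
    fit_loss: float    # f(r_t, D_t): assumed here to be a misclassification count
    complexity: float  # c(r_t): assumed here to be the number of leaves
    changes: float     # Delta(r_{t-1}, r_t): assumed count of changed nodes


def joint_loss(cand: Candidate, alpha: float = 5.0, beta: float = 1.0) -> float:
    """L = f + alpha * c + beta * Delta, with the defaults quoted in the TL;DR."""
    return cand.fit_loss + alpha * cand.complexity + beta * cand.changes


def choose(candidates, alpha: float = 5.0, beta: float = 1.0) -> Candidate:
    """Greedy step: pick the candidate with the lowest joint loss."""
    return min(candidates, key=lambda c: joint_loss(c, alpha, beta))


# Made-up numbers for illustration only.
keep = Candidate("keep", fit_loss=12, complexity=3, changes=0)
regrow = Candidate("regrow", fit_loss=4, complexity=4, changes=2)

print(choose([keep, regrow]).name)            # regrow (26 vs 27)
print(choose([keep, regrow], beta=3.0).name)  # keep   (27 vs 30)
```

In this toy case, raising $\beta$ from 1 to 3 flips the decision from regrowing to keeping, which is the intended role of the change penalty: a larger $\beta$ trades some fit for fewer changes to audit.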

Abstract

Interpretable models are important, but what happens when the model is updated on new training data? We propose an algorithm for updating a decision tree while minimising the number of changes to the tree that a human would need to audit. We achieve this via a greedy approach that incorporates the number of changes to the tree as part of the objective function. We compare our algorithm to existing methods and show that it sits in a sweet spot between final accuracy and number of changes to audit.

Paper Structure (20 sections, 10 equations, 1 figure, 1 table)


Figures (1)

  • Figure 1: Initial decision tree at t=0 (left), and updated tree at t=1 (right) produced by our algorithm. Note how our algorithm 'keeps' Node 1 and Node 2 to minimise changes to audit and 'regrows' Node 3 for improved accuracy.