Exploring Loss Design Techniques For Decision Tree Robustness To Label Noise

Lukasz Sztukiewicz; Jack Henry Good; Artur Dubrawski

TL;DR

Loss correction and symmetric losses, both standard approaches from deep learning, are shown to be ineffective for decision trees; other directions need to be explored to improve the robustness of decision trees to label noise.

Abstract

In the real world, data is often noisy, affecting not only the quality of features but also the accuracy of labels. Current research on mitigating label errors stems primarily from advances in deep learning, and a gap exists in exploring interpretable models, particularly those rooted in decision trees. In this study, we investigate whether ideas from deep learning loss design can be applied to improve the robustness of decision trees. In particular, we show that loss correction and symmetric losses, both standard approaches, are not effective. We argue that other directions need to be explored to improve the robustness of decision trees to label noise.

Paper Structure (7 sections, 2 theorems, 13 equations, 4 figures, 3 tables)

Theory and proofs
Tree fitting as loss minimization
Proof of Theorem 1 (forward correction)
Backward loss correction for decision trees
Proof of Theorem 2 (symmetric losses)
Data sets used in experiments
Experimental results

Key Result

Theorem 1

For any loss function where the minimizing leaf value is the weighted mean, the loss value for a given tree structure is invariant to forward loss correction.
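
The invariance can be checked numerically for a single leaf. The following is a minimal sketch, not the authors' code. It assumes a weighted squared-error (Brier) loss on one-hot labels, an invertible noise transition matrix `T` (a hypothetical example with `T[i, j]` the probability that clean class `i` is observed as class `j`), and an unconstrained leaf value. Under these assumptions the uncorrected loss is minimized by the weighted mean of the labels, while the forward-corrected loss scores the leaf value `p` through `T.T @ p`, so its minimizer simply solves `T.T @ p = mean` and reaches the same loss value.

```python
import numpy as np


def leaf_loss(pred, labels, weights):
    """Weighted squared-error loss of one leaf prediction against one-hot labels."""
    diff = labels - pred
    return float(np.sum(weights * np.sum(diff ** 2, axis=1)))


rng = np.random.default_rng(0)
n_samples, n_classes = 50, 3

# Synthetic (noisy) one-hot labels and sample weights for the samples in one leaf.
labels = np.eye(n_classes)[rng.integers(0, n_classes, size=n_samples)]
weights = rng.uniform(0.5, 1.5, size=n_samples)

# Hypothetical transition matrix: T[i, j] = P(noisy label = j | clean label = i).
T = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

# Uncorrected loss: the minimizing leaf value is the weighted mean of the labels.
mean = np.average(labels, axis=0, weights=weights)
plain = leaf_loss(mean, labels, weights)

# Forward-corrected loss: the leaf value p is scored through T.T @ p.
# Its (unconstrained) minimizer solves T.T @ p = mean.
p = np.linalg.solve(T.T, mean)
corrected = leaf_loss(T.T @ p, labels, weights)

print(np.isclose(plain, corrected))
```

Because both minimized losses coincide for every leaf, any split criterion built from them ranks candidate splits identically, which is the sense in which the tree structure is unaffected. Note that `p` from the unconstrained solve may fall outside the probability simplex; the sketch does not address that case.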

Figures (4)

Figure 1: Performance of Decision Tree, Extra Trees, and Random Forest models on the "wine" data set. We show the weighted F1 score of forward and backward loss-corrected models as well as models without loss correction. Reported scores are averages over ten-fold cross-validation, plotted with standard deviation.
Figure 2: Performance of Decision Tree models on six benchmarking data sets. We show the weighted F1 score of forward and backward loss-corrected models as well as models without loss correction. Reported scores are averages over ten-fold cross-validation, plotted with standard deviation.
Figure 3: Performance of Random Forest models on six benchmarking data sets. We show the weighted F1 score of forward and backward loss-corrected models as well as models without loss correction. Reported scores are averages over ten-fold cross-validation, plotted with standard deviation.
Figure 4: Performance of Extra Trees models on six benchmarking data sets. We show the weighted F1 score of forward and backward loss-corrected models as well as models without loss correction. Reported scores are averages over ten-fold cross-validation, plotted with standard deviation.

Theorems & Definitions (2)

Theorem 1
Theorem 2
