Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

Harmender Gahlawat, Meirav Zehavi

TL;DR

This paper examines the parameterized complexity of learning small decision trees under a tolerance for misclassified examples, introducing DTSO and DTDO as outlier-tolerant variants that seek small trees while allowing up to t errors. It proves W[1]-hardness for DTSO under s+δ_max and for DTDO under d+δ_max, and shows fixed-parameter tractability when including t as a parameter, extending prior DTS/DTD results. The work also analyzes kernelization, proving a trivial polynomial kernel for DTSO/DTDO when parameterized by D_max+|F| with fixed δ_max, while establishing incompressibility for DTD parameterized by d+|F| even for δ_max ≤ 3. These results clarify the limits of FPT approaches and kernelization for outlier-tolerant decision-tree learning and motivate future work on Pareto-optimal trade-offs and alternate parameterizations. Overall, the paper advances theoretical understanding of when small, robust decision trees are computationally feasible and under which conditions preprocessing (kernelization) is unlikely to help.

Abstract

Decision trees are a fundamental tool in machine learning for representing, classifying, and generalizing data. It is desirable to construct ``small'' decision trees, by minimizing either the \textit{size} ($s$) or the \textit{depth} ($d$) of the \textit{decision tree} (\textsc{DT}). Recently, the parameterized complexity of \textsc{Decision Tree Learning} has attracted a lot of attention. We consider a generalization of \textsc{Decision Tree Learning} where given a \textit{classification instance} $E$ and an integer $t$, the task is to find a ``small'' \textsc{DT} that disagrees with $E$ in at most $t$ examples. We consider two problems: \textsc{DTSO} and \textsc{DTDO}, where the goal is to construct a \textsc{DT} minimizing $s$ and $d$, respectively. We first establish that both \textsc{DTSO} and \textsc{DTDO} are W[1]-hard when parameterized by $s+\delta_{max}$ and $d+\delta_{max}$, respectively, where $\delta_{max}$ is the maximum number of features in which two differently labeled examples can differ. We complement this result by showing that these problems become \textsc{FPT} if we include the parameter $t$. We also consider the kernelization complexity of these problems and establish several positive and negative results for both \textsc{DTSO} and \textsc{DTDO}.
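
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It counts the examples a given tree misclassifies (to compare against $t$), measures the tree's size and depth, and computes $\delta_{max}$ by brute force over pairs of differently labeled examples. The `Node` class, the helper names, the threshold convention (go left when the feature value is at most the threshold), and the choice to count size as the number of test nodes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional, Sequence, Tuple

# An example is (feature values, label). This encoding is an assumption.
Example = Tuple[Tuple[int, ...], bool]


@dataclass
class Node:
    """Hypothetical tree node: a leaf when `feature` is None."""
    feature: Optional[int] = None    # index of the tested feature
    threshold: int = 0               # assumed rule: value <= threshold goes left
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: bool = False              # prediction, used only at leaves


def classify(tree: Node, x: Sequence[int]) -> bool:
    """Follow tests from the root down to a leaf and return its label."""
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def outliers(tree: Node, examples: Sequence[Example]) -> int:
    """Number of examples on which the tree disagrees with the instance."""
    return sum(1 for x, y in examples if classify(tree, x) != y)


def size(tree: Node) -> int:
    """Number of test (internal) nodes; an assumed definition of size."""
    if tree.feature is None:
        return 0
    return 1 + size(tree.left) + size(tree.right)


def depth(tree: Node) -> int:
    """Maximum number of test nodes on a root-to-leaf path."""
    if tree.feature is None:
        return 0
    return 1 + max(depth(tree.left), depth(tree.right))


def delta_max(examples: Sequence[Example]) -> int:
    """Largest number of features on which two differently labeled examples differ."""
    best = 0
    for (x1, y1), (x2, y2) in combinations(examples, 2):
        if y1 != y2:
            best = max(best, sum(a != b for a, b in zip(x1, x2)))
    return best


# Toy instance: the label is the logical OR of two binary features.
examples = [((0, 0), False), ((0, 1), True), ((1, 0), True), ((1, 1), True)]

# A one-test tree on feature 0; it misclassifies only the example (0, 1).
stump = Node(feature=0, threshold=0,
             left=Node(label=False), right=Node(label=True))
```

On this toy instance the stump has size 1 and depth 1 and misclassifies one example, so under these assumed definitions it would be an acceptable answer for either problem with $t = 1$ but not with $t = 0$.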

Paper Structure

This paper contains 12 sections, 22 theorems, 4 figures.

Key Result

Theorem 1.1

DTDO and DTSO are $W[1]$-hard when parameterized by $d+\delta_{max}$ and $s+\delta_{max}$, respectively. Further, they are $W[1]$-hard parameterized by $d$ and $s$, respectively, even if $\delta_{max} \leq 3$.

Figures (4)

  • Figure 1: Here $(a)$ is the graph $G$ and $(b)$ is the $\mathsf{CI}$ $E'$ corresponding to $G$. In $(c)$, we illustrate an example for constructing $E$ from $E'$ for $\eta = 2$.
  • Figure 2: Illustration of $T$ corresponding to a set $S= \{u_1,\ldots, u_k\}$. Here, for $i\in [k]$, $v_i$ is a test node with $f(v_i)=u_i$ and $\lambda(v_i) = 0$. Moreover, the right child $l_i$ of $v_i$ is a negative leaf, and the left child $l'$ of $v_k$ is a positive leaf.
  • Figure 3: The construction of $E$ from $\overline{E}(\mathcal{I})$ and $E(\mathcal{I}_j)$, for $j\in [N]$.
  • Figure 4: The construction of $T''_l$ from $T_u$.

Theorems & Definitions (39)

  • Theorem 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Proposition 2.1: ordyniak
  • Proposition 2.3: RBDSIncompressibility
  • Proposition 3.1: pvcHard
  • proof
  • Lemma 3.4
  • proof
  • ...and 29 more