Optimal Decision Tree and Adaptive Submodular Ranking with Noisy Outcomes

Su Jia, Fatemeh Navidi, Viswanath Nagarajan, R. Ravi

TL;DR

This work studies Optimal Decision Tree with Noise (ODTN) and its generalization, Adaptive Submodular Ranking with Noise (ASRN), where test outcomes may be noisy and the noise is persistent. It develops a unified framework that connects ODTN to ASRN and to Submodular Function Ranking with Noise (SFRN), yielding noise-tolerant approximation guarantees in both non-adaptive and adaptive settings. The results include a polynomial-time $O(\log \frac{1}{\varepsilon})$-approximation for non-adaptive instances, an adaptive guarantee of $O(\min\{c,r\} + \log \frac{m}{\varepsilon})$ under low noise, and a sparsity-based $O(m^{\alpha} + \log m \cdot OPT)$-type bound under high noise, along with a robust Membership Oracle and a Stochastic Set Cover perspective. Experiments on identifying toxic chemicals and learning linear classifiers show costs close to information-theoretic lower bounds, which suggests the approach is practical for noisy diagnostic and learning tasks.

Abstract

In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points. This can be formulated as the classical Optimal Decision Tree (ODT) problem: Given a set of tests, a set of hypotheses, and an outcome for each pair of test and hypothesis, our objective is to find a low-cost testing procedure (i.e., decision tree) that identifies the true hypothesis. This optimization problem has been extensively studied under the assumption that each test generates a deterministic outcome. However, in numerous applications, for example, clinical trials, the outcomes may be uncertain, which renders the ideas from the deterministic setting invalid. In this work, we study a fundamental variant of the ODT problem in which some test outcomes are noisy, even in the more general case where the noise is persistent, i.e., repeating a test gives the same noisy output. Our approximation algorithms provide guarantees that are nearly best possible and hold for the general case of a large number of noisy outcomes per test or per hypothesis where the performance degrades continuously with this number. We numerically evaluated our algorithms for identifying toxic chemicals and learning linear classifiers, and observed that our algorithms have costs very close to the information-theoretic minimum.
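
To make the problem statement concrete, the following is a minimal sketch of an ODT instance with persistent noise. The encoding is an assumption made for illustration and is not taken from the text above: each table entry is +1 or -1 for a deterministic outcome, or `None` for a noisy one, and the helper name `consistent_hypotheses` is hypothetical. Because noise is persistent, each test contributes a single observed outcome, and a noisy entry cannot rule its hypothesis out.

```python
from typing import Dict, List, Optional

# Assumed encoding: table[h][t] is the outcome of test t under hypothesis h.
# +1 / -1 are deterministic outcomes; None marks a noisy entry.
OutcomeTable = List[List[Optional[int]]]


def consistent_hypotheses(table: OutcomeTable, observed: Dict[int, int]) -> List[int]:
    """Return the hypotheses that the observed outcomes do not rule out.

    `observed` maps a test index to the single outcome seen for that test.
    Persistent noise means repeating a test yields no new information,
    so one outcome per test is all a procedure can ever observe.
    """
    alive = []
    for h, row in enumerate(table):
        # A noisy entry (None) is compatible with either outcome.
        if all(row[t] is None or row[t] == y for t, y in observed.items()):
            alive.append(h)
    return alive


# Toy instance: 3 hypotheses (rows) and 3 tests (columns).
table: OutcomeTable = [
    [+1, +1, None],
    [+1, -1, -1],
    [-1, None, +1],
]

after_one_test = consistent_hypotheses(table, {0: +1})
after_two_tests = consistent_hypotheses(table, {0: +1, 2: +1})
print(after_one_test, after_two_tests)
```

In this toy instance, test 0 alone leaves two candidates and adding test 2 identifies hypothesis 0. A testing procedure would choose tests so that the cost of reaching a single candidate is low.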

Paper Structure

This paper contains 39 sections, 26 theorems, 54 equations, 1 figure, 6 tables, 5 algorithms.

Key Result

Theorem 3

There is a polynomial-time algorithm whose cost is $O(\log \frac{1}{\varepsilon})$ times the optimum for any SFR instance with separability parameter $\varepsilon>0$.
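
The theorem is stated here without its algorithm. As a rough illustration only, the sketch below implements a greedy rule in the style commonly used for submodular function ranking: at each step it picks the element with the largest weighted marginal gain, where each uncovered function's gain is normalized by its remaining deficit. The function name `greedy_sfr`, the representation of each function as a Python callable with values in [0, 1], and the toy instance are assumptions, and this sketch should not be read as the exact procedure behind the stated guarantee.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

SetFn = Callable[[FrozenSet[int]], float]
TOL = 1e-12  # tolerance for deciding that a function has reached value 1


def greedy_sfr(
    n_elements: int, functions: Sequence[SetFn], weights: Sequence[float]
) -> Tuple[List[int], List[Optional[int]]]:
    """Order elements greedily; return the order and each function's cover time.

    A function is covered once its value on the chosen prefix reaches 1.
    Its cover time is the length of the shortest such prefix.
    """
    order: List[int] = []
    prefix: FrozenSet[int] = frozenset()
    remaining = set(range(n_elements))
    cover_time: List[Optional[int]] = [
        0 if f(prefix) >= 1.0 - TOL else None for f in functions
    ]

    while remaining and any(t is None for t in cover_time):

        def score(e: int) -> float:
            total = 0.0
            for i, f in enumerate(functions):
                if cover_time[i] is None:
                    cur = f(prefix)
                    # Marginal gain normalized by the remaining deficit.
                    total += weights[i] * (f(prefix | {e}) - cur) / (1.0 - cur)
            return total

        best = max(sorted(remaining), key=score)  # ties go to the smallest index
        order.append(best)
        remaining.remove(best)
        prefix = prefix | {best}
        for i, f in enumerate(functions):
            if cover_time[i] is None and f(prefix) >= 1.0 - TOL:
                cover_time[i] = len(order)
    return order, cover_time


# Toy instance: each function is covered as soon as the prefix hits its set.
targets = [{0}, {0, 1}, {2}]
functions = [lambda S, A=A: 1.0 if S & A else 0.0 for A in targets]
weights = [1.0, 1.0, 1.0]

order, times = greedy_sfr(3, functions, weights)
cost = sum(w * t for w, t in zip(weights, times))
print(order, times, cost)
```

Here element 0 covers two functions at once, so it is chosen first, and element 2 follows because element 1 no longer adds anything. The weighted sum of cover times is the quantity such a ranking tries to keep small.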

Figures (1)

  • Figure 1: Connections between related problems. Edges represent (direct) reductions between problems. The test cover problem [de2003approximation], which has not been mentioned so far, is essentially a non-adaptive version of the ODT problem and can therefore be reduced to the SFR problem. The new problems introduced in this work are highlighted in red.

Theorems & Definitions (42)

  • Definition 1: Cover Time and Cost
  • Definition 2: Separability
  • Theorem 3 [azar2011ranking]
  • Definition 4: Adaptive Policy
  • Definition 5: Cover Time, Adaptive Setting
  • Theorem 6 [navidi2016adaptive]
  • Definition 7: Consistency of Response Vectors
  • Definition 8: Conditional Cover Time
  • Definition 9: Cost of a Policy
  • Definition 10: Expanded Scenarios
  • ...and 32 more