Output-Constrained Decision Trees

Hüseyin Tunç, Doğanay Özese, Ş. İlker Birbil, Donato Maragno, Marco Caserta, Mustafa Baydoğan

TL;DR

This work tackles the challenge of producing feasible multi-target predictions by embedding output constraints directly into decision-tree training. It introduces three approaches for training Output-Constrained Regression Trees (OCRT): M-OCRT (based on mixed integer programming, MIP), E-OCRT (enumerative split search with constrained prediction), and EP-OCRT (post-hoc constrained correction). It also extends them to ensemble methods under convex feasible sets. Through experiments on synthetic and hierarchical time series (HTS) data, the authors show that enforcing feasibility improves decision-making in downstream optimization tasks, with E-OCRT generally offering the best accuracy and M-OCRT providing flexibility at higher computational cost. The study also discusses generalized losses and optimization-with-constraint-learning contexts, highlighting the practical impact for HTS forecasting, inventory management, and resource allocation, while noting scalability and convexity assumptions as key considerations.

Abstract

Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential uses in ensemble learning, we also introduce a random forest framework working under convex feasible sets. We validate the proposed methods through a computational study on both synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints on decision tree training yields accurate and feasible predictions.
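
The post-hoc idea behind EP-OCRT can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the feasible set is affine, {y : A y = b} (for example, a hierarchy in which a total must equal the sum of its parts), and that the correction is a Euclidean projection of each prediction onto that set. The helper name project_onto_affine and the numbers are hypothetical.

```python
import numpy as np

def project_onto_affine(y_pred, A, b):
    """Project each row of y_pred onto {y : A y = b} (Euclidean projection).

    Hypothetical helper, not from the paper. Assumes A has full row rank.
    """
    y_pred = np.atleast_2d(np.asarray(y_pred, dtype=float))
    A = np.atleast_2d(np.asarray(A, dtype=float))
    b = np.asarray(b, dtype=float).reshape(-1)
    residual = y_pred @ A.T - b                 # constraint violation per row
    lam = np.linalg.solve(A @ A.T, residual.T)  # multipliers, one column per row
    return y_pred - (A.T @ lam).T

# Targets are (total, part_1, part_2); feasibility means total = part_1 + part_2.
A = np.array([[1.0, -1.0, -1.0]])
b = np.array([0.0])

# An unconstrained tree might predict an infeasible vector: 10 != 4 + 5.
raw = np.array([[10.0, 4.0, 5.0]])
corrected = project_onto_affine(raw, A, b)
print(corrected)
```

The violation of 1 is spread equally over the three targets, so the corrected vector satisfies the constraint while staying as close as possible to the raw prediction. For a general convex feasible set, the closed-form step would have to be replaced by a call to a convex optimization solver.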

Paper Structure (21 sections, 3 theorems, 19 equations, 3 figures, 6 tables)


Key Result

Proposition 1

When the gain expression (labeled eqn:gain in the paper) serves as the objective function for E-OCRT and the loss function is the average error over the samples (e.g., MSE or MAD), M-OCRT and E-OCRT yield the same regression tree.
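
The enumerative side of this comparison can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm. The gain expression is not reproduced in this summary, so the sketch scores a split by the total squared error of the constrained leaf predictions. It further assumes an affine feasible set {y : A y = b}, for which the feasible prediction minimizing squared error in a leaf is the Euclidean projection of the leaf mean. All function names are hypothetical.

```python
import numpy as np

def project_onto_affine(y, A, b):
    """Euclidean projection of one vector y onto {y : A y = b}."""
    lam = np.linalg.solve(A @ A.T, A @ y - b)
    return y - A.T @ lam

def constrained_leaf(Y, A, b):
    """Feasible prediction minimizing squared error over rows of Y, plus its SSE."""
    pred = project_onto_affine(Y.mean(axis=0), A, b)
    sse = float(((Y - pred) ** 2).sum())
    return pred, sse

def best_split(X, Y, A, b):
    """Try every (feature, midpoint threshold) pair; keep the lowest total SSE."""
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= t
            _, sse_left = constrained_leaf(Y[left], A, b)
            _, sse_right = constrained_leaf(Y[~left], A, b)
            total = sse_left + sse_right
            if best is None or total < best[2]:
                best = (j, float(t), total)
    return best

# Toy data: one feature, targets (total, part_1, part_2) with total = part_1 + part_2.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
Y = np.array([[2.0, 1.0, 1.0],
              [2.0, 1.0, 1.0],
              [10.0, 4.0, 6.0],
              [10.0, 4.0, 6.0]])
A = np.array([[1.0, -1.0, -1.0]])
b = np.array([0.0])

feature, threshold, total_sse = best_split(X, Y, A, b)
print(feature, threshold, total_sse)
```

On this toy data the search separates the two groups of identical targets, so both leaves fit their samples exactly with feasible predictions. Growing a full tree would apply the same search recursively to each child node.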

Figures (3)

  • Figure 1: The trade-off between accuracy and feasibility. The red circle in the right plot marks the tree depth that yields the lowest mean squared error on the test set. The red bar in the left plot shows the percentage of infeasible predictions at that depth. The sample size is 4000.
  • Figure 2: MIP-based Training vs. Enumerative Training.
  • Figure 3: Average training times.

Theorems & Definitions (4)

  • Proposition 1
  • Theorem 1
  • Proposition 2
  • Proof