Neural Random Forest Imitation

Christoph Reinders, Bodo Rosenhahn

TL;DR

Neural Random Forest Imitation (NRFI) addresses data scarcity by implicitly transforming a random forest (RF) into a neural network through imitation learning. It generates labeled training data from the forest by guided routing through its decision trees, then trains a small, differentiable network to mimic the forest's predictions. The resulting network matches or exceeds RF accuracy with far fewer parameters than direct RF-to-NN mappings, can serve as a warm start for fine-tuning, and supports end-to-end optimization inside trainable pipelines, which makes it well suited to settings with very few training examples.

Abstract

We present Neural Random Forest Imitation, a novel approach for transforming random forests into neural networks. Existing methods propose a direct mapping and produce very inefficient architectures. In this work, we introduce an imitation learning approach by generating training data from a random forest and learning a neural network that imitates its behavior. This implicit transformation creates very efficient neural networks that learn the decision boundaries of a random forest. The generated model is differentiable, can be used as a warm start for fine-tuning, and enables end-to-end optimization. Experiments on several real-world benchmark datasets demonstrate superior performance, especially when training with very few training examples. Compared to state-of-the-art methods, we significantly reduce the number of network parameters while achieving the same or even improved accuracy due to better generalization.
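The imitation idea in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: scikit-learn is used for both models, the dataset (`make_moons`), the forest size, and the network size are arbitrary, and inputs are drawn uniformly at random from the bounding box of the training data. That last choice corresponds to the plain random-sampling baseline, not to the guided routing that NRFI itself uses; the sketch only shows the generate-label-imitate loop.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPRegressor

# Teacher: a random forest trained on very few examples (assumed toy dataset).
X_train, y_train = make_moons(n_samples=40, noise=0.2, random_state=0)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_train, y_train)

# Generate inputs. ASSUMPTION: uniform sampling in the training bounding box,
# i.e. the random-sampling baseline, not NRFI's guided routing.
rng = np.random.default_rng(0)
low, high = X_train.min(axis=0), X_train.max(axis=0)
X_gen = rng.uniform(low, high, size=(2000, X_train.shape[1]))

# Targets are the forest's class distributions for the generated inputs.
P_gen = forest.predict_proba(X_gen)

# Student: a small network regressing onto the forest's class distributions.
student = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
student.fit(X_gen, P_gen)

# How often the student and the forest agree on the predicted class.
agreement = np.mean(student.predict(X_gen).argmax(axis=1) == P_gen.argmax(axis=1))

# Number of trainable parameters in the student network.
n_params = sum(w.size for w in student.coefs_) + sum(b.size for b in student.intercepts_)
print(f"agreement with forest: {agreement:.2f}, student parameters: {n_params}")
```

The student's size is fixed by its architecture and does not grow with the number or depth of the trees, which is the property the abstract contrasts with direct mappings.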

Paper Structure

This paper contains 21 sections, 5 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Neural random forest imitation enables an implicit transformation of random forests into neural networks. Usually, data samples are propagated through the individual decision trees and the split decisions are evaluated during inference. We propose a method for generating input-target pairs by reversing this process and training a neural network that imitates the random forest. The resulting network is much smaller compared to current state-of-the-art methods, which directly map the random forest.
  • Figure 2: Overview of the data generation process from a decision tree. First, the class distribution information is propagated from the leaf nodes to the split nodes (a). Afterward, data samples are generated by guided routing (see the section on data generation from a decision tree) and modifying the data based on the split decisions (b). The weights for sampling the left or right child node are highlighted in orange.
  • Figure 3: Test accuracy depending on the network architecture (i.e., number of neurons in both hidden layers). Different datasets are shown per row, with an increasing number of training examples per class from left to right (indicated in parentheses). The red dashed line shows the accuracy of the random forest. NRFI with generated data is shown in orange and NRFI with generated and original data in blue. With increasing network capacity, NRFI is capable of imitating and even outperforming the random forest.
  • Figure 4: Comparison of state-of-the-art methods and our proposed method for transforming random forests into neural networks. The closer a method is to the lower-left corner, the better it is (fewer network parameters and lower test error). For neural random forest imitation, different network architectures are shown. Note that the number of network parameters is shown on a logarithmic scale.
  • Figure 5: Probability distribution of the predicted confidences for different data generation settings on Soybean with $5$ (top) and $50$ samples per class (bottom). Generating data with different numbers of decision trees is visualized in the left column. Additionally, a comparison between random sampling (red), NRFI uniform (orange), and NRFI dynamic (green) is shown in the right column. By optimizing the decision tree sampling, NRFI dynamic automatically balances the confidences and generates the most diverse and evenly distributed data.
  • ...and 1 more figure
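The two steps described in the Figure 2 caption can be sketched for a single decision tree as follows. This is an illustrative reading of the caption, not the authors' implementation: the nested-dict tree format, the helper names (`leaf`, `split`, `propagate_counts`, `generate_sample`, `route`), the assumed feature range of [0, 1], and the choice to weight each child by its propagated count for a chosen target class are all assumptions made for this example.

```python
import random

def leaf(counts):
    # Hypothetical leaf node: per-class training counts.
    return {"counts": list(counts)}

def split(feature, threshold, left, right):
    # Hypothetical split node: x[feature] <= threshold goes left.
    return {"feature": feature, "threshold": threshold, "left": left, "right": right}

def propagate_counts(node):
    """Step (a): sum the leaf class counts bottom-up onto every split node."""
    if "feature" in node:
        l = propagate_counts(node["left"])
        r = propagate_counts(node["right"])
        node["counts"] = [a + b for a, b in zip(l, r)]
    return node["counts"]

def generate_sample(root, target_class, n_features, rng, low=0.0, high=1.0):
    """Step (b): guided routing from the root, editing x to satisfy each split."""
    if root["counts"][target_class] == 0:
        raise ValueError("target class does not occur in this tree")
    x = [rng.uniform(low, high) for _ in range(n_features)]
    lo, hi = [low] * n_features, [high] * n_features  # bounds implied by the path
    node = root
    while "feature" in node:
        f, t = node["feature"], node["threshold"]
        # ASSUMPTION: child weights are the propagated counts of the target class.
        weights = [node["left"]["counts"][target_class],
                   node["right"]["counts"][target_class]]
        side = rng.choices(["left", "right"], weights=weights)[0]
        if side == "left":
            hi[f] = min(hi[f], t)
            if x[f] > t:                      # violates the split: resample
                x[f] = rng.uniform(lo[f], hi[f])
        else:
            lo[f] = max(lo[f], t)
            if x[f] <= t:                     # violates the split: resample above t
                x[f] = hi[f] - (hi[f] - lo[f]) * rng.random()
        node = node[side]
    return x

def route(root, x):
    """Ordinary tree inference: follow the split decisions to a leaf."""
    node = root
    while "feature" in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node

rng = random.Random(0)
tree = split(0, 0.5, split(1, 0.3, leaf([5, 0]), leaf([1, 2])), leaf([0, 4]))
propagate_counts(tree)
samples = [generate_sample(tree, target_class=1, n_features=2, rng=rng) for _ in range(200)]
```

Because a child with zero weight for the target class is never chosen, each generated sample lands in a leaf that contains that class when passed through ordinary inference with `route`.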