Boosted Trees on a Diet: Compact Models for Resource-Constrained Devices

Jan Stenkamp, Nina Herrmann, Benjamin Karic, Stefan Oehmcke, Fabian Gieseke

TL;DR

The paper tackles the challenge of running boosted decision trees on memory-constrained IoT devices. It introduces toad, a framework that combines memory-aware training with a pointer-free, bit-wise memory layout, compressing ensembles by promoting reuse of features, thresholds, and leaf values. The key contributions are a linear reuse penalty for features and thresholds, a global threshold/feature map, and a global leaf values array, which together achieve 4–16× memory reductions with little or no loss in predictive performance across eight datasets. This enables edge analytics and real-time decisions on tiny devices, supporting remote monitoring and autonomous operation in energy-constrained environments.

Abstract

Deploying machine learning models on compute-constrained devices has become a key building block of modern IoT applications. In this work, we present a compression scheme for boosted decision trees, addressing the growing need for lightweight machine learning models. Specifically, we provide techniques for training compact boosted decision tree ensembles that exhibit a reduced memory footprint by rewarding, among other things, the reuse of features and thresholds during training. Our experimental evaluation shows that, using an adapted training process and an alternative memory layout, our models achieve the same performance as LightGBM models at a compression ratio of 4-16x. Once deployed, the corresponding IoT devices can operate autonomously, without constant communication or an external energy supply, requiring only minimal computing power and energy. This capability opens the door to a wide range of IoT applications, including remote monitoring, edge analytics, and real-time decision making in isolated or power-limited environments.
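To make the idea of rewarding reuse during training more concrete, here is a minimal Python sketch of penalized split selection. It is not the authors' implementation: the names `Candidate`, `penalized_gain`, and `pick_split`, as well as the exact form of the penalty (a fixed cost subtracted from the gain whenever a split would introduce a feature or threshold the ensemble has not used yet), are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical split candidate: feature index, threshold, raw gain."""
    feature: int
    threshold: float
    gain: float


def penalized_gain(c, used_features, used_thresholds,
                   feature_penalty, threshold_penalty):
    """Subtract a fixed cost for each new feature or threshold (assumed form)."""
    score = c.gain
    if c.feature not in used_features:
        score -= feature_penalty      # split would add a new feature
    if (c.feature, c.threshold) not in used_thresholds:
        score -= threshold_penalty    # split would add a new threshold
    return score


def pick_split(candidates, used_features, used_thresholds,
               feature_penalty=0.0, threshold_penalty=0.0):
    """Return the candidate with the highest penalized gain."""
    return max(
        candidates,
        key=lambda c: penalized_gain(
            c, used_features, used_thresholds,
            feature_penalty, threshold_penalty,
        ),
    )


# The ensemble already uses feature 0 with threshold 1.5.
used_features = {0}
used_thresholds = {(0, 1.5)}
candidates = [Candidate(0, 1.5, 10.0), Candidate(1, 0.3, 10.5)]

# Without penalties the slightly better new feature wins; with a feature
# penalty of 1.0 the already-used split is preferred.
no_penalty = pick_split(candidates, used_features, used_thresholds)
with_penalty = pick_split(candidates, used_features, used_thresholds,
                          feature_penalty=1.0)
```

In this sketch, a split that is only marginally better than an already-stored one loses once the penalty is applied, so the model trades a small amount of gain for a smaller set of distinct features and thresholds.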

Paper Structure

This paper contains 24 sections, 9 equations, 25 figures, 1 table.

Figures (25)

  • Figure 1: A machine learning model (decision tree) on a microcontroller processes multi-sensor data locally and transmits only relevant events, reducing energy costs; the decision tree must have a minimal compute and memory footprint.
  • Figure 2: High-level sketch of the memory layout used to store an ensemble of boosted decision trees. The first part stores some metadata, such as the number $K$ of boosted trees or the maximum depth of all trees. The following three parts encode the used features, thresholds, and leaf values. Finally, the references to the features and thresholds for the individual trees are stored.
  • Figure 3: Illustration of a bit-wise encoding of a boosted tree ensemble with two trees. Each tree is stored in a bit-wise manner, with each internal node storing a reference to a feature index and a reference to a threshold index. For instance, for the left child $n_2$ of the root of the tree $t_1$, a reference to feature $f_2$ is stored along with a reference to the associated threshold $\mu_{2}^2$. For feature $f_2$, there are two thresholds used by the entire ensemble, namely $\mu_{1}^2$ and $\mu_{2}^2$, which can be used by any node of any of the trees (e.g., $t_2$ in node $n_3$). Accordingly, the leaf values stored in the leaves of the trees are shared (e.g., $v_4$ is used in leaf $l_4$ of tree $t_1$ and leaf $l_1$ of $t_2$) and stored in one array. Since the bit-size of the thresholds varies, additional metadata is stored in the Feature & Threshold Mapping table/array. For instance, there are two thresholds for feature $f_1$ of bit-size 2 (i.e., four different values), whereas there are two 1-bit thresholds for feature $f_2$.
  • Figure 4: Accuracy vs. memory (KB) for toad and baselines. Each point shows the best model performance at a given memory limit from the hyperparameter analysis.
  • Figure 5: Model performance on California Housing with a 1 KB memory limit under varying penalties.
  • ...and 20 more figures
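The pointer-free layout described for Figures 2 and 3 can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's actual format: the helper names (`bits_needed`, `encode_nodes`, `decode_nodes`, `BitReader`), the bit order, and the choice to size each reference by the number of entries it must address are all assumptions. The sketch covers only the per-node references into a shared threshold table; it does not model the metadata, the leaf-value array, or the variable bit-size of the threshold values themselves.

```python
def bits_needed(n):
    """Bits required to address n distinct entries (at least 1)."""
    return max(1, (n - 1).bit_length())


class BitReader:
    """Reads fixed-width fields from an integer, most significant first."""

    def __init__(self, acc, total_bits):
        self.acc = acc
        self.remaining = total_bits

    def read(self, width):
        self.remaining -= width
        return (self.acc >> self.remaining) & ((1 << width) - 1)


def encode_nodes(nodes, thresholds):
    """Pack (feature_ref, threshold_ref) pairs into one integer.

    thresholds[f] is the shared list of thresholds for feature f
    (a hypothetical stand-in for the global feature/threshold map).
    """
    feature_width = bits_needed(len(thresholds))
    acc, total_bits = 0, 0
    for f, t in nodes:
        fields = ((f, feature_width), (t, bits_needed(len(thresholds[f]))))
        for value, width in fields:
            acc = (acc << width) | value
            total_bits += width
    return acc, total_bits


def decode_nodes(acc, total_bits, count, thresholds):
    """Inverse of encode_nodes: recover the reference pairs."""
    feature_width = bits_needed(len(thresholds))
    reader = BitReader(acc, total_bits)
    nodes = []
    for _ in range(count):
        f = reader.read(feature_width)
        t = reader.read(bits_needed(len(thresholds[f])))
        nodes.append((f, t))
    return nodes


# Two features: the first has 2 shared thresholds, the second has 3.
thresholds = [[0.5, 1.5], [2.0, 3.0, 4.0]]
nodes = [(0, 1), (1, 2), (1, 0)]

acc, total_bits = encode_nodes(nodes, thresholds)
decoded = decode_nodes(acc, total_bits, len(nodes), thresholds)
```

Because every node stores only small indices into tables shared by the whole ensemble, the three nodes above take 8 bits in total, and the fewer distinct features and thresholds the ensemble uses, the narrower each reference becomes.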