Boosted Trees on a Diet: Compact Models for Resource-Constrained Devices
Jan Stenkamp, Nina Herrmann, Benjamin Karic, Stefan Oehmcke, Fabian Gieseke
TL;DR
The paper tackles the challenge of running boosted decision trees on memory-constrained IoT devices. It introduces toad, a framework that combines a memory-aware training procedure with a pointer-free, bit-wise memory layout, compressing ensembles by promoting the reuse of features, thresholds, and leaf values. The key contributions are a linear reuse penalty for features and thresholds, a global threshold/feature map, and a global leaf values array. Together these achieve 4–16× memory reductions with little or no loss in predictive performance across eight datasets. This enables edge analytics and real-time decisions on tiny devices, supporting remote monitoring and autonomous operation in energy-constrained environments.
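To make the idea of a pointer-free layout with shared tables more concrete, here is a minimal sketch. It is not the toad implementation: the class `CompactEnsemble`, the functions `predict` and `index_bits`, and the assumption that every tree is a complete binary tree stored in breadth-first order are all illustrative choices made for this example. Nodes store only small indices into one global table of (feature, threshold) pairs and one global table of leaf values, so a repeated split or leaf value costs an index rather than a full floating-point number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CompactEnsemble:
    # Global table of unique (feature index, threshold) pairs shared by all trees.
    splits: List[Tuple[int, float]]
    # Global table of unique leaf values shared by all trees.
    leaf_values: List[float]
    # Per tree: indices into `splits`, internal nodes in breadth-first order.
    tree_split_ids: List[List[int]]
    # Per tree: indices into `leaf_values`, leaves from left to right.
    tree_leaf_ids: List[List[int]]


def index_bits(table_size: int) -> int:
    """Bits needed to address any entry of a table with `table_size` entries."""
    return max(1, (table_size - 1).bit_length())


def predict(model: CompactEnsemble, x: List[float]) -> float:
    """Sum the leaf values reached by x in every tree (raw ensemble score)."""
    total = 0.0
    for split_ids, leaf_ids in zip(model.tree_split_ids, model.tree_leaf_ids):
        n_internal = len(split_ids)
        node = 0
        # No child pointers: children of node i sit at 2i+1 and 2i+2.
        while node < n_internal:
            feature, threshold = model.splits[split_ids[node]]
            node = 2 * node + 1 if x[feature] <= threshold else 2 * node + 2
        total += model.leaf_values[leaf_ids[node - n_internal]]
    return total


# Hypothetical toy model: one depth-1 tree and one depth-2 tree that
# share two splits and three leaf values.
model = CompactEnsemble(
    splits=[(0, 0.5), (1, 2.0)],
    leaf_values=[-1.0, 0.0, 1.0],
    tree_split_ids=[[0], [1, 0, 0]],
    tree_leaf_ids=[[0, 2], [0, 1, 1, 2]],
)
```

In this toy model each node reference needs only `index_bits(len(model.splits))` bits and each leaf reference only `index_bits(len(model.leaf_values))` bits, which illustrates why reuse translates directly into a smaller footprint. An actual on-device layout would pack these indices into a bit stream rather than Python lists.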
Abstract
Deploying machine learning models on compute-constrained devices has become a key building block of modern IoT applications. In this work, we present a compression scheme for boosted decision trees, addressing the growing need for lightweight machine learning models. Specifically, we provide techniques for training compact boosted decision tree ensembles that exhibit a reduced memory footprint by rewarding, among other things, the reuse of features and thresholds during training. Our experimental evaluation shows that, using an adapted training process and an alternative memory layout, our models achieve the same performance as LightGBM models at a compression ratio of 4–16×. Once deployed, the corresponding IoT devices can operate autonomously, independent of constant communication or an external energy supply, requiring only minimal computing power and energy. This capability opens the door to a wide range of IoT applications, including remote monitoring, edge analytics, and real-time decision making in isolated or power-limited environments.
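The idea of rewarding reuse during training can be illustrated with a small sketch of split selection. The exact penalty used in the paper is not given in this summary, so the form below is an assumption: a fixed cost is subtracted from the gain of any candidate that would introduce a feature or a (feature, threshold) pair not yet used by the ensemble. The function `select_split` and the parameters `feature_penalty` and `threshold_penalty` are hypothetical names chosen for this example.

```python
from typing import Iterable, Optional, Set, Tuple

Candidate = Tuple[int, float, float]  # (feature index, threshold, raw gain)


def select_split(
    candidates: Iterable[Candidate],
    used_features: Set[int],
    used_thresholds: Set[Tuple[int, float]],
    feature_penalty: float,
    threshold_penalty: float,
) -> Tuple[Optional[Tuple[int, float]], float]:
    """Pick the candidate with the highest penalized gain.

    Assumed penalty form: candidates that need a new feature or a new
    (feature, threshold) pair pay a fixed cost, so splits already stored
    in the ensemble win unless a new one is clearly better.
    """
    best, best_score = None, float("-inf")
    for feature, threshold, gain in candidates:
        score = gain
        if feature not in used_features:
            score -= feature_penalty
        if (feature, threshold) not in used_thresholds:
            score -= threshold_penalty
        if score > best_score:
            best, best_score = (feature, threshold), score
    return best, best_score


# Hypothetical candidates: the second has a slightly higher raw gain
# but would add a new feature and a new threshold to the model.
candidates = [(0, 0.5, 10.0), (1, 2.0, 10.5)]
reused = select_split(candidates, {0}, {(0, 0.5)}, 1.0, 0.5)
unpenalized = select_split(candidates, {0}, {(0, 0.5)}, 0.0, 0.0)
```

With the penalties active, the already stored split on feature 0 is chosen even though its raw gain is slightly lower. With both penalties set to zero, the selection falls back to plain gain maximization.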
