SilentWood: Private Inference Over Gradient-Boosting Decision Forests

Ronny Ko, Abdelkarim Kati, Robin Geelen, Rasoul Akhavan Mahdavi, Byoungwoo Yoon, Jongho Shin, Igor Moroz, Anton Jappinen, Zhiqiang Lin, Makoto Onizuka, Florian Kerschbaum

TL;DR

SilentWood delivers private inference over gradient-boosting decision forests by combining computation clustering, a novel Blind Code Conversion protocol, and ciphertext compression to reduce redundant FHE work and communication. It leverages constant-weight encoding for efficient threshold comparisons and SumPath-based score aggregation to keep circuit depth low while enabling SIMD. Empirical results show speedups of up to 42.5x over private baselines and substantial reductions in data exchanged, while maintaining accuracy through validation-guided clustering. The approach is broadly applicable to FHE-based private inference beyond XGBoost, and it offers practical privacy-preserving inference with scalable performance for large forest models.
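To make the SumPath idea concrete, the following is a minimal plaintext sketch, not the paper's encrypted protocol. It assumes the usual formulation in which each node yields a comparison bit `b`, the left edge costs `1 - b` and the right edge costs `b`, so that exactly one leaf has a path sum of zero. MultiplyPath is shown alongside for contrast: it multiplies `b` or `1 - b` along the path, so exactly one leaf has a product of one. The tree, the `Leaf`/`Node` types, and the function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    score: float


@dataclass
class Node:
    feature: int      # index into the input vector
    threshold: float  # go left when x[feature] < threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def path_values(tree: Tree, x: List[float]) -> List[Tuple[int, int, float]]:
    """Return (sum_path, multiply_path, score) for every leaf, left to right."""
    out: List[Tuple[int, int, float]] = []

    def walk(t: Tree, path_sum: int, path_prod: int) -> None:
        if isinstance(t, Leaf):
            out.append((path_sum, path_prod, t.score))
            return
        b = 1 if x[t.feature] < t.threshold else 0  # comparison bit
        # SumPath: an edge costs 0 when it agrees with the comparison.
        # MultiplyPath: an edge contributes 1 when it agrees, else 0.
        walk(t.left, path_sum + (1 - b), path_prod * b)
        walk(t.right, path_sum + b, path_prod * (1 - b))

    walk(tree, 0, 1)
    return out


def select_score(tree: Tree, x: List[float]) -> float:
    """Pick the score of the unique leaf whose SumPath value is zero."""
    hits = [score for s, _, score in path_values(tree, x) if s == 0]
    assert len(hits) == 1  # exactly one root-to-leaf path is consistent
    return hits[0]


# Hypothetical two-level tree used only for this example.
example_tree = Node(0, 5.0, Node(1, 2.0, Leaf(0.1), Leaf(0.2)), Leaf(-0.3))
example_input = [3.0, 4.0]
selected = select_score(example_tree, example_input)
```

In this sketch the SumPath values use only additions along each path, whereas MultiplyPath needs one multiplication per tree level, which is consistent with the summary's remark about keeping circuit depth low. How the zero test and the score aggregation are carried out under encryption is specific to the paper and is not modeled here.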

Abstract

Gradient boosting decision forests, used by XGBoost or AdaBoost, offer higher accuracy and lower training times than decision trees for large datasets. Protocols for private inference over decision trees can be used to preserve the privacy of the input data as well as the privacy of the trees. However, naively extending private inference over decision trees to private inference over decision forests by replicating the protocols leads to impractical running times. In this paper, we propose an efficient private decision inference protocol using homomorphic encryption. We present several optimizations that identify and then remove (approximate) duplication between the trees in a forest, thereby achieving significant improvements in communication and computation cost over the naive approach. To the best of our knowledge, we present the first private inference protocol for highly scalable gradient boosting decision forests. The inference time of our protocol, SilentWood, is up to 42.5x faster than the baseline of running the RCC-PDTE protocol by Mahdavi et al. in parallel, up to 27.8x faster than Zama's Concrete ML XGBoost, and 2.94x faster than SoK-GGG's two-party garbled circuit protocol.

Paper Structure

This paper contains 27 sections, 1 equation, 9 figures, 14 tables, 5 algorithms.

Figures (9)

  • Figure 1: Computation of MultiplyPath and SumPath
  • Figure 2: Edge formulas for SumPath and MultiplyPath
  • Figure 3: An example of node clustering
  • Figure 4: An example of BCC for XGBoost scoring
  • Figure 5: An example of repetitive data encoding
  • ...and 4 more figures