Vectorized Adaptive Histograms for Sparse Oblique Forests

Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns

TL;DR

A method for dynamically switching between histograms and sorting to find the best split and optimizing histogram construction using vector intrinsics, which speeds up training by 1.7-2.5x.

Abstract

Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types. However, these forests use more data and more compute than other tree ensembles because they create deep trees and need to sort or histogram linear combinations of data at runtime. We provide a method for dynamically switching between histograms and sorting to find the best split. We further optimize histogram construction using vector intrinsics. Evaluated on large datasets, our optimizations speed up training by 1.7-2.5x compared to existing oblique forests and 1.5-2x compared to standard random forests. We also provide a GPU and hybrid CPU-GPU implementation.
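
The switching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the binary-label Gini criterion, the equal-width binning, and the default `crossover` value are all assumptions made for illustration (the crossover is hardware-dependent and would be measured, not hard-coded).

```python
def gini(pos, n):
    """Gini impurity of a node holding `pos` positives among `n` samples."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def weighted_gini(left_pos, left_n, total_pos, total_n):
    """Size-weighted impurity of the two children of a candidate split."""
    right_pos, right_n = total_pos - left_pos, total_n - left_n
    return (left_n * gini(left_pos, left_n)
            + right_n * gini(right_pos, right_n)) / total_n


def split_exact(values, labels):
    """Exact split: sort the projected values and scan every boundary."""
    order = sorted(range(len(values)), key=values.__getitem__)
    total_n, total_pos = len(values), sum(labels)
    best = (float("inf"), None)  # (score, threshold)
    left_pos = 0
    for rank in range(total_n - 1):
        i, j = order[rank], order[rank + 1]
        left_pos += labels[i]
        if values[i] == values[j]:
            continue  # no boundary between equal values
        score = weighted_gini(left_pos, rank + 1, total_pos, total_n)
        if score < best[0]:
            best = (score, (values[i] + values[j]) / 2.0)
    return best


def split_histogram(values, labels, n_bins=64):
    """Approximate split: bin the values, then scan only bin boundaries."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return (float("inf"), None)  # constant projection, nothing to split
    width = (hi - lo) / n_bins
    count, pos = [0] * n_bins, [0] * n_bins
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        count[b] += 1
        pos[b] += y
    total_n, total_pos = len(values), sum(labels)
    best = (float("inf"), None)
    left_n = left_pos = 0
    for b in range(n_bins - 1):
        left_n += count[b]
        left_pos += pos[b]
        if left_n == 0 or left_n == total_n:
            continue  # one child would be empty
        score = weighted_gini(left_pos, left_n, total_pos, total_n)
        if score < best[0]:
            best = (score, lo + (b + 1) * width)
    return best


def split_dynamic(values, labels, crossover=1300, n_bins=64):
    """Use sorting on small nodes and histograms on large ones.

    `crossover` is a hypothetical tuning parameter, not a value
    prescribed by the paper.
    """
    if len(values) < crossover:
        return "exact", split_exact(values, labels)
    return "histogram", split_histogram(values, labels, n_bins)
```

Because nodes near the leaves of a deep tree hold few samples, a rule of this kind would route them to the exact path, while the large nodes near the root would use histograms.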

Paper Structure (15 sections, 8 figures, 4 tables)

This paper contains 15 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Training runtime by tree depth on a dataset with 1M samples and 4096 features. We compare exact splitting using sorting, approximate splitting using histograms, and our dynamic method that adaptively chooses between them.
  • Figure 2: Workflow at each tree node. Histogram splitting of a random linear combination of features requires sparse access in both rows and columns, computing a vector sum, building histograms and evaluating split boundaries.
  • Figure 3: Microbenchmarks to evaluate crossover points on a CPU machine (top) and GPU machine (bottom). Dynamic histograms outperform exact splits for $n>350$. GPUs have high startup costs and improve performance for $n>29000$.
  • Figure 4: Illustration of the selection of exact splitting and histogram splitting as a function of the number of active samples in a tree node with a breakeven point of 1300.
  • Figure 5: Comparative runtime of the different components of computation for histogram splitting. The dataset has 1M samples and 4096 features.
  • ...and 3 more figures
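
The per-node workflow named in the Figure 2 caption (sparse row and column access, a vector sum, and histogram construction) can be sketched in plain Python as follows. This is an illustrative sketch only: the helper names, the random ±1 weights, and the equal-width bins are assumptions, and a real implementation would use vector intrinsics rather than Python loops.

```python
import random


def sample_sparse_projection(n_features, n_nonzero, rng):
    """Pick a few feature indices and give each a random +1/-1 weight.

    The +/-1 weighting is an assumption for this sketch.
    """
    cols = rng.sample(range(n_features), n_nonzero)
    return [(c, rng.choice((-1.0, 1.0))) for c in cols]


def project(X, active_rows, projection):
    """Vector sum over the chosen columns, for the node's active rows only."""
    return [sum(w * X[r][c] for c, w in projection) for r in active_rows]


def build_histogram(values, labels, n_bins):
    """Per-bin sample counts and positive-label counts (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width on constant input
    counts, positives = [0] * n_bins, [0] * n_bins
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        counts[b] += 1
        positives[b] += y
    return lo, width, counts, positives
```

A node would call `sample_sparse_projection`, then `project` on its active rows, then `build_histogram` on the result; candidate split boundaries are then evaluated from the cumulative bin counts.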

Theorems & Definitions (1)

  • proof