Finite Sample Complexity Analysis of Binary Segmentation

Toby Dylan Hocking

TL;DR

We describe new methods for analyzing the time and space complexity of binary segmentation for a given finite data set and minimum segment length parameter. An empirical analysis of real data suggests that binary segmentation is often close to optimal speed in practice.

Abstract

Binary segmentation is a classic greedy algorithm that recursively splits a sequential data set by optimizing some loss or likelihood function. It is widely used for changepoint detection in data sets measured over space or time, and as a sub-routine for decision tree learning. In theory it should be extremely fast for $N$ data and $K$ splits: $O(N K)$ in the worst case and $O(N \log K)$ in the best case. In this paper we describe new methods for analyzing the time and space complexity of binary segmentation for a given finite $N$, $K$, and minimum segment length parameter. First, we describe algorithms that can be used to compute the best and worst case number of splits the algorithm must consider. Second, we describe synthetic data that achieve the best and worst case, and which can be used to test for correct implementation of the algorithm. Finally, we provide an empirical analysis of real data which suggests that binary segmentation is often close to optimal speed in practice.
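To make the quantity being analyzed concrete, the following is a minimal sketch of binary segmentation that also counts the candidate splits it evaluates. It is not the paper's implementation. It assumes a change-in-mean model with square loss, and the names `binary_segmentation`, `max_segments`, and `min_len` are hypothetical. The counting convention (both children of every chosen split are evaluated, including the last one) is also an assumption and may differ from the convention used in the paper.

```python
import heapq
from itertools import accumulate


def binary_segmentation(data, max_segments, min_len=1):
    """Greedy binary segmentation for changes in mean (square loss).

    Returns (changepoints, n_candidates): the split positions in the order
    they were chosen, and the total number of candidate splits evaluated.
    Illustrative sketch only; all names here are hypothetical.
    """
    cum = [0.0] + list(accumulate(data))  # cum[i] == sum(data[:i])

    def cost(start, end):
        # Square loss of data[start:end] around its mean, omitting the
        # sum-of-squares term, which cancels when comparing splits.
        s = cum[end] - cum[start]
        return -s * s / (end - start)

    heap = []  # entries: (-loss_decrease, start, end, best_split_position)
    n_candidates = 0

    def push(start, end):
        # Evaluate every split that leaves at least min_len data on each side.
        nonlocal n_candidates
        candidates = range(start + min_len, end - min_len + 1)
        if not candidates:
            return  # segment too short to split
        n_candidates += len(candidates)
        whole = cost(start, end)

        def gain(t):
            return whole - cost(start, t) - cost(t, end)

        pos = max(candidates, key=gain)  # ties go to the smallest position
        heapq.heappush(heap, (-gain(pos), start, end, pos))

    push(0, len(data))
    changepoints = []
    while heap and len(changepoints) < max_segments - 1:
        _, start, end, pos = heapq.heappop(heap)  # largest loss decrease
        changepoints.append(pos)
        push(start, pos)
        push(pos, end)
    return changepoints, n_candidates


# Balanced first split versus a split that isolates a single outlier.
balanced = [0, 0, 0, 0, 10, 10, 10, 10]
unbalanced = [10, 0, 0, 0, 0, 0, 0, 0]
print(binary_segmentation(balanced, max_segments=3))
print(binary_segmentation(unbalanced, max_segments=3))
```

In this sketch, equal splits shrink the remaining segments quickly, so fewer candidates are evaluated in later iterations. Splits that peel off one data point at a time leave a long segment to rescan each time, which is the intuition behind the gap between the $O(N \log K)$ and $O(N K)$ cases.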
