An Algorithm for Optimal Partitioning of Data on an Interval

Brad Jackson, Jeffrey D. Scargle, David Barnes, Sundararajan Arabhi, Alina Alt, Peter Gioumousis, Elyus Gwin, Paungkaew Sangtrakulcharoen, Linda Tan, Tun Tao Tsai

TL;DR

The paper reframes many signal-processing tasks as finding an optimal partition of an interval using an additive block fitness $V({\bf P})=\sum_m g(B_m)$ over discretized data cells, and presents an $O(N^2)$ dynamic-programming algorithm proven to yield the exact global optimum while automatically determining the number of segments. By exploiting the principle of optimality, the method computes $\text{opt}(n+1)=\max_j\{\text{opt}(j-1)+\text{end}(j,n+1)\}$ with $\text{end}(j,n+1)=g(B_{j,n+1})$, then backtracks through $\text{lastchange}$ to recover the block boundaries. The approach applies to a wide range of 1D segmented models (e.g., piecewise-constant Poisson histograms, density estimation, signal detection) and extends to higher dimensions, offering real-time change-point detection and automatic model-order selection without explicit smoothing. Its real-time capability, flexibility, and exact optimality make it a principled tool for detection, segmentation, and clustering tasks across signal-processing and data-mining applications.

Abstract

Many signal processing problems can be solved by maximizing the fitness of a segmented model over all possible partitions of the data interval. This letter describes a simple but powerful algorithm that searches the exponentially large space of partitions of $N$ data points in time $O(N^2)$. The algorithm is guaranteed to find the exact global optimum, automatically determines the model order (the number of segments), has a convenient real-time mode, can be extended to higher dimensional data spaces, and solves a surprising variety of problems in signal detection and characterization, density estimation, cluster analysis and classification.
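The recurrence above can be sketched in a few lines. The following is a minimal, illustrative implementation of the $O(N^2)$ dynamic-programming search and the `lastchange` backtracking, not the authors' code: the function names (`optimal_partition`, `sse_fitness`) and the 0-based, half-open block indexing are my own conventions, and the sum-of-squared-errors fitness with a fixed per-block penalty is just a stand-in for the paper's general additive block fitness $g(B_m)$.

```python
def optimal_partition(cells, g):
    """Search all 2^(N-1) partitions of `cells` in O(N^2) time.

    cells : list of N data-cell values
    g     : g(cells, j, n) -> fitness of the block cells[j:n]
    Returns (sorted block start indices, optimal total fitness).
    """
    N = len(cells)
    opt = [0.0] * (N + 1)        # opt[n]: best total fitness of the first n cells
    lastchange = [0] * (N + 1)   # start index of the last block in that optimum
    for n in range(1, N + 1):
        # opt(n) = max over j of opt(j) + g(block spanning cells[j:n])
        best_j, best_val = 0, float("-inf")
        for j in range(n):
            val = opt[j] + g(cells, j, n)
            if val > best_val:
                best_j, best_val = j, val
        opt[n], lastchange[n] = best_val, best_j
    # Backtrack through lastchange to recover the block boundaries.
    changepoints, n = [], N
    while n > 0:
        changepoints.append(lastchange[n])
        n = lastchange[n]
    return sorted(changepoints), opt[N]


def sse_fitness(cells, j, n):
    """Toy additive fitness: reward homogeneous blocks, penalize each block."""
    block = cells[j:n]
    mu = sum(block) / len(block)
    return -sum((x - mu) ** 2 for x in block) - 0.1  # 0.1 = per-block penalty


# Example: the penalty makes a single change point at index 3 optimal.
cps, best = optimal_partition([1, 1, 1, 5, 5, 5], sse_fitness)
```

With these toy data the search splits the interval into the two constant blocks `[1,1,1]` and `[5,5,5]` (change points `[0, 3]`), since any finer partition only pays additional per-block penalty. Any additive fitness, such as the Poisson block log-likelihood used for piecewise-constant histograms in the paper, can be dropped in for `g`.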

Paper Structure

This paper contains 4 sections, 2 theorems, and 5 equations.

Key Result

Theorem 1

Let ${\bf P}^{\max}$ be an optimal partition of $I$ and ${\bf P}_{1} = \{ B_{m}, m \in a \}$ be any subset of the blocks of ${\bf P}^{\max}$. Then ${\bf P}_1$ is an optimal partition of the part of $I$ it covers, namely $I_1 = \bigcup_{m \in a} B_{m}$.

Theorems & Definitions (2)

  • Theorem 1: Principle of Optimality
  • Theorem 2