Optimistic search: Change point estimation for large-scale data via adaptive logarithmic queries

Solt Kovács, Housen Li, Lorenz Haubner, Axel Munk, Peter Bühlmann

Abstract

Change point estimation is often formulated as a search for the maximum of a gain function describing improved fits when segmenting the data. Searching through all candidates requires $O(n)$ evaluations of the gain function for an interval with $n$ observations. If each evaluation is computationally demanding (e.g. in high-dimensional models), this can become infeasible. Instead, we propose optimistic search methods with $O(\log n)$ evaluations exploiting specific structure of the gain function. Towards solid understanding of our strategy, we investigate in detail the $p$-dimensional Gaussian changing means setup, including high-dimensional scenarios. For some of our proposals, we prove asymptotic minimax optimality for detecting change points and derive their asymptotic localization rate. These rates (up to a possible log factor) are optimal for the univariate and multivariate scenarios, and are by far the fastest in the literature under the weakest possible detection condition on the signal-to-noise ratio in the high-dimensional scenario. Computationally, our proposed methodology has the worst case complexity of $O(np)$, which can be improved to be sublinear in $n$ if some a-priori knowledge on the length of the shortest segment is available. Our search strategies generalize far beyond the theoretically analyzed setup. We illustrate, as an example, massive computational speedup in change point detection for high-dimensional Gaussian graphical models.

Paper Structure

This paper contains 31 sections, 13 theorems, 164 equations, 6 figures, 1 table, 4 algorithms.

Key Result

Theorem 3.1

Under Model Gaussian_setup with a single change point, i.e. $\kappa =1$, we assume that the minimal segment length $\lambda$ and the minimal jump size $\delta$ satisfy for some large enough constant $C_0$. Let $\hat{\tau} = \hat{t}_{(0,n]}/n$ be the estimated change point by OS (Alg:OS) on $(0,n]$. Then:

Figures (6)

  • Figure 1: Finding a single change point with full grid search (black) and OS (red) in a $200\times 200$-dimensional covariance change example with underlying graphical lasso fits. OS starts with two initial evaluations marked by the two zeros and then evaluates further $14$ split points adaptively, in the order marked by the respective colored numbers shown in the zoomed in part b). The true underlying change point at observation $400$ is marked in green and the final candidate returned by OS at observation $423$ in blue. The overall maximum of the black gain curve found by full grid search is at observation $402$.
  • Figure 2: Naive optimistic search step $\mathrm{OS}(\tilde{l}, t, \tilde{r} \mid \nu, l, r)$ within the current leftover segment $(\tilde{l},\tilde{r}]\subseteq(l,r]$ given the previous evaluation at $t$ and step size $\nu$. As $\tilde{r}-t>t-\tilde{l}$, the new probe point $w$ is taken within $(t,\tilde{r}]$ as $w = \lceil \tilde{r} - (\tilde{r} - t)\nu \rceil$. Depending on the gain $G_{(l,r]}(w)$ vs. $G_{(l,r]}(t)$, one either continues with $\mathrm{OS}(t, w, \tilde{r} \mid \nu, l, r)$ (discarding the blue part) or $\mathrm{OS}(\tilde{l}, t, w \mid \nu, l, r)$ (discarding the red part).
  • Figure 3: Results on Example 6.2. Pairwise plots of Hausdorff distances of the locations of the best $11$ change point candidates (with greedy selection) compared to the true ones in $100$ simulations for SeedBS (decay $a=1/\sqrt{2}$) with various minimal segment length constraints and full grid search in each seeded interval (horizontal-axis) versus combined OS (vertical-axis, left) and OS (vertical-axis, right). The vertical and horizontal dashed lines indicate the average Hausdorff distances for the respective minimal segment length constraint and search method within the seeded intervals. Note the logarithmic scales on both axes.
  • Figure 4: Estimation performances (in terms of Hausdorff distance) and computational times on the changing Gaussian graphical model (\ref{GGM_setup}) from \ref{setup:high-dim} (based on 10 simulations) with setups (a) on the top and (b) on the bottom. The symbols differentiate between the basic algorithms and the colors indicate whether full grid search or optimistic search was used. The five point clouds for SeedBS and OSeedBS correspond to decay parameters $a=2^{-1}, 2^{-1/2}, 2^{-1/4}, 2^{-1/8}, 2^{-1/16}$ for the seeded intervals, while the five point clouds for WBS and OWBS correspond to $M = 100, 200, 400, 800, 1600$ random intervals. For all algorithms, the true number of change points (or the maximally many if not achieved) has been used.
  • Figure B1: Pairwise plots of found change points using different optimistic search methods (horizontal-axis) versus the ones returned by the full grid search (vertical-axis) for a noise level $\sigma = 1.5$ and $n=200$ (top) as well as $n=5000$ (bottom) in $1000$ simulations from \ref{setup:single_mean_shift}. The vertical and horizontal lines indicate the location of the true change point at observation $100$.
  • ...and 1 more figure
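
The naive optimistic search step described in the caption of Figure 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. The following parts are assumptions that do not appear in the text above: the univariate CUSUM statistic used as gain function, the mirrored rule for the case $t-\tilde{l} \geq \tilde{r}-t$, the starting point $\lfloor (l+\nu r)/(1+\nu) \rfloor$, the stopping rule (exhaustive search once the leftover segment has at most 5 points), the clamping of probe points, and all function names (`make_cusum_gain`, `optimistic_search`).

```python
import math
import numpy as np


def make_cusum_gain(x, l, r):
    """Return G(t), the absolute CUSUM statistic on the segment (l, r].

    Assumption: a univariate mean-change gain, chosen only for illustration.
    Observation i (1-based) is stored in x[i - 1], so (l, r] is x[l:r].
    """
    csum = np.concatenate(([0.0], np.cumsum(x)))
    n = r - l

    def gain(t):
        n_left, n_right = t - l, r - t            # both > 0 for l < t < r
        s_left = csum[t] - csum[l]
        s_right = csum[r] - csum[t]
        return abs(math.sqrt(n_right / (n * n_left)) * s_left
                   - math.sqrt(n_left / (n * n_right)) * s_right)

    return gain


def optimistic_search(gain, l, r, nu=0.5, min_len=5):
    """Naive optimistic search on (l, r]; returns (split point, #evaluations)."""
    if r - l < 2:
        raise ValueError("need at least two observations in (l, r]")
    cache = {}

    def g(t):                                     # evaluate each split only once
        if t not in cache:
            cache[t] = gain(t)
        return cache[t]

    lo, hi = l, r
    # Assumed starting point, clamped so that lo < t < hi.
    t = min(max(math.floor((l + nu * r) / (1 + nu)), lo + 1), hi - 1)
    while hi - lo > min_len:
        if hi - t > t - lo:
            # Probe the longer, right part as in Figure 2.
            w = math.ceil(hi - (hi - t) * nu)
            w = min(max(w, t + 1), hi - 1)        # keep t < w < hi
            if g(w) >= g(t):
                lo, t = t, w                      # discard (lo, t]
            else:
                hi = w                            # discard (w, hi]
        else:
            # Assumed mirror image: probe the left part.
            w = math.floor(lo + (t - lo) * nu)
            w = min(max(w, lo + 1), t - 1)        # keep lo < w < t
            if g(w) >= g(t):
                hi, t = t, w                      # discard (t, hi]
            else:
                lo = w                            # discard (lo, w]
    # Short leftover segment: fall back to exhaustive search.
    best = max(range(lo + 1, hi), key=g)
    return best, len(cache)
```

For a gain curve with a single peak, every comparison discards a part of the segment that cannot contain the maximum, so the search ends at the peak after far fewer evaluations than a full grid search would need. With noisy data the gain curve can have local maxima and the returned split may differ from the full-grid maximizer, as in Figure 1 (observation $423$ versus $402$).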

Theorems & Definitions (37)

  • Theorem 3.1: Naive optimistic search
  • Theorem 3.2: Advanced optimistic search
  • Lemma 3.3
  • Definition 4.1: Seeded intervals; SeedBS
  • Theorem 4.2
  • Theorem 5.1: Single change point
  • Theorem 5.2: Multiple change points
  • Proposition 5.3
  • Example 6.1
  • Example 6.2
  • ...and 27 more