Optimistic search: Change point estimation for large-scale data via adaptive logarithmic queries

Solt Kovács; Housen Li; Lorenz Haubner; Axel Munk; Peter Bühlmann

Abstract

Change point estimation is often formulated as a search for the maximum of a gain function describing improved fits when segmenting the data. Searching through all candidates requires $O(n)$ evaluations of the gain function for an interval with $n$ observations. If each evaluation is computationally demanding (e.g., in high-dimensional models), this can become infeasible. Instead, we propose optimistic search methods that need only $O(\log n)$ evaluations by exploiting specific structure of the gain function. Towards a solid understanding of our strategy, we investigate in detail the $p$-dimensional Gaussian changing means setup, including high-dimensional scenarios. For some of our proposals, we prove asymptotic minimax optimality for detecting change points and derive their asymptotic localization rate. These rates (up to a possible log factor) are optimal for the univariate and multivariate scenarios, and are by far the fastest in the literature under the weakest possible detection condition on the signal-to-noise ratio in the high-dimensional scenario. Computationally, our proposed methodology has a worst-case complexity of $O(np)$, which can be improved to be sublinear in $n$ if some a priori knowledge of the length of the shortest segment is available. Our search strategies generalize far beyond the theoretically analyzed setup. We illustrate, as an example, massive computational speedup in change point detection for high-dimensional Gaussian graphical models.
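To make the contrast between $O(n)$ and $O(\log n)$ gain evaluations concrete, here is a minimal sketch of an adaptive search over split points. It is an illustration under stated assumptions, not the authors' algorithm: the gain is taken to be the univariate CUSUM statistic, the data are noise-free with a single change (so the gain is strictly unimodal), and the names `make_cusum_gain`, `optimistic_search`, and the step-size parameter `nu` are hypothetical choices made for this example.

```python
import math


def make_cusum_gain(x):
    """Return gain(t): CUSUM statistic for splitting x into x[:t] and x[t:]."""
    n = len(x)
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)

    def gain(t):  # valid for 1 <= t <= n - 1
        left_mean = prefix[t] / t
        right_mean = (prefix[n] - prefix[t]) / (n - t)
        return math.sqrt(t * (n - t) / n) * abs(left_mean - right_mean)

    return gain


def optimistic_search(gain, n, nu=0.5):
    """Search for a maximizer of gain over 0 < t < n with few evaluations.

    Assumption of this sketch: gain is (close to) unimodal on the interval.
    Keeps a bracket l < s < r, where s is the best split seen so far, and
    probes inside the longer of the two sides until the bracket is short.
    """
    l, r = 0, n
    s = (l + r) // 2
    gs = gain(s)
    while r - l > 5:
        if s - l > r - s:
            # Left side is longer: probe between l and s.
            w = s - max(1, int((s - l) * nu))
            gw = gain(w)
            if gw >= gs:
                r, s, gs = s, w, gw  # discard everything right of old s
            else:
                l = w                # discard everything left of the probe
        else:
            # Right side is at least as long: probe between s and r.
            w = s + max(1, int((r - s) * nu))
            gw = gain(w)
            if gw >= gs:
                l, s, gs = s, w, gw  # discard everything left of old s
            else:
                r = w                # discard everything right of the probe
    # Few candidates remain: finish with an exhaustive search.
    return max(range(l + 1, r), key=gain)


# Noise-free toy data (an assumption): mean 0 for 300 points, then mean 2.
x = [0.0] * 300 + [2.0] * 700
base_gain = make_cusum_gain(x)

n_evals = 0


def counted_gain(t):
    """Wrapper that counts how often the gain is evaluated."""
    global n_evals
    n_evals += 1
    return base_gain(t)


t_hat = optimistic_search(counted_gain, len(x))
print("estimated change point:", t_hat)
print("gain evaluations:", n_evals, "vs full grid search:", len(x) - 1)
```

In this toy setup each gain evaluation is cheap because of the prefix sums, so the saving is only in the number of calls. The motivation in the abstract is the opposite regime, where a single evaluation (for instance, a high-dimensional model fit) dominates the cost. With noisy data the gain is no longer exactly unimodal, and the sketch above comes with no guarantee there.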
