On the Robustness of the Successive Projection Algorithm

Giovanni Barbarino; Nicolas Gillis

TL;DR

This work analyzes the robustness of the successive projection algorithm (SPA) for separable simplex-structured matrix factorization (SSMF) under noise, formalizing how the conditioning of the vertex matrix $W$ governs the recovery error. It gives tighter bounds for the first SPA step, extends the improved guarantees to the rank-2 case and to certain cases of the translated variant (T-SPA), and proves tightness results for SPA, SPA$^2$, and MVIE-based preconditioning. A new translation-and-lifting variant (TL-SPA) is proposed to reduce the conditioning and improve practical robustness; it is evaluated on synthetic datasets, including adversarial middle-point noise and rank-deficient scenarios. Overall, the results provide both theoretical guarantees and practical guidance for choosing SPA variants and preprocessing steps to recover latent simplex vertices from noisy data.

Abstract

The successive projection algorithm (SPA) is a workhorse algorithm to learn the $r$ vertices of the convex hull of a set of $(r-1)$-dimensional data points, a.k.a. a latent simplex, which has numerous applications in data science. In this paper, we revisit the robustness to noise of SPA and several of its variants. In particular, when $r \geq 3$, we prove the tightness of the existing error bounds for SPA and for two more robust preconditioned variants of SPA. We also provide significantly improved error bounds for SPA, by a factor proportional to the conditioning of the $r$ vertices, in two special cases: for the first extracted vertex, and when $r \leq 2$. We then provide further improvements for the error bounds of a translated version of SPA proposed by Arora et al. ("A practical algorithm for topic modeling with provable guarantees", ICML, 2013) in two special cases: for the first two extracted vertices, and when $r \leq 3$. Finally, we propose a new, more robust variant of SPA that first shifts and lifts the data points in order to minimize the conditioning of the problem. We illustrate our results on synthetic data.
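For readers unfamiliar with SPA, the following is a minimal NumPy sketch of the standard algorithm as it is usually described: repeatedly pick the column of largest Euclidean norm, then project all columns onto the orthogonal complement of the picked one. It is an illustration, not the authors' implementation. The function name `spa`, the synthetic data, and the choice of a $4 \times 3$ vertex matrix `W` (so that `W` has full column rank and no translation or lifting is needed) are assumptions made for this example only.

```python
import numpy as np

def spa(X, r):
    """Return the indices of r columns of X picked by successive projection."""
    R = np.array(X, dtype=float)           # residual matrix (a copy of X)
    selected = []
    for _ in range(r):
        norms = np.sum(R**2, axis=0)       # squared norm of each residual column
        j = int(np.argmax(norms))          # column with the largest norm
        selected.append(j)
        u = R[:, j] / np.sqrt(norms[j])    # unit vector along the picked column
        R -= np.outer(u, u @ R)            # project out that direction
    return selected

# Hypothetical noiseless, separable data: X = W H, columns of H on the simplex.
rng = np.random.default_rng(0)
r, n = 3, 50
W = rng.random((4, r))                     # assumed vertex matrix, full column rank
H = rng.dirichlet(np.ones(r), size=n).T    # each column is a convex combination
H[:, :r] = np.eye(r)                       # first r columns are the pure vertices
X = W @ H

idx = spa(X, r)
print(sorted(idx))
```

In this noiseless setting the selected indices are the columns that hold the vertices themselves. The error bounds discussed in the abstract concern how far the selected columns can drift from the true vertices once noise is added to `X`.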
