Projection onto the probability simplex: An efficient algorithm with a simple proof, and an application

Weiran Wang, Miguel Á. Carreira-Perpiñán

TL;DR

The paper gives a simple, non-iterative $\mathcal{O}(D\log D)$ algorithm for the Euclidean projection of a vector $\mathbf{y}\in\mathbb{R}^D$ onto the probability simplex. The projection is computed componentwise as $\mathbf{x}=\max\{\mathbf{y}+\lambda,0\}$, with the scalar $\lambda$ chosen so that $\mathbf{x}^\top \mathbf{1}=1$, and correctness is proved directly from the KKT conditions. The paper also provides a MATLAB implementation and applies the method to Laplacian $K$-modes clustering, where simplex projections arise in both training and out-of-sample mapping for soft cluster assignments. The main contributions are the elementary proof, the simple and efficient algorithm, and the application to a clustering framework. Simplex projections of this kind recur in constrained optimization and in learning tasks that compute soft assignments.
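To make the description above concrete, here is a minimal Python sketch of a sort-and-threshold procedure consistent with it. It is an illustration, not the paper's MATLAB code: the function name `project_to_simplex`, the variable names, and the pure-Python list representation are all assumptions made here. The rule used to pick the threshold index (the largest $j$ for which the $j$-th largest entry stays positive after shifting) is the usual one for this family of algorithms and is assumed to match the paper's.

```python
def project_to_simplex(y):
    """Euclidean projection of y onto {x : x >= 0, sum(x) = 1}.

    Hypothetical helper for illustration; names are not from the paper.
    """
    if not y:
        raise ValueError("y must be non-empty")

    # Sort in decreasing order; this is the O(D log D) step.
    u = sorted(y, reverse=True)

    # Scan the sorted entries, keeping the shift lam for the largest j
    # such that u_j + (1 - sum_{i<=j} u_i) / j > 0.
    cumsum = 0.0
    lam = 0.0
    for j, u_j in enumerate(u, start=1):
        cumsum += u_j
        candidate = (1.0 - cumsum) / j
        if u_j + candidate > 0:
            lam = candidate

    # Shift every entry by lam and clip at zero.
    return [max(v + lam, 0.0) for v in y]
```

For instance, `project_to_simplex([2.0, 0.0])` returns `[1.0, 0.0]`: only the largest entry stays positive, so the shift is $\lambda=-1$. The sketch assumes inputs of moderate magnitude and does no special handling of floating-point extremes.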

Abstract

We provide an elementary proof of a simple, efficient algorithm for computing the Euclidean projection of a point onto the probability simplex. We also show an application in Laplacian K-modes clustering.
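
As a brief sketch of why a KKT argument yields a shift-and-clip solution, the projection can be written as a quadratic program. The multiplier names $\lambda$ and $\boldsymbol{\beta}$ and the sign convention below are chosen here for illustration and may differ from the paper's notation.

$$\min_{\mathbf{x}\in\mathbb{R}^D}\ \tfrac{1}{2}\|\mathbf{x}-\mathbf{y}\|^2 \quad \text{s.t.} \quad \mathbf{x}^\top\mathbf{1}=1,\ \ \mathbf{x}\ge\mathbf{0}$$

$$\mathcal{L}(\mathbf{x},\lambda,\boldsymbol{\beta})=\tfrac{1}{2}\|\mathbf{x}-\mathbf{y}\|^2-\lambda\,(\mathbf{x}^\top\mathbf{1}-1)-\boldsymbol{\beta}^\top\mathbf{x}$$

$$x_i-y_i-\lambda-\beta_i=0,\qquad x_i\ge 0,\qquad \beta_i\ge 0,\qquad x_i\beta_i=0,\qquad \sum_{i=1}^{D}x_i=1$$

If $x_i>0$, complementarity forces $\beta_i=0$, so $x_i=y_i+\lambda$. If $x_i=0$, then $y_i+\lambda=-\beta_i\le 0$. Both cases combine into $x_i=\max\{y_i+\lambda,0\}$, and $\lambda$ is fixed by the constraint $\sum_i x_i=1$.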

Paper Structure

This paper contains 10 sections, 1 theorem, 10 equations, 1 algorithm.

Key Result

Theorem 1

Let $\rho$ be the number of positive components in the solution $\mathbf{x}$, then

Theorems & Definitions (2)

  • Theorem 1
  • proof