Sample and Expand: Discovering Low-rank Submatrices With Quality Guarantees
Martino Ciaperoni, Aristides Gionis, Heikki Mannila
TL;DR
This paper addresses the discovery of submatrices that are provably close to low rank when the full matrix is not. It introduces Sample-And-Expand, a two-phase method that first samples a $2 \times 2$ near-rank-$1$ seed submatrix and then expands it into a larger near-low-rank submatrix while controlling the approximation error; the method generalizes to near-rank-$k$ patterns. The authors formalize three problems, LNROSR, LNROS, and LNR$k$S, prove NP-hardness for the latter two, and derive approximation guarantees that link the expansion process to row/column anchor ratios, complemented by probabilistic and scalability analyses. Experiments against strong baselines on synthetic and real data from diverse domains show favorable performance in recovering interpretable local low-rank structures, so the method combines provable guarantees with practical scalability.
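The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sample_seed` and `expand`, the relative tolerance `tol`, the trial budget `trials`, and the specific ratio test are all assumptions made for illustration, and the sketch covers only the rank-$1$ case with nonzero anchor entries. It relies on the fact that a $2 \times 2$ block with nonzero entries has rank $1$ exactly when the ratio between its two rows is the same in both columns.

```python
import random


def ratio_close(a, b, tol):
    """Return True when a and b agree up to a relative tolerance tol."""
    return abs(a - b) <= tol * max(abs(a), abs(b))


def sample_seed(D, tol, trials, rng):
    """Phase 1 (illustrative): sample 2x2 submatrices until one is nearly rank 1."""
    n, m = len(D), len(D[0])
    for _ in range(trials):
        i, i2 = rng.sample(range(n), 2)
        j, j2 = rng.sample(range(m), 2)
        if 0 in (D[i][j], D[i][j2], D[i2][j], D[i2][j2]):
            continue  # skip blocks where the ratios are undefined
        # Rank 1 means the row ratio is the same in both columns.
        if ratio_close(D[i2][j] / D[i][j], D[i2][j2] / D[i][j2], tol):
            return (i, i2), (j, j2)
    return None  # no seed found within the trial budget


def expand(D, seed_rows, seed_cols, tol):
    """Phase 2 (illustrative): grow the seed using an anchor row and column."""
    n, m = len(D), len(D[0])
    i, _ = seed_rows   # anchor row
    j, j2 = seed_cols  # anchor column j and the second seed column
    # Keep rows whose ratio to the anchor row agrees on both seed columns.
    rows = [r for r in range(n)
            if D[r][j] != 0
            and ratio_close(D[r][j] / D[i][j], D[r][j2] / D[i][j2], tol)]
    # Keep columns whose ratio to the anchor column is nearly constant
    # over every selected row, and equal to the ratio in the anchor row.
    cols = [c for c in range(m)
            if all(ratio_close(D[r][c] / D[r][j], D[i][c] / D[i][j], tol)
                   for r in rows)]
    return rows, cols


# Toy matrix: rows 0-2 and columns 0-2 form an exact rank-1 block.
D = [[1, 2, 3, 7],
     [2, 4, 6, 1],
     [3, 6, 9, 5],
     [5, 1, 2, 8]]
seed = sample_seed(D, tol=1e-9, trials=200, rng=random.Random(0))
if seed is not None:
    rows, cols = expand(D, *seed, tol=1e-9)
    print("rows:", rows, "cols:", cols)
```

In this sketch every accepted entry `D[r][c]` lies within relative tolerance `tol` of `D[r][j] * D[i][c] / D[i][j]`, so the selected submatrix stays close to the rank-$1$ matrix defined by the anchor column and the anchor row. In the toy matrix, any seed drawn from the planted block expands to rows 0-2 and columns 0-2.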
Abstract
The problem of approximating a matrix by a low-rank one has been extensively studied. This problem assumes, however, that the whole matrix has a low-rank structure, an assumption that is often false for real-world matrices. We consider the problem of discovering submatrices of a given matrix that have bounded deviation from their low-rank approximations. We introduce an effective two-phase method for this task: first, we use sampling to discover small nearly low-rank submatrices, and then we expand them while preserving proximity to a low-rank approximation. An extensive experimental evaluation confirms that the method compares favorably to existing approaches.
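To make "deviation from a low-rank approximation" concrete, the following NumPy sketch measures how far a chosen submatrix is from its best rank-$k$ approximation. The helper name `low_rank_deviation` and the two reported error measures are assumptions for illustration; the paper's exact deviation criterion is not stated in this section. The truncated SVD used here is the optimal rank-$k$ approximation in the Frobenius norm (Eckart-Young), but not necessarily in the maximum entrywise error.

```python
import numpy as np


def low_rank_deviation(D, rows, cols, k=1):
    """Return (Frobenius error, max absolute entry error) of the
    truncated-SVD rank-k approximation of the submatrix D[rows, cols]."""
    S = np.asarray(D, dtype=float)[np.ix_(rows, cols)]
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    # Keep only the k leading singular triplets.
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    residual = S - approx
    return float(np.linalg.norm(residual)), float(np.abs(residual).max())


# Toy matrix: rows 0-2 and columns 0-2 form an exact rank-1 block,
# while the full matrix is far from rank 1.
D = [[1, 2, 3, 7],
     [2, 4, 6, 1],
     [3, 6, 9, 5],
     [5, 1, 2, 8]]
block_err = low_rank_deviation(D, [0, 1, 2], [0, 1, 2], k=1)
full_err = low_rank_deviation(D, [0, 1, 2, 3], [0, 1, 2, 3], k=1)
print("planted block:", block_err)
print("full matrix:  ", full_err)
```

On this toy input the planted block has an error at the level of floating-point round-off, while the full matrix has a clearly nonzero error. This is the situation the abstract describes: the matrix as a whole is not low rank, but a submatrix is.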
