Statistical mechanics of the maximum-average submatrix problem

Vittorio Erba, Florent Krzakala, Rodrigo Pérez, Lenka Zdeborová

TL;DR

This work reframes the maximum-average submatrix problem as a mean-field spin-glass model with fixed magnetization, enabling a full replica analysis of its thermodynamics. By solving a 1-RSB variational problem and inspecting stability, the authors map out a rich phase diagram in the submatrix size $m$ and average $a$, including RS, dynamical and static 1-RSB, full-RSB, and UNSAT phases, with a frozen 1-RSB regime emerging as $m \to 0$. They connect the $m \to 0$ limit to REM-like behavior, link the dynamical and static thresholds to known LAS results, and reveal algorithmic implications where methods like the Incremental Greedy Procedure can succeed within certain frozen phases. The study also extends to rectangular MAS via bipartite SK mappings and shows, in the square case, equivalence with the symmetric principal MAS analysis, while leveraging Panchenko’s full-RSB framework for exact asymptotics. Overall, the work provides a rigorous statistical-mechanics portrait of MAS, illuminating phase structure, algorithmic regimes, and cross-model universality that informs both theory and potential heuristics for large-scale biclustering tasks.

Abstract

We study the maximum-average submatrix problem, in which given an $N \times N$ matrix $J$ one needs to find the $k \times k$ submatrix with the largest average of entries. We study the problem for random matrices $J$ whose entries are i.i.d. random variables by mapping it to a variant of the Sherrington-Kirkpatrick spin-glass model at fixed magnetization. We characterize analytically the phase diagram of the model as a function of the submatrix average and the size of the submatrix $k$ in the limit $N\to\infty$. We consider submatrices of size $k = m N$ with $0 < m < 1$. We find a rich phase diagram, including dynamical, static one-step replica symmetry breaking and full-step replica symmetry breaking. In the limit of $m \to 0$, we find a simpler phase diagram featuring a frozen 1-RSB phase, where the Gibbs measure is composed of exponentially many pure states each with zero entropy. We discover an interesting phenomenon, reminiscent of the phenomenology of the binary perceptron: there exist efficient algorithms that provably work in the frozen 1-RSB phase.
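To make the objective in the abstract concrete, the following minimal sketch evaluates the average of a $k \times k$ submatrix and finds the maximum by exhaustive search. It is an illustration only, not the method analyzed in the paper: the function names `submatrix_average` and `brute_force_mas` are hypothetical, the entries are assumed to be standard Gaussian (the abstract only requires i.i.d. entries), and rows and columns are chosen independently. The search is exponential in $N$, so it is usable only for very small matrices.

```python
from itertools import combinations

import numpy as np


def submatrix_average(J, rows, cols):
    """Average of the entries of J restricted to the given rows and columns."""
    return J[np.ix_(rows, cols)].mean()


def brute_force_mas(J, k):
    """Exhaustively search all k x k submatrices of a square matrix J.

    Hypothetical helper for illustration; the cost grows like C(N, k)^2.
    """
    n = J.shape[0]
    best_avg, best_rows, best_cols = -np.inf, None, None
    for rows in combinations(range(n), k):
        for cols in combinations(range(n), k):
            avg = submatrix_average(J, rows, cols)
            if avg > best_avg:
                best_avg, best_rows, best_cols = avg, rows, cols
    return best_avg, best_rows, best_cols


# Assumption: i.i.d. standard Gaussian entries, N = 6, k = 2 (so m = k/N = 1/3).
rng = np.random.default_rng(0)
J = rng.standard_normal((6, 6))
best_avg, rows, cols = brute_force_mas(J, 2)
print(best_avg, rows, cols)
```

Since the mean over all $k \times k$ submatrix averages equals the mean of the full matrix, the returned maximum is never below `J.mean()`, which gives a quick sanity check on the output.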

Paper Structure (34 sections, 2 theorems, 130 equations, 3 figures)

Key Result

Proposition 1

Consider the integral ($p, \ell \in {\mathbb{R}}$) where $Dv$ is the standard Gaussian measure. Define and suppose that $B_1 > 0$ (if $B_1 < 0$, change variable $v \to -v$ in the integral) and $|A_2| < +\infty$. Then, as $N \to \infty$, the leading order of $I_{p, \ell}(A, B)$ satisfies for $p < \ell$, and for $p \geq \ell$, where and $G(+\infty) = 1$.

Figures (3)

  • Figure 1: The phase diagram of the MAS problem as a function of the submatrix average $a$ and the submatrix size $m = k/N$ for linear scale in $m$ (left), logarithmic scale in $m$ (center) and $m \to 0$ (right). In the central and right panels we rescale the submatrix average as $a / \sqrt{\log(1/m)/m}$ to highlight the convergence to the limit. We identify five distinct phases. In the RS phase (green) the system is replica symmetric. In the 1-RSB phases replica symmetry is broken to one step and two sub-phases exist: a dynamical 1-RSB phase with an extensive number of equilibrium pure states (blue) and a static 1-RSB phase with only finitely many pure states (purple). All the phase boundaries are exact in the thermodynamic limit except the boundary between full-RSB (orange) and UNSAT (red), which would require solving the full-RSB equations. In the full-RSB phase (orange) replica symmetry is completely broken and the set of pure states manifests ultrametricity. In the unsatisfiable phase (UNSAT, red) no submatrix exists with the given values of $a$ and $m$. The transition from RS and 1-RSB to full-RSB is continuous and caused by an instability of the 1-RSB ansatz (dashed line), while the other transitions are discontinuous. We observe two tricritical points, one at $(m_c, a_c)$ where the system shows coexistence of the RS, 1-RSB and full-RSB phases (white marker), and one at $(m^*, a^*)$ where the largest-average submatrices become 1-RSB stable and the full-RSB region ceases to exist (black marker). In the limit $m \to 0$, we observe only the RS, 1-RSB and UNSAT phases. The 1-RSB phase is frozen, meaning that the internal entropy of each pure state goes to zero in the $m \to 0$ limit.
  • Figure S1: The phase diagram of the MAS problem as a function of the inverse temperature $\beta$ and the submatrix size $m = k/N$ for linear scale in $m$ (left) and logarithmic scale in $m$ (right). On the right we rescaled the inverse temperature to highlight the convergence to the $m \to 0$ limit.
  • Figure S2: Behavior of the internal entropy (left panel, solid lines), total entropy (left panel, dashed lines) and the submatrix average (right panel) as a function of the rescaled inverse temperature $\beta / \sqrt{\log(1/m) / m}$ for various values of $m$. In black, we plot the analytical predictions for the $m \to 0$ limit. Both observables converge to the $m \to 0$ limit extremely slowly, consistent with our analysis, which shows that the next-to-leading order in the $m \ll 1$ expansion is only logarithmically small (see SM). Moreover, the internal entropy in the dynamical 1-RSB region (where it differs from the total entropy due to the non-vanishing complexity) decreases as $m$ goes to zero, foreshadowing the frozen 1-RSB phase that arises in the limit $m \to 0$.

Theorems & Definitions (4)

  • Proposition 1
  • proof
  • Proposition 2
  • proof