
An Algorithm to Recover Shredded Random Matrices

Caelan Atamanchuk, Luc Devroye, Massimo Vicenzo

TL;DR

The paper addresses reconstructing an $n\times n$ binary matrix from unordered multisets of its rows and columns under a random Bernoulli$(p)$ model. It introduces a two-part, trie-based algorithm: Part One uses a Hamming-weight partition of columns and sub-weight signatures to uniquely identify the row permutation (and hence the column order via a column trie) when possible; Part Two enumerates and validates residual row-permutations consistent with signature multiplicities, outputting permutation groups for columns when duplicates exist. The authors prove that for sufficiently large $p$, the algorithm runs in $O(n^2)$ time with high probability and in expectation, and they establish reconstructibility thresholds showing that a random matrix is reconstructible w.h.p. above $p \approx \frac{2\log n}{n}$, with stronger guarantees in denser regimes. These results connect reconstruction of shredded matrices to broader themes in graph reconstruction, canonization, and shotgun assembly, and they provide a concrete, efficient method for recovering original orderings in random settings.

Abstract

Given some binary matrix $M$, suppose we are presented with the collection of its rows and columns in independent arbitrary orderings. From this information, are we able to recover the unique original orderings and matrix? We present an algorithm that identifies whether there is a unique ordering associated with a set of rows and columns, and outputs either the unique correct orderings for the rows and columns or the full collection of all valid orderings and valid matrices. We show that there is a constant $c > 0$ such that the algorithm terminates in $O(n^2)$ time with high probability and in expectation for random $n \times n$ binary matrices with i.i.d. Bernoulli$(p)$ entries $(m_{ij})_{i,j=1}^n$ such that $\frac{c\log^2(n)}{n(\log\log(n))^2} \leq p \leq \frac{1}{2}$.
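To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's trie-based algorithm: it simply tries every ordering of the shredded rows and keeps those whose columns match the shredded columns as a multiset, so it is only usable for very small $n$. The function names `shred` and `consistent_matrices` are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import permutations


def shred(matrix):
    """Return the rows and columns of a binary matrix as lists of tuples.

    The order of each returned list carries no information in the
    reconstruction problem; only the multisets matter.
    """
    rows = [tuple(r) for r in matrix]
    cols = [tuple(c) for c in zip(*matrix)]
    return rows, cols


def consistent_matrices(rows, cols):
    """Brute force (illustrative only): every matrix consistent with the shreds.

    A candidate is an ordering of the given rows; it is kept when the
    multiset of its columns equals the multiset of the given columns.
    Runs in factorial time, unlike the O(n^2) algorithm in the paper.
    """
    target = Counter(cols)
    found = set()
    for candidate in permutations(rows):
        if candidate in found:
            continue  # duplicate rows produce repeated candidates
        if Counter(zip(*candidate)) == target:
            found.add(candidate)
    return found


def is_reconstructible(rows, cols):
    """True when exactly one matrix is consistent with the shreds."""
    return len(consistent_matrices(rows, cols)) == 1
```

For example, the $2 \times 2$ identity matrix is not reconstructible under this check, because swapping its two rows gives a different matrix with the same row and column multisets, whereas the matrix with rows $(1,1)$ and $(0,1)$ is.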

Paper Structure

This paper contains 10 sections, 10 theorems, 62 equations.

Key Result

Theorem 1

If $p\geq \frac{16(1+\epsilon)\log^2(n)}{n (\log\log(n))^2}$ for $\epsilon > 0$, then the algorithm succeeds in producing a unique reconstruction in $O(n^2)$ time with high probability. Furthermore, if $p\geq \frac{36(1+\epsilon)\log^2(n)}{n (\log\log(n))^2}$ for $\epsilon > 0$, the expected running time of the algorithm is also $O(n^2)$, with the expected number of permutatio...
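To get a feel for the size of these density thresholds, the following small Python sketch evaluates the bound $\frac{C(1+\epsilon)\log^2(n)}{n(\log\log(n))^2}$ for a given $n$. It assumes natural logarithms, which is the usual convention but is not stated in the excerpt above; the function name `density_threshold` and its parameters are hypothetical.

```python
import math


def density_threshold(n, constant, eps=0.0):
    """Evaluate constant * (1 + eps) * log(n)^2 / (n * (log log n)^2).

    Assumes natural logarithms (an assumption, not confirmed by the text).
    Requires n >= 3 so that log(log(n)) is strictly positive.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    log_n = math.log(n)
    return constant * (1 + eps) * log_n ** 2 / (n * math.log(log_n) ** 2)


if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        whp = density_threshold(n, 16)       # high-probability regime
        expected = density_threshold(n, 36)  # expected-running-time regime
        print(f"n={n}: whp bound {whp:.3g}, expectation bound {expected:.3g}")
```

Because the theorem is asymptotic, the bound can exceed $\frac{1}{2}$ for small $n$ (it does at $n = 100$ under the natural-log assumption), in which case the condition is vacuous; it only drops well below $\frac{1}{2}$ for large $n$.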

Theorems & Definitions (17)

  • Theorem 1
  • Lemma 2
  • Lemma 3
  • Proof of Theorem 1
  • Theorem 4
  • Lemma 5
  • Proof of Theorem 4
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 7 more