An Algorithm to Recover Shredded Random Matrices

Caelan Atamanchuk; Luc Devroye; Massimo Vicenzo

TL;DR

The paper addresses reconstructing an $n\times n$ binary matrix from unordered multisets of its rows and columns under a random Bernoulli$(p)$ model. It introduces a two-part, trie-based algorithm: Part One uses a Hamming-weight partition of columns and sub-weight signatures to uniquely identify the row permutation (and hence the column order via a column trie) when possible; Part Two enumerates and validates residual row-permutations consistent with signature multiplicities, outputting permutation groups for columns when duplicates exist. The authors prove that for sufficiently large $p$, the algorithm runs in $O(n^2)$ time with high probability and in expectation, and they establish reconstructibility thresholds showing that a random matrix is reconstructible w.h.p. above $p \approx \frac{2\log n}{n}$, with stronger guarantees in denser regimes. These results connect reconstruction of shredded matrices to broader themes in graph reconstruction, canonization, and shotgun assembly, and they provide a concrete, efficient method for recovering original orderings in random settings.
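The signature idea in Part One can be illustrated with a small sketch. This is only one reading of the summary above, not the authors' implementation: it assumes that a row's signature is the count of its ones inside each Hamming-weight class of columns, and it ignores the trie data structures and the Part Two fallback. The function names (`shred`, `signatures_from_rows`, `signatures_from_cols`, `recover_row_order`) are hypothetical.

```python
import random
from collections import Counter


def shred(matrix, rng):
    """Return the rows and columns of `matrix`, each list shuffled independently."""
    n = len(matrix)
    rows = [tuple(row) for row in matrix]
    cols = [tuple(matrix[i][j] for i in range(n)) for j in range(n)]
    rng.shuffle(rows)
    rng.shuffle(cols)
    return rows, cols


def signatures_from_rows(rows):
    """Signature of each shuffled row: ones counted per column-weight class.

    Entries inside a row keep their true column index, so summing the rows
    coordinate-wise gives the true Hamming weight of every column.
    """
    n = len(rows)
    col_weight = [sum(row[j] for row in rows) for j in range(n)]
    return [
        tuple(sorted(Counter(col_weight[j] for j in range(n) if row[j]).items()))
        for row in rows
    ]


def signatures_from_cols(cols):
    """Signature of each TRUE row position, read off the shuffled columns.

    Entries inside a column keep their true row index, so position i of
    every column belongs to the original row i.
    """
    n = len(cols)
    counts = [Counter() for _ in range(n)]
    for col in cols:
        weight = sum(col)
        for i in range(n):
            if col[i]:
                counts[i][weight] += 1
    return [tuple(sorted(c.items())) for c in counts]


def recover_row_order(rows, cols):
    """Return order with rows[order[i]] equal to original row i, or None if
    two rows share a signature (the case this sketch does not handle)."""
    shuffled = signatures_from_rows(rows)
    if len(set(shuffled)) < len(shuffled):
        return None
    index_of = {sig: k for k, sig in enumerate(shuffled)}
    return [index_of[sig] for sig in signatures_from_cols(cols)]


# Usage on a small matrix whose rows all have distinct signatures.
M = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
rows, cols = shred(M, random.Random(0))
order = recover_row_order(rows, cols)
recovered = [list(rows[k]) for k in order]
```

When the signatures collide, as they do for the $2\times 2$ identity matrix, the sketch returns `None`; per the summary, that is the situation the paper's Part Two is meant to resolve.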

Abstract

Given a binary matrix $M$, suppose we are presented with the collection of its rows and columns in independent arbitrary orderings. From this information, are we able to recover the unique original orderings and matrix? We present an algorithm that identifies whether there is a unique ordering associated with a set of rows and columns, and outputs either the unique correct orderings for the rows and columns or the full collection of all valid orderings and valid matrices. We show that there is a constant $c > 0$ such that the algorithm terminates in $O(n^2)$ time with high probability and in expectation for random $n \times n$ binary matrices with i.i.d. Bernoulli$(p)$ entries $(m_{ij})_{i,j=1}^n$ such that $\frac{c\log^2(n)}{n(\log\log(n))^2} \leq p \leq \frac{1}{2}$.
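To make the problem statement concrete, the following brute-force sketch enumerates every matrix consistent with a given collection of rows and columns. It is not the paper's algorithm: it tries all $n!$ row orderings and is only usable for very small $n$. The function name `all_reconstructions` is hypothetical, and the sketch assumes that entries inside each row and column keep their original positions, so only the order of the rows and the order of the columns is lost.

```python
from collections import Counter
from itertools import permutations


def all_reconstructions(rows, cols):
    """Return the set of matrices (as tuples of row tuples) whose rows are
    the given rows in some order and whose columns form the given multiset."""
    n = len(rows)
    target = Counter(tuple(c) for c in cols)
    found = set()
    for perm in permutations(range(n)):
        candidate = tuple(tuple(rows[k]) for k in perm)
        candidate_cols = Counter(
            tuple(candidate[i][j] for i in range(n)) for j in range(n)
        )
        if candidate_cols == target:
            # A set removes repeats caused by duplicate rows.
            found.add(candidate)
    return found


# An upper-triangular matrix, presented with rows and columns out of order.
tri_rows = [(0, 0, 1), (1, 1, 1), (0, 1, 1)]
tri_cols = [(1, 1, 1), (1, 0, 0), (1, 1, 0)]
tri_solutions = all_reconstructions(tri_rows, tri_cols)

# The 2 x 2 identity matrix, which cannot be told apart from its mirror image.
id_solutions = all_reconstructions([(1, 0), (0, 1)], [(0, 1), (1, 0)])
```

The first input has a single consistent matrix, while the second has two, which is the distinction between the "unique ordering" output and the "full collection of all valid orderings" output described in the abstract.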

Paper Structure (10 sections, 10 theorems, 62 equations)

Introduction
Related Work and Motivation
The Reconstruction Algorithm
Part One
Part Two
An Example
Time Complexity
Main Result
Unique Reconstructibility
Proofs of Lemmas

Key Result

Theorem 1

If $p\geq \frac{16(1+\epsilon)\log^2(n)}{n (\log\log(n))^2}$ for $\epsilon > 0$, then the algorithm succeeds in producing a unique reconstruction in $O(n^2)$ time with high probability. Furthermore, if $p\geq \frac{36(1+\epsilon)\log^2(n)}{n (\log\log(n))^2}$ for $\epsilon > 0$, the expected running time of the algorithm is also $O(n^2)$, with the expected number of permutatio
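To get a feel for the two density conditions in Theorem 1, the sketch below evaluates the lower bound $\frac{C(1+\epsilon)\log^2(n)}{n(\log\log(n))^2}$ for $C = 16$ and $C = 36$. The helper name `density_threshold` and the choice $\epsilon = 0.1$ are illustrative assumptions, and $\log$ is taken to be the natural logarithm, which the summary does not state explicitly.

```python
import math


def density_threshold(n, constant, eps=0.1):
    """Evaluate constant * (1 + eps) * log(n)^2 / (n * (log log n)^2).

    Assumes natural logarithms and n >= 3 so that log(log(n)) is positive.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    log_n = math.log(n)
    return constant * (1 + eps) * log_n ** 2 / (n * math.log(log_n) ** 2)


# Compare the high-probability bound (C = 16) with the expectation bound (C = 36).
for n in (10 ** 4, 10 ** 6, 10 ** 8):
    whp = density_threshold(n, constant=16)
    expectation = density_threshold(n, constant=36)
    print(f"n={n:>9}  whp bound={whp:.3e}  expectation bound={expectation:.3e}")
```

The two bounds differ only by the constant factor $36/16$. For small $n$ the computed value can exceed $\frac{1}{2}$, in which case the stated range of $p$ is empty and the theorem says nothing at that size.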

Theorems & Definitions (17)

Theorem 1
Lemma 2
Lemma 3
Proof of Theorem \ref{alg}
Theorem 4
Lemma 5
Proof of Theorem \ref{thresh}
Lemma 2
proof
Lemma 3
...and 7 more
