Alternating minimization algorithm with initialization analysis for r-local and k-sparse unlabeled sensing

Ahmed Abbasi; Shuchin Aeron; Abiy Tasissa

TL;DR

This work addresses unlabeled sensing where measurements are scrambled by an unknown permutation: $\mathbf{Y} = \mathbf{P}^* \mathbf{B} \mathbf{X}^* + \mathbf{W}$. It introduces an alternating minimization algorithm with initialization strategies tailored to two structured permutation models: $r$-local (block-diagonal) and $k$-sparse permutations, and provides rigorous initialization-error bounds under Gaussian or sub-Gaussian assumptions. For $r$-local permutations, the initialization error decays with increasing block count via a Johnson-Lindenstrauss-type analysis, yielding sub-Gaussian tails in terms of $(d-s)$; for $k$-sparse permutations, the error scales with $k/n$ under a tall-Gaussian $\mathbf{B}$ and decays with a sub-exponential tail. Empirically, AltMin is fast, robust to the measurement matrix, and outperforms several baselines on synthetic and real datasets, highlighting the practical viability of structured unlabeled sensing and motivating future work on convergence-rate guarantees.

Abstract

Unlabeled sensing is a linear inverse problem with permuted measurements. We propose an alternating minimization (AltMin) algorithm with a suitable initialization for two widely considered permutation models: partially shuffled/$k$-sparse permutations and $r$-local/block diagonal permutations. Key to the performance of the AltMin algorithm is the initialization. For the exact unlabeled sensing problem, assuming either a Gaussian measurement matrix or a sub-Gaussian signal, we bound the initialization error in terms of the number of blocks $s$ and the number of shuffles $k$. Experimental results show that our algorithm is fast, applicable to both permutation models, and robust to choice of measurement matrix. We also test our algorithm on several real datasets for the linked linear regression problem and show superior performance compared to baseline methods.
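The alternating scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles a single measurement vector (one column of $\mathbf{X}^*$), so the permutation step reduces to sorting, and it starts from the identity permutation, which is a placeholder and not necessarily the initialization analyzed in the paper. The names `altmin` and `match_by_sorting` are hypothetical.

```python
import numpy as np


def match_by_sorting(y, y_hat):
    """Exact 1-D assignment: pair the i-th smallest y with the i-th smallest y_hat.

    Returns pi such that y[i] is matched to y_hat[pi[i]].
    """
    pi = np.empty(len(y), dtype=int)
    pi[np.argsort(y)] = np.argsort(y_hat)
    return pi


def altmin(y, B, r=None, n_iter=20):
    """Alternate between a permutation update and a least-squares update.

    r=None searches over all permutations; an integer r restricts the
    search to consecutive blocks of size r (the r-local model).
    """
    n = len(y)
    block = n if r is None else r
    pi = np.arange(n)  # assumed starting point: identity permutation
    x = np.linalg.lstsq(B, y, rcond=None)[0]
    history = [float(np.sum((y - B[pi] @ x) ** 2))]
    for _ in range(n_iter):
        y_hat = B @ x
        # Permutation step: best row matching for the current x, block by block.
        for start in range(0, n, block):
            stop = min(start + block, n)
            pi[start:stop] = start + match_by_sorting(y[start:stop], y_hat[start:stop])
        # Signal step: least squares with the rows of B reordered by pi.
        x = np.linalg.lstsq(B[pi], y, rcond=None)[0]
        history.append(float(np.sum((y - B[pi] @ x) ** 2)))
    return pi, x, history


# Small noise-free r-local demo on synthetic Gaussian data.
rng = np.random.default_rng(0)
n, d, r = 60, 5, 10
B = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
pi_true = np.concatenate([s + rng.permutation(r) for s in range(0, n, r)])
y = (B @ x_true)[pi_true]  # y = P* B x*
pi_hat, x_hat, history = altmin(y, B, r=r)
print("objective per iteration:", np.round(history, 4))
```

Each step minimizes the same objective over one variable, so the recorded objective cannot increase. Whether the iterates reach the true permutation depends on the starting point, which is why the abstract singles out initialization. For several measurement vectors the sorting step would have to be replaced by a general linear assignment solver.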

Paper Structure (26 sections, 9 theorems, 42 equations, 3 figures, 3 tables, 2 algorithms)

Introduction
Contributions and outline
Outline:
Notation
Related Work
Theory and algorithms
Inference on unlabeled data
Applications of unlabeled sensing
Technical background
Algorithm
Initialization analysis
Analysis for $r$-local permutation
Analysis for $k$-sparse permutation
Results
Baselines.
...and 11 more sections

Key Result

Theorem 2.1

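Let $\mathbf{\Sigma} = \mathbf{A}^{\intercal}\mathbf{A}$ be a positive semi-definite matrix. Let $\mathbf{x} = (x_1,\cdots,x_d)$ be a zero-mean sub-Gaussian random vector, i.e., for all $\alpha \in \mathbb{R}^d$ and some $K \geq 0$, $\mathbb{E}[\exp(\alpha^{\intercal}\mathbf{x})] \leq \exp(\lVert\alpha\rVert_2^2 K^2/2)$. For $t \geq 0$, $\Pr\left[\mathbf{x}^{\intercal}\mathbf{\Sigma}\mathbf{x} > K^2\left(\operatorname{tr}(\mathbf{\Sigma}) + 2\sqrt{\operatorname{tr}(\mathbf{\Sigma}^2)\,t} + 2\lVert\mathbf{\Sigma}\rVert t\right)\right] \leq e^{-t}$.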
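As a rough numerical illustration of this kind of quadratic-form tail bound, the sketch below compares an empirical tail frequency with $e^{-t}$ for a standard Gaussian vector, for which the sub-Gaussian condition holds with $K = 1$. It assumes the threshold has the form given in Theorem 2.1 of hsu2012tail, $K^2(\operatorname{tr}(\mathbf{\Sigma}) + 2\sqrt{\operatorname{tr}(\mathbf{\Sigma}^2)t} + 2\lVert\mathbf{\Sigma}\rVert t)$. The function names and the test matrix are hypothetical.

```python
import numpy as np


def hanson_wright_threshold(A, K, t):
    """K^2 * (tr(S) + 2*sqrt(tr(S^2)*t) + 2*||S||*t) with S = A^T A."""
    S = A.T @ A
    spectral_norm = np.linalg.norm(S, 2)  # largest singular value of S
    return K**2 * (np.trace(S) + 2.0 * np.sqrt(np.trace(S @ S) * t)
                   + 2.0 * spectral_norm * t)


def empirical_tail(A, t, n_samples=20000, seed=0):
    """Fraction of standard Gaussian draws x with ||A x||^2 above the threshold."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, A.shape[1]))  # rows are draws; K = 1
    quad = np.sum((X @ A.T) ** 2, axis=1)  # x^T S x for each draw
    return float(np.mean(quad > hanson_wright_threshold(A, 1.0, t)))


# Arbitrary Gaussian test matrix, chosen only for illustration.
A = np.random.default_rng(1).standard_normal((8, 5))
for t in (0.5, 1.0, 2.0):
    print(f"t={t}: empirical tail {empirical_tail(A, t):.4f}, bound {np.exp(-t):.4f}")
```

The empirical frequency is a Monte Carlo estimate, so it only suggests how loose the bound is for this particular matrix.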

Figures (3)

Figure 1: Left. Sparse (or partially shuffled) permutation considered in snr, slawski_two_stage, zhang2019permutation, slawski2020sparse, with number of shuffles $k=10$. Right. The $r$-local permutation structure considered in ojsp, icassp_r_local, wang2023regularization, with block size $r=10$. In this paper, we propose a general algorithm for both permutation models.
Figure 2: Record linkage with blocking (murray2016probabilistic, graph_rl) assigns records in File A to records in File B, up to blocks. For example, records matching on identifiers such as 'city' and 'occupation' (left) are assigned to the same block (right). Linked linear regression (lahiri2005regression) fits a regression model on such block-permuted data. See the Results section for results on real datasets. The figure on the left is adapted from Figure 1 in graph_rl and the figure on the right is adapted from Figure 1 in murray2016probabilistic.
Figure 3: $\mathbf{Y} = \mathbf{P}^*\mathbf{B}_{n \times d} \mathbf{X}^*_{d \times m} + \mathbf{W}$. In panels (a)-(d), the normalized Hamming distortion $d_H/n$ is plotted on the y-axis against block size $r$ (a) and the number of shuffles $k$ (b, c, d). Hamming distortion $d_H$ is the number of mismatches in the estimate $\widehat{\mathbf{P}}$ of $\mathbf{P}^*$ and is defined as $d_H = \sum_{i}\mathbb{1} (\widehat{\mathbf{P}}(i) \neq \mathbf{P}^*(i))$, where $\mathbf{P}(i)$ denotes the column index of the $1$ entry in the $i^{th}$ row of the permutation matrix $\mathbf{P}$. A lower value of Hamming distortion is better.
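The two permutation models in Figure 1 and the Hamming distortion used in Figure 3 can be sketched in a few lines. A permutation is stored as an index array `pi`, with `pi[i]` the column index of the 1 entry in row `i`, matching the caption's $\mathbf{P}(i)$. The function names are hypothetical, and the samplers are one plausible way to draw such permutations, not necessarily the procedure used in the paper's experiments.

```python
import numpy as np


def r_local_permutation(n, r, rng):
    """Shuffle indices independently inside consecutive blocks of size r."""
    pi = np.arange(n)
    for start in range(0, n, r):
        pi[start:start + r] = rng.permutation(pi[start:start + r])
    return pi


def k_sparse_permutation(n, k, rng):
    """Shuffle k randomly chosen positions among themselves; fix the rest.

    A random shuffle can leave some chosen positions in place, so the
    number of displaced entries is at most k.
    """
    pi = np.arange(n)
    idx = rng.choice(n, size=k, replace=False)
    pi[idx] = rng.permutation(idx)
    return pi


def hamming_distortion(pi_hat, pi_true):
    """d_H: number of rows where the estimated and true permutations differ."""
    return int(np.sum(np.asarray(pi_hat) != np.asarray(pi_true)))


rng = np.random.default_rng(0)
n = 100
identity = np.arange(n)
pi_local = r_local_permutation(n, r=10, rng=rng)
pi_sparse = k_sparse_permutation(n, k=10, rng=rng)
# Normalized distortion d_H / n of the identity guess against each model.
print("r-local:", hamming_distortion(identity, pi_local) / n)
print("k-sparse:", hamming_distortion(identity, pi_sparse) / n)
```

Under the $k$-sparse model the identity guess is wrong in at most $k$ rows, while under the $r$-local model it can be wrong in most rows even though every mismatch stays inside its block.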

Theorems & Definitions (15)

Theorem 2.1: Hanson-Wright inequality, Theorem 2.1 in hsu2012tail
Lemma 2.2: Johnson-Lindenstrauss Lemma, Lemma 5.3.2 in vershynin2018high
Theorem 2.3: Hoeffding's inequality, Theorem 2.6.2 in vershynin2018high
Lemma 2.4: Tail inequality for $\chi^2_D$ distributed random variables, Lemma 1 in 10.1214/aos/Laurent
Theorem 4.1
proof
Theorem 4.2
Lemma 4.3
proof
Definition 4.4: rudelson2009smallest
...and 5 more
