Using oriented matroids to find low rank structure in presence of nonlinearity

Caitlin Lienkaemper

TL;DR

Monotone rank is difficult to compute: deciding whether a matrix has monotone rank two is already NP-hard. The paper introduces an "oriented matroid completion" problem as a combinatorial relaxation of the monotone rank problem and shows that checking whether a set of sign vectors has matroid completion rank two is easy.

Abstract

Estimating the linear dimensionality of a data set in the presence of noise is a common problem. However, data may also be corrupted by monotone nonlinear distortion that preserves the ordering of matrix entries but causes linear methods for estimating rank to fail. In light of this, we consider the problem of computing \emph{underlying rank}, which is the lowest rank consistent with the ordering of matrix entries, and \emph{monotone rank}, which is the lowest rank consistent with the ordering within columns. We show that each matrix of monotone rank $d$ corresponds to a point arrangement and a hyperplane arrangement in $\mathbb R^{d}$, and that the ordering within columns of the matrix can be used to recover information about these arrangements. Using Radon's theorem and the related concept of the VC dimension, we can obtain lower bounds on the monotone rank of a matrix. However, we also show that the monotone rank of a matrix can exceed these bounds. In order to obtain better bounds on monotone rank, we develop the connection between monotone rank estimation and oriented matroid theory. Using this connection, we show that monotone rank is difficult to compute: the problem of deciding whether a matrix has monotone rank two is already NP-hard. However, we introduce an "oriented matroid completion" problem as a combinatorial relaxation of the monotone rank problem and show that checking whether a set of sign vectors has matroid completion rank two is easy.
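
The effect of monotone distortion described above can be illustrated with a minimal NumPy sketch. The points `P`, the normals `H`, and the choice of `exp` as the distortion are hypothetical values picked for illustration, not data from the paper. The sketch builds a rank-two matrix `B` with entries $p_i \cdot h_j$, distorts it with an increasing function, and compares numerical rank and within-column ordering.

```python
import numpy as np

# Hypothetical points p_i (rows of P) and hyperplane normals h_j (columns of H) in R^2.
P = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [3.0, 1.0],
              [4.0, 4.0]])
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

B = P @ H        # B[i, j] = p_i . h_j, so B has rank at most 2
A = np.exp(B)    # assumption: the same increasing distortion for every column

rank_B = np.linalg.matrix_rank(B)
rank_A = np.linalg.matrix_rank(A)

# Sorting column j lists the points in the order in which a hyperplane
# with normal h_j sweeps past them.
order_B = np.argsort(B, axis=0)
order_A = np.argsort(A, axis=0)
same_order = np.array_equal(order_A, order_B)

print(rank_B, rank_A, same_order)
```

In this example the distorted matrix has a higher numerical rank than `B`, while the ordering within each column is unchanged, so `A` still has monotone rank at most two.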

Paper Structure (16 sections, 27 theorems, 61 equations, 7 figures, 1 algorithm)

Key Result

Theorem 1

For any matrix $A$, the Radon rank and the VC rank are both lower bounds for the monotone rank.

Figures (7)

  • Figure 1: (a) The matrix $A$ from Example \ref{ex:distortion}. Since $A$ has monotone rank two, by Observation \ref{obs:sweep_orders} there exist points $\mathcal{P} = \{p_1, p_2, p_3, p_4 \}\subset \mathbb R^2$, hyperplane normals $\mathcal{H} = \{h_1, h_2, h_3\} \subset \mathbb R^2$, and monotone functions $\mathcal{F} = \{f_1, f_2, f_3\}$ such that $A_{ij} = f_j(p_i \cdot h_j)$. (b) The point arrangement $\mathcal{P}$. The order of entries in the second column of $B$ matches the order in which a hyperplane with normal vector $h_2$ sweeps past the points $p_1, p_2, p_3, p_4$. (c) The hyperplane arrangement $\mathcal{H}$. The hyperplanes $H_1, H_2,$ and $H_3$ have normal vectors $h_1, h_2,$ and $h_3$.
  • Figure 2: Rotating a vector $v$ to obtain different sweep permutations of $\mathcal{P}$. (a) The sweep permutation $\pi_v = (3214)$. (b) The sweep permutation $\pi_v = (3124)$. (c) The sweep permutation $\pi_v = (3124)$.
  • Figure 3: The threshold vectors $\Sigma_{\mathop{\mathrm{thresh}}\nolimits}(A)$. (a) We have $\sigma_{2}(3) = +--+$, since only the first and fourth entries in the second column of $A$ are above the threshold 3. (b) Only the points $p_1$ and $p_4$ are on the positive side of the illustrated hyperplane. The illustrated hyperplane has normal vector $h_2$.
  • Figure 4: Two point arrangements with the same set of topes, $\{+, -\}^4 \setminus \{+-+-, -+-+\}$, but differing sets of sweep permutations. (a) 4132 and 3214 are permutations of $\Pi(\mathcal{P})$, but not $\Pi(\mathcal{P}')$. (b) 1432 and 2341 are permutations of $\Pi(\mathcal{P}')$, but not $\Pi(\mathcal{P})$. As we slide the point $p_4$ along the line connecting it to $p_3$, the change in $\Pi(\mathcal{P})$ happens when $p_4$ crosses the dashed line, and the line segment connecting $p_1$ to $p_4$ becomes parallel to the line segment connecting $p_2$ to $p_3$.
  • Figure 5: The difference vectors $\Sigma_{\mathop{\mathrm{diff}}\nolimits}(A)$. (a) We have $\sigma_{13} = ++-$, since $a_{11} - a_{31} > 0, a_{12} - a_{32} >0, a_{13} - a_{33} < 0$. (b) The point $p_1 - p_3$ lies in the chamber $++-$, on the positive sides of $H_1$ and $H_2$ and the negative side of $H_3$.
  • ...and 2 more figures
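
The threshold vectors of Figure 3 and the difference vectors of Figure 5 can be sketched as follows. The matrix `A` below is a hypothetical $4 \times 3$ example chosen to reproduce the sign patterns quoted in the captions; it is not the paper's matrix. The helper names `signs`, `threshold_vector`, and `difference_vector` are illustrative, indices are 0-based, and treating a zero difference as `-` is an assumption of this sketch.

```python
import numpy as np

# Hypothetical matrix; rows correspond to points, columns to hyperplanes.
A = np.array([[3, 5, 1],
              [2, 1, 2],
              [1, 2, 4],
              [4, 4, 3]])

def signs(values):
    """Encode strictly positive entries as '+' and all others as '-' (assumption)."""
    return "".join("+" if v > 0 else "-" for v in values)

def threshold_vector(A, j, t):
    """Sign vector marking which entries of column j (0-based) lie above threshold t."""
    return signs(A[:, j] - t)

def difference_vector(A, i, k):
    """Sign vector of row i minus row k (0-based), one sign per column."""
    return signs(A[i, :] - A[k, :])

print(threshold_vector(A, 1, 3))   # second column, threshold 3
print(difference_vector(A, 0, 2))  # row 1 minus row 3
```

With this matrix the two calls return `+--+` and `++-`, matching the patterns $\sigma_{2}(3)$ and $\sigma_{13}$ described in the captions.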

Theorems & Definitions (71)

  • Example 1
  • Definition 2
  • Definition 3
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Lemma 5: piziak1999full
  • Definition 7
  • ...and 61 more