Line Cover and Related Problems

Matthias Bentert, Fedor V. Fomin, Petr A. Golovach, Souvik Saha, Sanjay Seetharaman, Kirill Simonov, Anannya Upasana

Abstract

We study extensions of the classic \emph{Line Cover} problem, which asks whether a set of $n$ points in the plane can be covered using $k$ lines. Line Cover is known to be NP-hard, and we focus on two natural generalizations. The first is \textbf{Line Clustering}, where the goal is to find $k$ lines minimizing the sum of squared distances from the input points to their nearest line. The second is \textbf{Hyperplane Cover}, which asks whether $n$ points in $\mathbb{R}^d$ can be covered by $k$ hyperplanes. We also study the more general \textbf{Projective Clustering} problem, which unifies both settings and has applications in machine learning, data analysis, and computational geometry. In this problem, one seeks $k$ affine subspaces of dimension $r$ that minimize the sum of squared distances from the given points in $\mathbb{R}^d$ to the nearest subspace. Our results reveal notable differences in the parameterized complexity of these problems. While Line Cover is fixed-parameter tractable when parameterized by $k$, we show that Line Clustering is W[1]-hard with respect to $k$ and does not admit an algorithm with running time $n^{o(k)}$ unless the Exponential Time Hypothesis fails. Hyperplane Cover has been known to be NP-hard since the 1980s, following work of Megiddo and Tamir, even for $d=2$. We show that it remains NP-hard even when $k=2$. Finally, we present an algorithm for Projective Clustering running in $n^{O(dk(r+1))}$ time. This bound matches our lower bound for Line Clustering and generalizes the classic algorithm for $k$-Means Clustering ($r=0$) by Inaba, Katoh, and Imai [SoCG 1994].
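
To make the Line Clustering objective from the abstract concrete, the following minimal sketch evaluates the cost of a given set of lines: each point pays the squared distance to its nearest line, and the costs are summed. The line encoding `(a, b, c)` for $ax + by = c$, the function names, and the sample points are assumptions chosen for illustration and do not come from the paper. The sketch only evaluates a candidate solution; it does not search for an optimal one.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
# Assumed encoding: (a, b, c) represents the line a*x + b*y = c.
Line = Tuple[float, float, float]


def squared_distance(p: Point, line: Line) -> float:
    """Squared Euclidean distance from point p to the given line."""
    a, b, c = line
    norm_sq = a * a + b * b
    if norm_sq == 0:
        raise ValueError("a and b must not both be zero")
    x, y = p
    return (a * x + b * y - c) ** 2 / norm_sq


def line_clustering_cost(points: Sequence[Point], lines: Sequence[Line]) -> float:
    """Sum over all points of the squared distance to the nearest line."""
    return sum(min(squared_distance(p, ln) for ln in lines) for p in points)


# Hypothetical instance: five points and the two lines y = 0 and y = 3.
points = [(0, 0), (1, 0), (2, 1), (0, 3), (4, 3)]
lines = [(0, 1, 0), (0, 1, 3)]
cost = line_clustering_cost(points, lines)
print(cost)
```

In this example only the point $(2, 1)$ lies off both lines, at distance $1$ from $y = 0$. A cost of zero means every point lies on some line, which is how Line Cover appears as the zero-cost special case of Line Clustering.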

Paper Structure

This paper contains 8 sections, 8 theorems, 18 equations, 6 figures.

Key Result

Theorem 1

Line Clustering is W[1]-hard parameterized by $k$. Assuming the ETH, it cannot be solved in $n^{o(k)}$ time.

Figures (6)

  • Figure 1: An instance of Line Clustering with $k=3$ and a solution (the three colored lines). The points are colored based on the color of the nearest line in the selected solution. The dashed lines show the region where points have equal distance from two solution lines.
  • Figure 2: A simple input instance of Regular Multicolored Independent Set with $\ell =2$, $\nu = 3$, and $q=1$ on the left. The right shows a set of points such that any set of $\ell$ vertical and $\ell$ horizontal lines covering at least $\ell^2\nu + \ell(\nu-1+q)=18$ points corresponds to a colorful independent set for the left instance. The colored dashed lines represent the independent set $\{a,e\}$. Note that whenever a vertical and a horizontal line of the solution cross, there are no points on the intersection.
  • Figure 3: The updated construction (for the same instance as in Figure 2). The violet lines represent lines that are part of any optimal solution and the blue dots with numbers next to them represent a collection of points at the same position. The value of $d_{s}$ is at least $500$. The colored dashed lines (together with the violet lines) represent an optimal solution corresponding to the independent set $\{a,e\}$. The horizontal line representing vertex $a$ is the closest line in the solution to $4$ points each on the horizontal lines representing $b$ and $c$. Since the distance to points on the line for $b$ is one, this adds a cost of four to the budget. Points on the line for $c$ have distance $2$, so each point contributes $4$. Summed up over all of these points, this gives an additional budget of $20$, which is balanced out by the blue point on the line representing $a$. For the line representing $e$, there are $8$ points at distance one (four each on the lines for $d$ and $f$). Note that the blue point on the line for $e$ represents $8$ points. Figure not drawn to scale due to the orders of magnitude difference in the distances.
  • Figure 4: An illustration of the construction of points in $F$ for $\ell = 3$ (and ${k = 10}$). The blue dots represent $n^{90} + 1$ points at the same coordinate. The four purple lines show the fixed lines and the rest of the construction happens in the box in the middle. Figure not drawn to scale as $d_{\ell}$ is larger than $d_{s}$ by many orders of magnitude.
  • Figure 5: An illustration of the construction of points in $S \setminus F$ for an instance with $\ell = 2$ and $\nu = 3$ vertices of each color. Black dots represent $p$ points at the same coordinate, blue dots represent roughly $W$ points at the same coordinate, and the violet lines represent the fixed lines. Figure not drawn to scale since $d_{s}$ is too big.
  • ...and 1 more figure
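
The numbers quoted in the captions of Figures 2 and 3 can be checked by direct substitution. The following worked arithmetic uses only the values stated in those captions ($\ell = 2$, $\nu = 3$, $q = 1$, and the listed point counts and distances) and is meant as a reading aid rather than as part of the paper's proof.

$$
\begin{aligned}
\text{Figure 2 (covered points):}\quad & \ell^2\nu + \ell(\nu - 1 + q) = 2^2 \cdot 3 + 2 \cdot (3 - 1 + 1) = 12 + 6 = 18,\\
\text{Figure 3 (line for } a\text{):}\quad & 4 \cdot 1^2 + 4 \cdot 2^2 = 4 + 16 = 20,\\
\text{Figure 3 (line for } e\text{):}\quad & 8 \cdot 1^2 = 8.
\end{aligned}
$$

The last value agrees with the caption's remark that the blue point on the line for $e$ represents $8$ points.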

Theorems & Definitions (11)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 3
  • proof
  • Theorem 3
  • proof
  • Proposition 1: 10.5555/1197095
  • Proposition 2
  • Theorem 3
  • ...and 1 more