General Position Subset Selection in Line Arrangements
Adrian Dumitrescu
TL;DR
This work studies General Position Subset Selection (GPSS): given $n$ points in the plane, find a largest subset with no three points collinear. The problem is NP-hard and APX-hard. The paper obtains improved approximations in three settings, using probabilistic methods and incidence geometry: (i) a constant-factor approximation for $\alpha$-dense lattice point sets, with factor $c(\alpha)=\Omega(\alpha^{-2})$; (ii) an $\Omega((\log n)^{-1/2})$-approximation for the vertex sets of generic line arrangements; and (iii) an $\Omega((\log n)^{-1/2})$-approximation when $\ell(P)=O(\sqrt{n})$ and $\kappa(P)=O(\sqrt{n})$. The main techniques are grid-based, Vandermonde-inspired partitions and a two-step random sampling with deletion in line arrangements, supported by Beck-type incidence bounds and duality. Together, these results show that structured inputs admit substantially stronger GPSS guarantees than the general $\Omega(n^{-1/2})$ bound, which is relevant for selecting large general-position subsets in lattice-like and line-arrangement geometries.
Abstract
Given a set of points in the plane, the \textsc{General Position Subset Selection} problem is that of finding a maximum-size subset of points in general position, i.e., with no three points collinear. The problem is known to be ${\rm NP}$-complete and ${\rm APX}$-hard, and the best approximation ratio known is $\Omega\left({\rm OPT}^{-1/2}\right) = \Omega(n^{-1/2})$. Here we obtain better approximations in three special cases: (I) A constant-factor approximation for the case where the input set $P$ consists of lattice points and is \emph{dense}, which means that the ratio between the maximum and the minimum distance in $P$ is $\Theta(\sqrt{n})$. (II) An $\Omega\left((\log{n})^{-1/2}\right)$-approximation for the case where the input set is the set of vertices of a \emph{generic} $n$-line arrangement, i.e., one with $\Omega(n^2)$ vertices. The scenario in (I) is a special case of that in (II). (III) An $\Omega\left((\log{n})^{-1/2}\right)$-approximation for the case where the input set has at most $O(\sqrt{n})$ points collinear and can be covered by $O(\sqrt{n})$ lines. Our approximations rely on probabilistic methods and results from incidence geometry.
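To make the problem statement concrete, the following minimal Python sketch tests general position and greedily grows a maximal subset with no three collinear points. It is only an illustration of the baseline and not the method of this paper: the function names (`collinear`, `in_general_position`, `greedy_general_position`) are hypothetical, and the input is assumed to consist of distinct points with integer coordinates so that the collinearity test is exact. A maximal subset of size $k$ found this way satisfies ${\rm OPT} \le k + 2\binom{k}{2} = k^2$, since every remaining point lies on a line through two chosen points and each such line contains at most two points of an optimal solution; this is one standard way to see a guarantee of the form $\Omega\left({\rm OPT}^{-1/2}\right)$.

```python
from itertools import combinations


def collinear(p, q, r):
    """Exact test: True if integer points p, q, r lie on a common line."""
    return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])


def in_general_position(points):
    """True if no three of the given points are collinear."""
    return not any(collinear(p, q, r) for p, q, r in combinations(points, 3))


def greedy_general_position(points):
    """Scan points in input order (assumed distinct) and keep each point that
    forms no collinear triple with two already chosen points.
    The result is maximal, not necessarily maximum."""
    chosen = []
    for p in points:
        if all(not collinear(a, b, p) for a, b in combinations(chosen, 2)):
            chosen.append(p)
    return chosen


# Illustrative input: the 4 x 4 integer grid.
grid = [(x, y) for x in range(4) for y in range(4)]
subset = greedy_general_position(grid)
print(len(subset), in_general_position(subset))
```

The greedy scan takes $O(n k^2)$ collinearity tests for an output of size $k$; the order in which points are scanned affects the size of the result but not the $k^2 \ge {\rm OPT}$ bound.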
