Robust Classification of Dynamic Bichromatic Point Sets in $\mathbb{R}^2$
Erwin Glazenburg, Frank Staals, Marc van Kreveld
TL;DR
The paper addresses robust linear separation of bichromatic point sets in the plane, allowing at most $k$ misclassified points and supporting dynamic updates. It introduces the concepts $M_{\rm mis}(s)$, $S_k(B \cup R)$, and the distance function $\operatorname{dist}$ to formalize separable and non-separable cases, including four problem variants that range from exact maximum-margin separation to minimum-width strips containing $k$ outliers. The authors present an efficient semi-online dynamic data structure that maintains whether such a separator exists. They also give an exact algorithm running in $O(nk + n \log n)$ time and a $(1+\varepsilon)$-approximation algorithm running in $O(\varepsilon^{-1/2}((n + k^2) \log n))$ time; the approximation algorithm in turn yields a semi-online data structure that maintains a near-optimal separator. Together, these results provide guaranteed misclassification bounds and efficient updates for robust classification of dynamic point sets in the planar setting.
Abstract
Let $R \cup B$ be a set of $n$ points in $\mathbb{R}^2$, and let $k \in \{1, \dots, n\}$. Our goal is to compute a line that "best" separates the "red" points $R$ from the "blue" points $B$ with at most $k$ outliers. We present an efficient semi-online dynamic data structure that can maintain whether such a separator exists. Furthermore, we present efficient exact and approximation algorithms that compute a linear separator that is guaranteed to misclassify at most $k$ points and minimizes the distance to the farthest outlier. Our exact algorithm runs in $O(nk + n \log n)$ time, and our $(1+\varepsilon)$-approximation algorithm runs in $O(\varepsilon^{-1/2}((n + k^2) \log n))$ time. Based on our $(1+\varepsilon)$-approximation algorithm, we then also obtain a semi-online data structure to maintain such a separator efficiently.
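
To make the objective concrete, the following minimal Python sketch evaluates one fixed candidate line against a red and a blue point set. It reports how many points the line misclassifies and the distance from the line to the farthest misclassified point, the two quantities the abstract refers to. This is not the authors' algorithm: the function name `evaluate_line`, the line representation $ax + by = c$, and the convention that red points belong on the side $ax + by \le c$ are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def evaluate_line(a: float, b: float, c: float,
                  red: List[Point], blue: List[Point]) -> Tuple[int, float]:
    """Evaluate the candidate separator a*x + b*y = c (hypothetical helper).

    Assumed convention: red points should satisfy a*x + b*y <= c and
    blue points should satisfy a*x + b*y >= c. Points on the line are
    treated as correctly classified.

    Returns (number of misclassified points,
             distance from the line to the farthest misclassified point).
    """
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("(a, b) must not be the zero vector")

    outliers = 0
    farthest = 0.0

    for x, y in red:
        signed = (a * x + b * y - c) / norm  # positive means wrong side for red
        if signed > 0:
            outliers += 1
            farthest = max(farthest, signed)

    for x, y in blue:
        signed = (a * x + b * y - c) / norm  # negative means wrong side for blue
        if signed < 0:
            outliers += 1
            farthest = max(farthest, -signed)

    return outliers, farthest
```

A line is feasible for a given $k$ when the first returned value is at most $k$; among feasible lines, the problem asks for one that minimizes the second value. Evaluating a single line this way takes $O(n)$ time, so the sketch only illustrates the objective and says nothing about how the paper searches over candidate lines.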
