Revisiting Non-separable Binary Classification and its Applications in Anomaly Detection
Matthew Lau, Ismaila Seck, Athanasios P Meliopoulos, Wenke Lee, Eugene Ndiaye
TL;DR
The paper revisits XOR, the classic example of a problem that is not linearly separable, through equality separation: a linear rule that labels a point by whether it lies within or outside a margin around a learned hyperplane, and that can be integrated into neural networks via smooth bump activations. It proves that equality separators have roughly twice the VC dimension of standard halfspaces, which is $n+1$ for input dimension $n$ (exactly $2n+1$ for strict separators and between $2n+1$ and $2n+3$ for $\epsilon$-error variants), and introduces closing numbers to quantify a classifier's capacity to form closed decision regions, linking locality to anomaly detection. The authors connect equality separation to existing non-linear classifiers (hyper-ridge/hyper-hill/OVS) and show that, with appropriate inductive bias, it yields robust detection of both seen and unseen anomalies, supported by toy and real-world experiments on cyber-security, medical, and industrial datasets. The framework offers a principled way to balance learning and robust anomaly detection (AD) via margin-focused containment of normal data, with practical implications for designing neural networks and foundation-model pipelines capable of reliable anomaly detection in high-dimensional spaces.
Abstract
The inability to linearly classify XOR has motivated much of deep learning. We revisit this age-old problem and show that linear classification of XOR is indeed possible. Instead of separating data between halfspaces, we propose a slightly different paradigm, equality separation, that adapts the SVM objective to distinguish data within or outside the margin. Our classifier can then be integrated into neural network pipelines with a smooth approximation. From its properties, we intuit that equality separation is suitable for anomaly detection. To formalize this notion, we introduce closing numbers, a quantitative measure on the capacity for classifiers to form closed decision regions for anomaly detection. Springboarding from this theoretical connection between binary classification and anomaly detection, we test our hypothesis on supervised anomaly detection experiments, showing that equality separation can detect both seen and unseen anomalies.
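To make the idea concrete, the following is a minimal sketch of equality separation on XOR, not the authors' implementation. The hyperplane `w`, `b`, the margin half-width `eps`, and the bump width `sigma` are hand-picked assumptions for illustration rather than learned values, and the Gaussian bump is one possible smooth approximation of the margin indicator.

```python
import numpy as np

# XOR inputs and labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Hand-picked hyperplane w.x + b = 0, i.e. the line x1 = x2 (assumption, not learned).
w = np.array([1.0, -1.0])
b = 0.0
eps = 0.5  # assumed half-width of the margin around the hyperplane


def equality_separator(X, w, b, eps):
    """Return 1 for points within eps of the hyperplane, 0 otherwise."""
    return (np.abs(X @ w + b) <= eps).astype(int)


def bump(z, sigma=0.5):
    """Gaussian bump: a smooth, differentiable stand-in for the indicator above."""
    return np.exp(-0.5 * (z / sigma) ** 2)


inside = equality_separator(X, w, b, eps)  # 1 = inside the margin
xor_pred = 1 - inside                      # points off the hyperplane are XOR = 1
soft = bump(X @ w + b)                     # close to 1 inside the margin, small outside

print(xor_pred)
print(soft)
```

The points (0, 0) and (1, 1) lie on the line x1 = x2 and fall inside the margin, while (0, 1) and (1, 0) lie at signed distance ±1 along `w` and fall outside it, so a single linear rule separates the two XOR classes. Which class is placed inside the margin is a modelling choice; for anomaly detection, the text above suggests containing the normal data within the margin.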
