Reliable Hierarchical Operating System Fingerprinting via Conformal Prediction
Rubén Pérez-Jove, Osvaldo Simeone, Alejandro Pazos, Jose Vázquez-Naya
TL;DR
This work tackles uncertainty in OS fingerprinting by embedding Conformal Prediction (CP) within a hierarchical OS taxonomy, addressing the open-set and imbalanced nature of network traffic. It evaluates two structured CP strategies: level-wise CP (L-CP), which calibrates each level independently, and projection-based CP (P-CP), which ensures hierarchical consistency by projecting leaf-level prediction sets upward. The results show that both schemes achieve valid coverage but reveal a trade-off: L-CP offers tighter, human-friendly prediction sets with potential taxonomic inconsistencies, while P-CP guarantees nested, coherent prediction sets suitable for automated policy enforcement at the cost of larger coarse-level sets. The findings guide operational deployment and point to avenues for improving efficiency and adapting CP to hierarchical, drift-prone security environments.
Abstract
Operating System (OS) fingerprinting is critical for network security, but conventional methods do not provide formal uncertainty quantification mechanisms. Conformal Prediction (CP) could be directly wrapped around existing methods to obtain prediction sets with guaranteed coverage. However, a direct application of CP would treat OS identification as a flat classification problem, ignoring the natural taxonomic structure of OSs and providing brittle point predictions. This work addresses these limitations by introducing and evaluating two distinct structured CP strategies: level-wise CP (L-CP), which calibrates each hierarchy level independently, and projection-based CP (P-CP), which ensures structural consistency by projecting leaf-level sets upwards. Our results demonstrate that, while both methods satisfy validity guarantees, they expose a fundamental trade-off between level-wise efficiency and structural consistency. L-CP yields tighter prediction sets suitable for human forensic analysis but suffers from taxonomic inconsistencies. Conversely, P-CP guarantees hierarchically consistent, nested sets ideal for automated policy enforcement, albeit at the cost of reduced efficiency at coarser levels.
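The difference between the two strategies can be illustrated with a minimal split-conformal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-level taxonomy in `LEAF_TO_FAMILY`, the nonconformity score (one minus the predicted probability of a class), the choice to obtain family-level probabilities by summing leaf probabilities, and the threshold values in the usage lines.

```python
import math
import numpy as np

# Hypothetical two-level taxonomy: leaf index -> family index
# (e.g. two leaves in family 0, two in family 1, one in family 2).
LEAF_TO_FAMILY = np.array([0, 0, 1, 1, 2])
N_FAMILIES = 3


def conformal_threshold(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every class.
        return np.inf
    return float(np.sort(scores)[k - 1])


def family_probs(leaf_probs):
    """Assumed aggregation: sum leaf probabilities within each family."""
    return np.stack(
        [leaf_probs[..., LEAF_TO_FAMILY == f].sum(axis=-1) for f in range(N_FAMILIES)],
        axis=-1,
    )


def calibrate(probs, labels, alpha):
    """Threshold from calibration data, using the assumed score 1 - p(true class)."""
    scores = 1.0 - probs[np.arange(len(labels)), labels]
    return conformal_threshold(scores, alpha)


def predict_set(probs_row, threshold):
    """All classes whose score 1 - p does not exceed the threshold."""
    return set(np.flatnonzero(1.0 - probs_row <= threshold).tolist())


def l_cp(leaf_row, q_leaf, q_family):
    """Level-wise CP: each level uses its own independently calibrated threshold."""
    return predict_set(leaf_row, q_leaf), predict_set(family_probs(leaf_row), q_family)


def p_cp(leaf_row, q_leaf):
    """Projection-based CP: the family set is the set of parents of the leaf set."""
    leaf_set = predict_set(leaf_row, q_leaf)
    return leaf_set, {int(LEAF_TO_FAMILY[i]) for i in leaf_set}


# Usage with hypothetical thresholds (in practice they come from calibrate()).
row = np.array([0.45, 0.0, 0.28, 0.27, 0.0])
print(l_cp(row, q_leaf=0.6, q_family=0.5))  # leaf set and family set may disagree
print(p_cp(row, q_leaf=0.6))                # family set always contains the leaf parents
```

In this toy input, the level-wise family set is computed without reference to the leaf set, so a retained leaf can have a parent that the family-level set excludes. The projected family set is consistent by construction, but it grows whenever the leaf set spans several families, which is the efficiency cost at coarser levels that the abstract describes.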
