Reliable Hierarchical Operating System Fingerprinting via Conformal Prediction

Rubén Pérez-Jove, Osvaldo Simeone, Alejandro Pazos, Jose Vázquez-Naya

TL;DR

This work tackles uncertainty in OS fingerprinting by embedding Conformal Prediction (CP) within a hierarchical OS taxonomy, addressing the open-set and class-imbalanced nature of network traffic. It evaluates two structured CP strategies: level-wise CP (L-CP), which calibrates each level independently, and projection-based CP (P-CP), which ensures hierarchical consistency by projecting leaf-level decisions upward. The results show that both schemes attain valid coverage but reveal a trade-off: L-CP offers tighter, human-friendly prediction sets that may be taxonomically inconsistent, while P-CP guarantees nested, coherent predictions suitable for automated policy enforcement at the cost of larger coarse-level sets. The findings guide operational deployment and point to avenues for improving efficiency and adapting CP to hierarchical, drift-prone security environments.

Abstract

Operating System (OS) fingerprinting is critical for network security, but conventional methods do not provide formal uncertainty quantification mechanisms. Conformal Prediction (CP) could be directly wrapped around existing methods to obtain prediction sets with guaranteed coverage. However, a direct application of CP would treat OS identification as a flat classification problem, ignoring the natural taxonomic structure of OSs and providing brittle point predictions. This work addresses these limitations by introducing and evaluating two distinct structured CP strategies: level-wise CP (L-CP), which calibrates each hierarchy level independently, and projection-based CP (P-CP), which ensures structural consistency by projecting leaf-level sets upwards. Our results demonstrate that, while both methods satisfy validity guarantees, they expose a fundamental trade-off between level-wise efficiency and structural consistency. L-CP yields tighter prediction sets suitable for human forensic analysis but suffers from taxonomic inconsistencies. Conversely, P-CP guarantees hierarchically consistent, nested sets ideal for automated policy enforcement, albeit at the cost of reduced efficiency at coarser levels.
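
As a rough illustration of the mechanics described in the abstract, the following Python sketch builds a leaf-level prediction set with split conformal prediction and then projects it upward, in the spirit of P-CP. The toy taxonomy, the nonconformity score (one minus the softmax probability), the calibration numbers, and all function names are assumptions made for illustration; none of them are taken from the paper. L-CP would instead repeat the thresholding step once per hierarchy level, each with its own classifier and calibration scores.

```python
import math

# Hypothetical toy taxonomy: leaf (minor version) -> (family, major version).
ANCESTORS = {
    "Windows 10 22H2": ("Windows", "Windows 10"),
    "Windows 11 23H2": ("Windows", "Windows 11"),
    "Ubuntu 22.04": ("Linux", "Ubuntu 22"),
}

def conformal_threshold(cal_scores, alpha):
    """Split-CP quantile: the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: keep every label
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, qhat):
    """Keep every label whose assumed score 1 - p does not exceed qhat."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}

def project_up(leaf_set):
    """P-CP style projection: coarser sets are the ancestors of retained leaves."""
    families = {ANCESTORS[leaf][0] for leaf in leaf_set}
    majors = {ANCESTORS[leaf][1] for leaf in leaf_set}
    return families, majors

# Calibration scores 1 - p(true label) on held-out data (made-up numbers).
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.70]
qhat = conformal_threshold(cal_scores, alpha=0.25)

# Softmax output for one test flow (made-up numbers).
leaf_probs = {"Windows 10 22H2": 0.42, "Windows 11 23H2": 0.50, "Ubuntu 22.04": 0.08}
leaf_set = prediction_set(leaf_probs, qhat)
families, majors = project_up(leaf_set)
```

In this toy run the leaf set keeps both Windows versions and drops Ubuntu, so the projected family set contains only Windows. Because every coarse label is derived from a retained leaf, the sets are nested by construction, which is the consistency property the abstract attributes to P-CP.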

Paper Structure

This paper contains 37 sections, 22 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Illustrative example of a hierarchical label tree in OS fingerprinting. This example adopts three levels (OS family $\rightarrow$ major version $\rightarrow$ minor version) for common current desktop, server and mobile OSs (e.g., Windows, Linux distributions, macOS, Android, iOS). The structure is not canonical but a concrete, non-exhaustive instance aligned with the general $K$-level formulation used in this work. The diagram also shows that not all OS families expose all three levels; some branches terminate early (e.g., missing minor versions), so the hierarchy is typically ragged.
  • Figure 2: Comparison of traditional OS fingerprinting and OS fingerprinting with Conformal Prediction (CP). The top section illustrates the conventional approach, where a single point prediction (Windows 11) is made based on the highest softmax probability, which may be incorrect. The bottom section demonstrates how CP generates a prediction set containing multiple plausible OS labels, including the true class (Windows 10) even when it is not the highest probability output. This set-valued approach provides calibrated uncertainty quantification, enabling risk-aware decision-making in network security applications.
  • Figure 3: Diagram illustrating the level-wise CP (L-CP) method for hierarchical OS fingerprinting. The system processes network traffic through feature engineering and applies separate MLP classifiers at each hierarchy level (family, major version, and minor version). Each classifier independently generates softmax outputs, which are then processed using Conformal Prediction to produce level-specific prediction sets. The figure shows how prediction sets are constructed independently at each level, with separate coverage guarantees, but without enforcing hierarchical consistency across levels.
  • Figure 4: Diagram illustrating the projection-based CP (P-CP) method for hierarchical OS fingerprinting. The system applies CP only at the leaf (minor version) level, generating a prediction set based on the MLP classifier's softmax outputs. The leaf-level prediction set is then projected upwards to coarser levels (major version and family) by including all ancestors of any retained leaf. This upward projection ensures hierarchical consistency, as shown by the nested structure where predictions at coarser levels are derived from the leaf-level set, guaranteeing that every predicted class corresponds to a valid path in the OS taxonomy.
  • Figure 5: Visual illustration of the two types of hierarchical violations that contribute to the Hierarchical Inconsistency Rate (HIR). The left diagram demonstrates an orphan violation, where a child node (Windows 11 - 23H2) is predicted without its parent (Windows 11) being in the prediction set. The right diagram shows a sterile violation, where a parent node (Windows 11) is predicted but none of its children are included in the prediction set. These violations indicate structural inconsistencies in hierarchical predictions that can undermine the reliability of automated decision-making systems.
  • ...and 5 more figures
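
The orphan and sterile violations described in the Figure 5 caption can be checked mechanically. The sketch below is a minimal Python illustration: the parent map, the function names, and the choice to report the fraction of samples with at least one violation are all assumptions, since the exact definition of HIR is not given in this excerpt.

```python
# Hypothetical parent map: child label -> parent label.
PARENT = {
    "Windows 10": "Windows",
    "Windows 11": "Windows",
    "Windows 10 - 22H2": "Windows 10",
    "Windows 11 - 23H2": "Windows 11",
}

def violations(level_sets):
    """Return (orphans, steriles) for one sample's per-level prediction sets.

    level_sets is ordered coarse to fine, e.g. [major versions, minor versions].
    """
    orphans, steriles = set(), set()
    for coarse, fine in zip(level_sets, level_sets[1:]):
        # Orphan: a child is predicted but its parent is missing one level up.
        orphans |= {c for c in fine if PARENT.get(c) not in coarse}
        # Sterile: a parent with children in the taxonomy keeps none of them.
        for p in coarse:
            children = {c for c, par in PARENT.items() if par == p}
            if children and not (children & fine):
                steriles.add(p)
    return orphans, steriles

def inconsistency_rate(samples):
    """Fraction of samples with at least one violation (assumed definition)."""
    flagged = sum(1 for s in samples if any(violations(s)))
    return flagged / len(samples)

# Three made-up samples, each given as [major-version set, minor-version set].
orphan_case = [{"Windows 10"}, {"Windows 10 - 22H2", "Windows 11 - 23H2"}]
sterile_case = [{"Windows 10", "Windows 11"}, {"Windows 10 - 22H2"}]
consistent_case = [{"Windows 11"}, {"Windows 11 - 23H2"}]

rate = inconsistency_rate([orphan_case, sterile_case, consistent_case])
```

Labels with no children in the parent map are never flagged as sterile, which keeps the check usable on a ragged hierarchy where some branches terminate early. Sets produced by upward projection from the leaves pass this check by construction, whereas independently calibrated level-wise sets may not.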