Plaintext Structure Vulnerability: Robust Cipher Identification via a Distributional Randomness Fingerprint Feature Extractor
Xiwen Ren, Min Luo, Cong Peng, Debiao He
TL;DR
The paper tackles ciphertext-only cipher identification under varying plaintext structures, a scenario in which traditional models overfit to plaintext statistics. It proposes a robust distributional randomness fingerprint: the outputs of a fixed suite of randomness tests are converted into calibrated, windowed distributional features (histograms and moments), which are assembled into fingerprints for standard classifiers. Empirical results show high discriminative performance on the Canterbury Corpus and strong cross-domain robustness, with minimal degradation even as plaintext structure shifts. The paper also gives a mechanistic explanation through distributional separability and divergence analyses. The work identifies plaintext variation as a critical factor in security research, contributes a model-agnostic feature representation, and outlines directions for future work and broader application. Overall, it argues that robustness to plaintext variation should be a primary evaluation criterion for ciphertext-based identification systems.
Abstract
Modern encryption algorithms form the foundation of digital security. Their widespread use, however, makes it difficult for network defenders to identify which specific algorithm is being employed. More importantly, we find that when the plaintext distribution of the test data departs from that of the training data, classifier performance often declines significantly. This exposes a hidden dependency of the feature extractor on plaintext features. To reduce this dependency, we adopt a method that does not learn end-to-end from ciphertext bytes. Specifically, the method applies a set of statistical tests to compute randomness features of the ciphertext, and then uses the frequency distribution of these features to construct a fingerprint for each algorithm. Experimental results show that our method achieves high discriminative performance (e.g., AUC > 0.98) on the Canterbury Corpus, which contains a diverse set of data types. Furthermore, in our cross-domain evaluation, the performance of baseline models degrades significantly when they are tested on data with a reduced proportion of structured plaintext. In sharp contrast, our method is highly robust: performance degradation is minimal when transferring between different structured domains, and even on the most challenging, purely random dataset it maintains strong ranking ability (AUC > 0.90).
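To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of a windowed randomness fingerprint. It is an illustration under stated assumptions, not the authors' implementation: the two tests used here (a monobit frequency test and a normalized byte entropy) merely stand in for the paper's test suite, which the abstract does not enumerate, and the function names, the 256-byte window, and the 10-bin histogram are hypothetical choices. Any calibration step is omitted.

```python
import math
from collections import Counter


def monobit_pvalue(window: bytes) -> float:
    """p-value of the frequency (monobit) test on one window (assumed test)."""
    n_bits = len(window) * 8
    ones = sum(bin(b).count("1") for b in window)
    # Normalized absolute difference between the counts of ones and zeros.
    s_obs = abs(2 * ones - n_bits) / math.sqrt(n_bits)
    return math.erfc(s_obs / math.sqrt(2))


def byte_entropy(window: bytes) -> float:
    """Shannon entropy of the byte values, scaled to [0, 1] (assumed test)."""
    n = len(window)
    counts = Counter(window)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h / 8.0  # 8 bits is the maximum entropy of a byte


def histogram(values, bins):
    """Relative-frequency histogram of values in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]


def moments(values):
    """Mean and population variance of the per-window scores."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [mean, var]


def fingerprint(ciphertext: bytes, window_size: int = 256, bins: int = 10):
    """Concatenate histogram and moment features of each test over all windows."""
    windows = [
        ciphertext[i:i + window_size]
        for i in range(0, len(ciphertext) - window_size + 1, window_size)
    ]
    if not windows:
        raise ValueError("ciphertext is shorter than one window")
    features = []
    for test in (monobit_pvalue, byte_entropy):
        scores = [test(w) for w in windows]
        features.extend(histogram(scores, bins))
        features.extend(moments(scores))
    return features
```

Each ciphertext sample yields a fixed-length vector (here 2 tests × (10 bins + 2 moments) = 24 values) that can be passed to any standard classifier. The point of the design is that the classifier sees only the distribution of test scores across windows rather than the raw ciphertext bytes.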
