ParsEval: Evaluation of Parsing Behavior using Real-world Out-in-the-wild X.509 Certificates
Stefan Tatschner, Sebastian N. Peters, Michael P. Heinl, Tobias Specht, Thomas Newe
TL;DR
The paper examines user-visible and security-related differences among X.509 certificate parsers by running real-world certificates through six production-ready libraries and comparing results. Using the 11.6 TiB Censys dataset, it applies a large-scale, differential parsing approach and an explicit error taxonomy to quantify performance, error rates, and failure modes. Key findings reveal pronounced disparities, with wolfSSL and Mbed TLS exhibiting about 13% parsing failures, OpenSSL showing near-zero failures, and diverse error patterns across implementations, including notable regressions in wolfSSL. The work highlights the need for standardized, automated testing to improve interoperability and security in the heterogeneous X.509 ecosystem, and provides a foundation for future investigations into the security implications of parser differences.
Abstract
X.509 certificates play a crucial role in establishing secure communication over the internet by enabling authentication and data integrity. Equipped with a rich feature set, the X.509 standard is defined by multiple, comprehensive ISO/IEC documents. Due to its internet-wide usage, there are different implementations in multiple programming languages, leading to a large and fragmented ecosystem. This work addresses the research question: "Are there user-visible and security-related differences between X.509 certificate parsers?" Relevant libraries offering APIs for parsing X.509 certificates were investigated and an appropriate test suite was developed. Of 34 libraries, 6 were chosen for further analysis. The X.509 parsing modules of the chosen libraries were called with 186,576,846 different certificates from a real-world dataset, and the observed error codes were investigated. This study reveals an anomaly in wolfSSL's X.509 parsing module and shows that there are fundamental differences in the ecosystem. While related studies mostly focus on fuzzing techniques that produce artificial certificates, this study confirms that available X.509 parsing modules differ substantially and yield different results, even for real-world out-in-the-wild certificates.
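The differential approach described above can be illustrated with a minimal sketch. It is not the authors' test suite: the two parsers are toy stand-ins (hypothetical functions that only inspect the first bytes of the input), and the error labels are invented for illustration. The sketch only shows the general idea of feeding the same inputs to several parsers, tallying error codes per parser, and flagging inputs on which the parsers disagree.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Assumed parser interface (hypothetical): return None on success,
# or a short error label on failure.
Parser = Callable[[bytes], Optional[str]]


def toy_strict(der: bytes) -> Optional[str]:
    """Toy stand-in for a strict parser; not a real X.509 parser."""
    if not der or der[0] != 0x30:  # a DER certificate starts with a SEQUENCE tag
        return "not_a_sequence"
    if len(der) < 4:
        return "truncated"
    return None


def toy_lenient(der: bytes) -> Optional[str]:
    """Toy stand-in for a lenient parser that only checks the outer tag."""
    return None if der[:1] == b"\x30" else "not_a_sequence"


def run_differential(
    certs: Iterable[bytes], parsers: Dict[str, Parser]
) -> Tuple[Dict[str, Counter], List[int]]:
    """Run every parser on every input and collect per-parser error counts
    plus the indices of inputs where accept/reject verdicts differ."""
    errors: Dict[str, Counter] = {name: Counter() for name in parsers}
    disagreements: List[int] = []
    for idx, der in enumerate(certs):
        results = {name: parse(der) for name, parse in parsers.items()}
        for name, err in results.items():
            if err is not None:
                errors[name][err] += 1
        # More than one distinct verdict means at least one parser accepted
        # the input while another rejected it.
        if len({err is None for err in results.values()}) > 1:
            disagreements.append(idx)
    return errors, disagreements


# Illustrative inputs only; these are not real certificates.
certs = [b"\x30\x03\x02\x01\x00", b"\x30\x00", b"\x31\x00"]
errors, disagreements = run_differential(
    certs, {"strict": toy_strict, "lenient": toy_lenient}
)
```

In a real setup, each entry in `parsers` would wrap one library's certificate-parsing call and map its native error codes onto a shared taxonomy, so that failure rates and failure modes become comparable across implementations.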
