Cryptoscope: Analyzing cryptographic usages in modern software

Micha Moffie, Omer Boehm, Anatoly Koyfman, Eyal Bin, Efrayim Sztokman, Sukanta Bhattacharjee, Meghnath Saha, James McGugan

TL;DR

Cryptoscope tackles the challenge of identifying and modernizing cryptographic usage across large, multilingual code bases in light of quantum threats. It introduces a static-analysis pipeline that builds a language-agnostic, CBOM-aligned inventory of cryptographic assets by tracing cryptographic parameters through control and data flow, anchored in a flexible knowledge base. The approach achieves high asset-discovery accuracy ($92\%$ exact, $98\%$ including partial) and strong vulnerability-detection performance (it detects all crypto-related issues in CamBench), while maintaining practical offline runtimes (~$1650$ lines/s). This work demonstrates a scalable, language-independent framework for identifying cryptographic usage and weaknesses, enabling effective remediation prioritization and policy enforcement in real-world software ecosystems.

Abstract

The advent of quantum computing poses a significant challenge because it has the potential to break certain cryptographic algorithms, necessitating a proactive approach to identifying and modernizing cryptographic code. Identifying these cryptographic elements in existing code is only the first step. It is crucial not only to identify quantum-vulnerable algorithms but also to detect vulnerabilities and incorrect cryptographic usage, in order to prioritize, report, monitor, remediate, and modernize code bases. A U.S. government memorandum requires agencies to begin their transition to PQC (Post-Quantum Cryptography) by conducting a prioritized inventory of cryptographic systems, including software and hardware systems. In this paper we describe our code scanning tool, Cryptoscope, which leverages cryptographic domain knowledge as well as compiler techniques to statically parse and analyze source code. By analyzing control and data flow, the tool is able to build an extensible and queryable inventory of cryptography. Cryptoscope goes beyond identifying disconnected cryptographic APIs and instead provides the user with an inventory of cryptographic assets containing comprehensive views of the cryptographic operations implemented. We show that for more than 92% of our test cases, these views include the cryptographic operation itself, its APIs, and related material such as keys, nonces, and random sources. Lastly, building on this inventory, our tool is able to detect and report all the cryptography-related weaknesses and vulnerabilities (11 out of 15) in CamBench, achieving state-of-the-art performance.

Paper Structure

This paper contains 17 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Illustration of the expected input and output, to clarify the challenge: first, identify the implementation of cryptographic operations in source code across different programming languages in a generic manner; second, represent the complete operational semantics in a unified way.
  • Figure 2: The use cases that are supported by a complete and unified organizational cryptographic inventory.
  • Figure 3: Crypto-asset discovery. The four stages of the static analysis pipeline: parsing the code, analyzing control and data flow, building slices based on crypto relevant criteria and finally constructing crypto assets for each slice.
  • Figure 4: Vulnerability identification flow.
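
The four stages named in the Figure 3 caption (parsing, control and data flow analysis, slicing on crypto-relevant criteria, and asset construction) can be illustrated with a toy sketch. This is not Cryptoscope's implementation: it handles Python only, uses the standard `ast` module, ignores control flow, and tracks data flow in a flow-insensitive way (the last assignment to each simple variable). The `KNOWLEDGE_BASE` table, the `CryptoAsset` class, and the `discover_assets` function are hypothetical names introduced here for illustration.

```python
import ast
from dataclasses import dataclass, field

# Hypothetical knowledge base (an assumption for this sketch):
# API name -> (asset kind, crypto-relevant positional parameter names).
KNOWLEDGE_BASE = {
    "AES.new": ("block-cipher", ["key", "mode"]),
}


@dataclass
class CryptoAsset:
    """Toy stand-in for one inventory entry."""
    api: str
    kind: str
    line: int
    parameters: dict = field(default_factory=dict)


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def discover_assets(source):
    # Stage 1: parse the code.
    tree = ast.parse(source)

    # Stage 2 (greatly simplified): record the defining expression of each
    # simple variable. Control flow and ordering are ignored here.
    definitions = {}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            definitions[node.targets[0].id] = node.value

    # Stage 3: a backward "slice" from a crypto-relevant argument,
    # following variable definitions until a non-variable expression.
    def resolve(expr):
        seen = set()
        while (isinstance(expr, ast.Name) and expr.id in definitions
               and expr.id not in seen):
            seen.add(expr.id)
            expr = definitions[expr.id]
        return ast.unparse(expr)  # requires Python 3.9+

    # Stage 4: construct one asset per known crypto API call.
    assets = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        api = dotted_name(node.func)
        if api not in KNOWLEDGE_BASE:
            continue
        kind, param_names = KNOWLEDGE_BASE[api]
        asset = CryptoAsset(api, kind, node.lineno)
        for name, arg in zip(param_names, node.args):
            asset.parameters[name] = resolve(arg)
        for kw in node.keywords:
            if kw.arg in param_names:
                asset.parameters[kw.arg] = resolve(kw.value)
        assets.append(asset)
    return assets


# The sample is only parsed, never executed, so no crypto library is needed.
SAMPLE = '''
import os
from Crypto.Cipher import AES
key = os.urandom(16)
k = key
cipher = AES.new(k, AES.MODE_ECB)
'''

assets = discover_assets(SAMPLE)
for asset in assets:
    print(asset)
```

In this sketch the key argument `k` is traced back through `key` to its origin, `os.urandom(16)`, so the resulting asset links the operation to its key source and its mode rather than reporting an isolated API call. A check built on such an inventory could then, for example, flag the ECB mode recorded in the asset's parameters.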