Utilizing Large Language Models to Translate RFC Protocol Specifications to CPSA Definitions
Martin Duclos, Ivan A. Fernandez, Kaneesha Moore, Sudip Mittal, Edward Zieglar
TL;DR
The paper tackles the lack of formal verification in RFC protocol standards by enabling automated translation of English specifications into CPSA input. It introduces a code-specialized LLM-based RFC Proposal Translator that outputs CPSA-ready $s$-expressions, aided by pre- and post-processing and a targeted fine-tuning dataset. Empirical results using a 34B Code Llama model demonstrate that context from CPSA definitions is crucial for producing usable CPSA inputs, with domain-expert post-editing remaining necessary. The work promises to lower the barrier to formal analysis in protocol design and could improve internet protocol security by enabling earlier and more accessible formal verification.
Abstract
This paper proposes the use of Large Language Models (LLMs) for translating Request for Comments (RFC) protocol specifications into a format compatible with the Cryptographic Protocol Shapes Analyzer (CPSA). This novel approach aims to reduce the complexities and efforts involved in protocol analysis, by offering an automated method for translating protocol specifications into structured models suitable for CPSA. In this paper we discuss the implementation of an RFC Protocol Translator, its impact on enhancing the accessibility of formal methods analysis, and its potential for improving the security of internet protocols.
