Resource-Efficient LLM Application for Structured Transformation of Unstructured Financial Contracts
Maruf Ahmed Mridul, Oshani Seneviratne
TL;DR
This work tackles the challenge of converting unstructured Credit Support Annexes (CSAs) into Common Domain Model (CDM) representations by extending CDMizer, a template-driven framework that ensures syntactic correctness and adherence to the CDM schema. It evaluates a resource-efficient approach using the open-source Qwen3-30B LLM, with and without Retrieval-Augmented Generation (RAG), on the ISDA benchmark for CSA clause extraction. The results indicate that CDMizer achieves competitive accuracy and strong syntactic correctness while using substantially fewer resources than larger, proprietary models. The findings support a scalable, privacy-preserving pathway for institutional contract digitization, especially for organizations with strict data privacy requirements or limited compute budgets.
Abstract
The transformation of unstructured legal contracts into standardized, machine-readable formats is essential for automating financial workflows. The Common Domain Model (CDM) provides a standardized framework for this purpose, but converting complex legal documents such as Credit Support Annexes (CSAs) into CDM representations remains a significant challenge. In this paper, we present an extension of the CDMizer framework, a template-driven solution that ensures syntactic correctness and adherence to the CDM schema during contract-to-CDM conversion. We apply this extended framework to a real-world task, comparing its performance against a benchmark developed by the International Swaps and Derivatives Association (ISDA) for CSA clause extraction. Our results show that CDMizer, when integrated with a significantly smaller, open-source Large Language Model (LLM), achieves accuracy and efficiency competitive with larger, proprietary models. This work underscores the potential of resource-efficient solutions to automate legal contract transformation, offering a cost-effective and scalable approach for financial institutions with constrained resources or strict data privacy requirements.
