FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs
Zhiyuan Wei, Jing Sun, Zijian Zhang, Xianhao Zhang, Zhe Hou
TL;DR
This work addresses efficient, privacy-preserving smart contract auditing by training compact, fine-tuned LLMs. It introduces HKT-SmartAudit, a multi-stage distillation framework that transfers knowledge from large teacher models to smaller student models through direct-labeling, expansion, expert, and self-knowledge distillation, complemented by reinforcement learning with a reward model. The resulting HKT-vul and HKT-mix model variants outperform traditional static analysis tools and larger models on both standard and real-world vulnerability datasets, while reducing inference outputs and maintaining robustness. The framework shows practical potential for secure, scalable smart contract auditing and can be extended to other domain-specific tasks that require LLM-based analysis.
Abstract
The rapid growth of blockchain technology has driven the widespread adoption of smart contracts. However, their inherent vulnerabilities have led to significant financial losses. Traditional auditing methods, while essential, struggle to keep pace with the increasing complexity and scale of smart contracts. Large Language Models (LLMs) offer promising capabilities for automating vulnerability detection, but their adoption is often limited by high computational costs. Although prior work has explored leveraging large models through agents or workflows, relatively little attention has been given to improving the performance of smaller, fine-tuned models, which is critical for achieving both efficiency and data privacy. In this paper, we introduce HKT-SmartAudit, a framework for developing lightweight models optimized for smart contract auditing. It features a multi-stage knowledge distillation pipeline that integrates classical distillation, external domain knowledge, and reward-guided learning to transfer high-quality insights from large teacher models. A single-task learning strategy is used to train compact student models that maintain high accuracy and robustness while significantly reducing computational overhead. Experimental results show that our distilled models outperform both commercial tools and larger models in detecting complex vulnerabilities and logical flaws, offering a practical, secure, and scalable solution for smart contract auditing. The source code is available in a GitHub repository.
