Generation of Optimized Solidity Code for Machine Learning Models using LLMs
Nikumbh Sarthak Sham, Sandip Chakraborty, Shamik Sural
TL;DR
Deploying ML on public blockchains is challenging because of EVM constraints such as gas costs and the lack of floating-point support. The authors present LMST, a workflow that uses GPT-4 to translate an off-chain PyTorch inference path and its trained weights into Solidity, with targeted prompts that reduce gas consumption while preserving accuracy. In experiments with a multi-layer perceptron on MNIST, on-chain accuracy reached 81.1% (versus 86.0% off-chain); contract deployment cost about 1.34M gas and weight uploads about 73.7M gas, and stepwise optimization reduced the deployment gas. The work demonstrates the feasibility of verifiable on-chain ML and outlines directions for scaling to more complex architectures and cross-chain deployment.
Abstract
While a plethora of machine learning (ML) models are currently available, along with implementations on disparate platforms, there is hardly any verifiable ML code that can be executed on public blockchains. We propose a novel approach named LMST that uses Large Language Models (LLMs) to convert the inference path of an ML model, together with its weights trained off-chain, into Solidity code. Extensive prompt engineering is applied to optimize gas cost beyond mere correctness of the generated code, while taking into account the capabilities and limitations of the Ethereum Virtual Machine. We have also developed a proof-of-concept decentralized application that uses the generated code to verify the accuracy claims of the underlying ML model. An extensive set of experiments demonstrates the feasibility of deploying ML models on blockchains through automated code translation using LLMs.
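
To make the floating-point limitation concrete, the following is a minimal Python sketch of integer-only (fixed-point) inference for a small multi-layer perceptron, the kind of arithmetic a Solidity contract is restricted to. It is an illustration under stated assumptions and not the code LMST generates: the scale factor `SCALE`, the helper names (`to_fixed`, `sdiv`, `dense_fixed`, `relu_fixed`, `predict_fixed`, `quantize_layer`), and the ReLU activation are all hypothetical choices made for this example.

```python
# Assumed fixed-point scale (6 decimal digits); not taken from the paper.
SCALE = 10**6

def to_fixed(x: float) -> int:
    """Convert a float to a scaled integer."""
    return int(round(x * SCALE))

def sdiv(a: int, b: int) -> int:
    """Integer division truncating toward zero, as Solidity does for signed
    integers. Python's // floors instead, so it is emulated here."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def quantize_layer(weights, bias):
    """Scale one dense layer's float weights and biases to integers."""
    return ([[to_fixed(w) for w in row] for row in weights],
            [to_fixed(b) for b in bias])

def dense_fixed(weights, bias, x):
    """Integer-only dense layer: out = (W @ x) / SCALE + b."""
    out = []
    for row, b in zip(weights, bias):
        acc = sum(w * xi for w, xi in zip(row, x))  # carries scale SCALE**2
        out.append(sdiv(acc, SCALE) + b)            # rescale back to SCALE
    return out

def relu_fixed(v):
    """ReLU on scaled integers (hypothetical activation choice)."""
    return [max(0, a) for a in v]

def predict_fixed(layers, x_float):
    """Run all layers with integer arithmetic and return the argmax class."""
    x = [to_fixed(a) for a in x_float]
    for i, (weights, bias) in enumerate(layers):
        x = dense_fixed(weights, bias, x)
        if i < len(layers) - 1:  # no activation after the output layer
            x = relu_fixed(x)
    return max(range(len(x)), key=x.__getitem__)

# Toy two-layer network with made-up weights, for illustration only.
layers = [
    quantize_layer([[0.5, -0.25], [-1.0, 1.0]], [0.1, 0.0]),
    quantize_layer([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
predicted_class = predict_fixed(layers, [1.0, 2.0])
```

Rounding during quantization and truncation after each multiplication both discard precision, which is one plausible contributor to a gap between off-chain and on-chain accuracy; the sketch does not attempt to reproduce the reported numbers.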
