Generation of Optimized Solidity Code for Machine Learning Models using LLMs

Nikumbh Sarthak Sham, Sandip Chakraborty, Shamik Sural

TL;DR

Deploying ML on public blockchains is challenging because of EVM constraints such as gas costs and the lack of floating-point support. The authors present LMST, a workflow that uses GPT-4 to translate off-chain PyTorch inference code and trained weights into Solidity, with targeted prompts that reduce gas consumption while preserving accuracy. In experiments on MNIST with a multi-layer perceptron, on-chain accuracy reached 81.1% (versus 86.0% off-chain). Gas measurements showed about 1.34M gas for deployment and about 73.7M gas for weight uploads, with stepwise optimization reducing deployment gas. The work demonstrates the feasibility of verifiable on-chain ML and outlines directions for scaling to more complex architectures and cross-chain deployment.

Abstract

While a plethora of machine learning (ML) models are currently available, along with their implementations on disparate platforms, there is hardly any verifiable ML code that can be executed on public blockchains. We propose a novel approach named LMST that uses Large Language Models (LLMs) to convert the inference path of an ML model, as well as its weights trained off-chain, into Solidity code. Extensive prompt engineering is done to optimize gas cost beyond mere correctness of the produced code, while taking into consideration the capabilities and limitations of the Ethereum Virtual Machine. We have also developed a proof-of-concept decentralized application that uses the generated code to verify the accuracy claims of the underlying ML model. An extensive set of experiments demonstrates the feasibility of deploying ML models on blockchains through automated code translation using LLMs.
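To make the floating-point limitation concrete, the following is a minimal sketch of integer-only inference for one dense layer, written in Python so it can be run off-chain. It is not the code LMST generates: the scale factor `SCALE`, the helper names, and the toy weights are all assumptions chosen for illustration. The idea is that weights and inputs are multiplied by a fixed scale and stored as integers, so that every operation has a direct counterpart in Solidity integer arithmetic.

```python
# Illustrative fixed-point inference for a single dense layer.
# SCALE and all function names are hypothetical, not taken from the paper.
SCALE = 10_000  # assumed fixed-point scale factor


def to_fixed(x: float) -> int:
    """Convert a float to a scaled integer (done off-chain, before upload)."""
    return int(round(x * SCALE))


def div_trunc(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as Solidity does for
    signed integers (Python's // floors instead)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def dense_fixed(inputs, weights, biases):
    """Integer-only dense layer; inputs, weights and biases are scaled by SCALE."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = sum(w * x for w, x in zip(row, inputs))  # scaled by SCALE**2
        outputs.append(div_trunc(acc, SCALE) + bias)   # back to SCALE
    return outputs


def relu(values):
    return [max(0, v) for v in values]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Toy example with made-up weights (not from any trained model).
weights = [[to_fixed(0.5), to_fixed(-0.25)], [to_fixed(0.1), to_fixed(0.2)]]
biases = [to_fixed(0.0), to_fixed(0.1)]
inputs = [to_fixed(1.0), to_fixed(2.0)]

logits = relu(dense_fixed(inputs, weights, biases))
predicted_class = argmax(logits)
```

Rounding during the conversion and rescaling steps loses some precision, which is one plausible source of a gap between off-chain and on-chain accuracy. A larger scale keeps more precision but increases the size of intermediate products.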

Paper Structure

This paper contains 7 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: LMST Overall Process Workflow
  • Figure 2: Prompt Engineering Steps for Optimization Flow in LMST
  • Figure 3: Optimization Step vs Gas Consumed
  • Figure 4: Model Complexity vs Gas Consumed