Porting an LLM based Application from ChatGPT to an On-Premise Environment

Teemu Paloniemi; Manu Setälä; Tommi Mikkonen

Porting an LLM based Application from ChatGPT to an On-Premise Environment

Teemu Paloniemi, Manu Setälä, Tommi Mikkonen

TL;DR

The study investigates porting an LLM-based procurement assistant (AIPA) from ChatGPT to an on-premise environment to address privacy, security, customization, and cost concerns under EU regulations. It adopts a case study and Design Science Research to outline a three-step porting process (preparation, implementation, deployment/evaluation), including code changes, hardware choices, and model selection; it implements a local API via llama.cpp and applies LoRA-based fine-tuning, with server-side operation to mitigate data leakage. The findings indicate that on-prem porting is feasible, though performance may differ from the cloud baseline and may require additional training or larger models; the approach offers improved data localization and cost control, enabling industry adoption. Limitations include the single-case scope and lack of extensive long-term evaluation, prompting call for broader porting studies, concept-drift analysis, and API-level interoperability to replace LLMs with minimal code changes.

Abstract

Given the data-intensive nature of Machine Learning (ML) systems in general, and Large Language Models (LLM) in particular, using them in cloud based environments can become a challenge due to legislation related to privacy and security of data. Taking such aspects into consideration implies porting the LLMs to an on-premise environment, where privacy and security can be controlled. In this paper, we study this porting process of a real-life application using ChatGPT, which runs in a public cloud, to an on-premise environment. The application being ported is AIPA, a system that leverages Large Language Models (LLMs) and sophisticated data analytics to enhance the assessment of procurement call bids. The main considerations in the porting process include transparency of open source models and cost of hardware, which are central design choices of the on-premise environment. In addition to presenting the porting process, we evaluate downsides and benefits associated with porting.

Porting an LLM based Application from ChatGPT to an On-Premise Environment

TL;DR

Abstract

Porting an LLM based Application from ChatGPT to an On-Premise Environment

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (3)