GenAIOps for GenAI Model-Agility
Ken Ueno, Makoto Kogo, Hiromi Kawatsu, Yohsuke Uchiumi, Michiaki Tatsubori
TL;DR
GenAIOps defines a comprehensive framework for GenAI application development and operation that maintains agility when foundation models change. The paper surveys and validates optimization methods (prompt tuning, automatic prompt engineering, and few-shot learning) within an evaluation framework that covers accuracy, safety, and fairness, and discusses pitfalls when combining methods. Empirical results show that prompt tuning can improve performance given enough training data, that automatic prompt generation may not outperform human-written prompts, and that model-specific adaptation remains necessary; few-shot examples can help untuned models but may hurt tuned ones. The work highlights the practical need for CI/CD-ready tooling and model-compatibility strategies to realize GenAIOps in real-world GenAI deployments.
Abstract
AI-agility, the ability of an organization to adapt quickly to its business priorities, is desirable even in the development and operation of generative AI (GenAI) applications. In this paper, we focus on what we call GenAI Model-agility, which we define as the readiness to adapt flexibly to base foundation models that vary across model providers and versions. To handle issues specific to generative AI, we first define a methodology for GenAI application development and operations, GenAIOps, and use it to identify the problem of application quality degradation caused by changes to the underlying foundation models. We then study prompt tuning technologies, which look promising for addressing this problem, and discuss their effectiveness and limitations through case studies using existing tools.
