Language Models as a Service: Overview of a New Paradigm and its Challenges
Emanuele La Malfa, Aleksandar Petrov, Simon Frieder, Christoph Weinhuber, Ryan Burnell, Raza Nazar, Anthony G. Cohn, Nigel Shadbolt, Michael Wooldridge
TL;DR
The paper analyzes the Language-Models-as-a-Service (LMaaS) paradigm, identifying four core challenges—accessibility, replicability, reliability, and trustworthiness—that arise from centralized, pay-per-use interfaces and limited model transparency. It surveys licensing landscapes, deployment practices, and evaluation issues, highlighting data and user contamination, non-determinism, and emergent behavior as key reliability and benchmarking obstacles. Through a synthesis of current knowledge and case studies, it offers a tentative, community-driven agenda with concrete recommendations for accessibility, legacy access, benchmarking, data provenance, and robust explainability. The work aims to guide researchers and providers toward LMaaS ecosystems that are more open to audit, reproducible under evolving deployments, and trustworthy in decision-making, including safety-critical contexts.
Abstract
Some of the most powerful language models today are proprietary systems, accessible only via (typically restrictive) web or software programming interfaces. This is the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluation, benchmarking, and testing. This paper has two goals. First, we delineate how these challenges impede the accessibility, replicability, reliability, and trustworthiness of LMaaS. For each of these four aspects, we systematically examine the issues that arise from a lack of information about the underlying language models. We analyze existing solutions in detail, put forth a number of recommendations, and highlight directions for future work. Second, the paper serves as a comprehensive resource for existing knowledge on current, major LMaaS, offering a synthesized overview of their licences and of the capabilities their interfaces offer.
