Mini-Giants: "Small" Language Models and Open Source Win-Win

Zhengping Zhou; Lezhi Li; Xinxi Chen; Andy Li

TL;DR

This article argues that "mini-giants" (open-source, instruction-following LLMs with around 10B parameters or fewer) offer a practical alternative to giant proprietary models because of their adaptability, controllability, and affordability. It surveys reduced-parameter foundation models (Chinchilla, LLaMA) and parameter-efficient fine-tuning methods (Adapter, Prefix, LoRA, QLoRA, ControlNet), and catalogs a spectrum of open-source LMs trained on synthetic or human-curated data (Alpaca, Vicuna, Dolly, Guanaco, Open Assistant). It highlights evaluation challenges and real-world applications that depend on privacy and local computation, illustrated by cognitive behavioral therapy (CBT) applications such as Woebot. The paper concludes that open-source mini-giants can democratize AI access, enabling domain-specific adaptation and safer, governance-friendly deployments.

Abstract

ChatGPT is phenomenal. However, it is prohibitively expensive to train and refine such giant models. Fortunately, small language models are flourishing and becoming increasingly competent. We call them "mini-giants". We argue that open-source communities like Kaggle and mini-giants form a win-win in many ways: technically, ethically, and socially. In this article, we present a brief yet rich background, discuss how to attain small language models, present a comparative study of small language models along with a brief discussion of evaluation methods, discuss the real-world application scenarios where small language models are most needed, and conclude with a discussion and outlook.

Paper Structure (44 sections, 1 equation, 1 figure, 4 tables)

Introduction
A brief yet rich background
The Giants are fast
Language models as experts
Language and functional competence
Augmented LMs with tools
Mini-Giants are coming
Discussions & debates abound
How to make large foundation models "small"
Foundation models with reduced parameters
Chinchilla
LLaMA
Efficient fine-tuning strategies for foundation models
Adapter
Prefix fine-tuning
...and 29 more sections

Figures (1)

Figure 1: An evolution tree of recently released instruction-following small LMs. The color of the text boxes indicates the openness of the license under which the models are released: red stands for proprietary licenses, yellow stands for non-commercial licenses, and green stands for licenses permissive for commercial use.
