Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies

Benjue Weng

TL;DR

This paper surveys how Transformer-based large language models are fine-tuned across paradigms and tasks, from in-context learning and chain-of-thought to agent-based and retrieval-augmented methods. It provides a comprehensive taxonomy of fine-tuning strategies, emphasizing parameter-efficient approaches such as LoRA, Prefix-, and Adapter-tuning, along with instruction- and alignment-tuning (RLHF, DPO, NLHF). Empirical results on six text-classification benchmarks highlight the competitiveness of PEFT techniques, particularly LoRA, while showing the impact of model size and data regime on performance. The work offers practical guidance for deploying scalable, efficient fine-tuning in industry and research, outlining challenges and future directions in this rapidly evolving landscape.

Abstract

With the surge of ChatGPT, the use of large models has increased significantly, rising rapidly to prominence across the industry and the internet. This paper is a comprehensive review of fine-tuning methods for large models. It investigates the latest technological advancements and the application of advanced methods in task-adaptive fine-tuning, domain-adaptive fine-tuning, few-shot learning, knowledge distillation, multi-task learning, parameter-efficient fine-tuning, and dynamic fine-tuning.

Paper Structure (49 sections, 14 equations, 59 figures, 2 tables, 2 algorithms)

Introduction
Related Work
Evolution of Transformer-Based Models
In-Context Learning and Prompt Engineering
Fine-Tuning in Large Language Models
Transformer Architecture
Attention Mechanism
Encoder and Decoder Blocks
Positional Encoding
Multi-Head Attention
Layer Normalization and Residual Connections
LLMs Paradigm
LLMs family
In-Context Learning (ICL)
Chain-of-Thought Reasoning (CoT)
...and 34 more sections

Figures (59)

Figure 1: Finetuning a pretrained LLM to follow instructions. [FintuingLLM]
Figure 2: The generic teacher-student framework for knowledge distillation. [Gou2020KnowledgeDA]
Figure 3: The Transformer model architecture. [Vaswani2017AttentionIA]
Figure 4: (left) Scaled Dot-Product Attention. (right) Multi-Head Attention consists of several attention layers running in parallel. [Vaswani2017AttentionIA]
Figure 5: Transformer decoder output softmax. [tranformer_understand]
...and 54 more figures
