Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs

Kairun Zhang; Haoyu Li; Yanjun Zhao; Yifan Sun; Huan Zhang

TL;DR

This paper tackles the memory demands of fine-tuning large language models (LLMs) with a learning-to-learn (L2L) zeroth-order optimizer. It introduces ZO Fine-tuner, which learns adaptive, per-block perturbation variances to guide gradient-free updates, exploiting the block-diagonal Hessian structure of LLMs. The optimizer is trained once on a base model and then reused on that model's derivatives and on diverse downstream tasks. It outperforms prior zeroth-order baselines on 82.1% of 28 task-model pairs (4 LLMs × 7 datasets), with an average accuracy improvement of 2.5% and minimal overhead. The work offers a practical path toward memory-efficient fine-tuning at foundation-model scale by combining L2L with compact perturbation learning.

Abstract

Zeroth-order optimizers have recently emerged as a practical approach for fine-tuning large language models (LLMs), significantly reducing GPU memory consumption compared to traditional first-order methods. Yet, existing zeroth-order methods rely on hand-crafted, static sampling strategies that are not adaptable to model-specific structures. To address this, we propose ZO Fine-tuner, a learning-based zeroth-order optimizer for LLMs that automatically learns efficient perturbation strategies through a compact and memory-efficient design. Crucially, our approach is motivated by the observation that only a small number of foundation models and their derivatives are widely adopted in practice. Therefore, learning the optimizer once for a given LLM and reusing it across diverse downstream tasks is both feasible and highly desirable. Accordingly, ZO Fine-tuner is designed to scale learning to learn (L2L) to the foundation-model era by supporting one-time training per LLM with minimal overhead. Experiments on 4 LLMs and 7 datasets show that ZO Fine-tuner outperforms prior zeroth-order baselines in 82.1% of task-model combinations, thereby demonstrating strong performance and scalability for efficient LLM fine-tuning. Our code is available at https://github.com/ASTRAL-Group/ZO_Fine_tuner.git.
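
To make the idea of a zeroth-order update with per-block perturbation scales concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it uses a generic two-point finite-difference estimator, and the names `zo_step`, `quadratic_loss`, and `run_demo`, the toy two-block quadratic, and the hand-picked `sigmas` are all assumptions made for illustration. In ZO Fine-tuner the per-block variances are learned; here they are fixed constants.

```python
import numpy as np


def zo_step(params, loss_fn, sigmas, lr=0.05, eps=1e-3, rng=None):
    """One two-point zeroth-order update with a per-block perturbation scale.

    params: dict mapping block name -> ndarray of parameters
    sigmas: dict mapping block name -> perturbation standard deviation
            (hypothetical fixed values; the paper learns these)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample one random direction per block, scaled by that block's sigma.
    z = {k: sigmas[k] * rng.standard_normal(v.shape) for k, v in params.items()}
    plus = {k: v + eps * z[k] for k, v in params.items()}
    minus = {k: v - eps * z[k] for k, v in params.items()}
    # Two forward passes give a finite-difference estimate of the
    # directional derivative along z; no backpropagation is needed.
    g = (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps)
    # Move against the estimated directional derivative along z.
    return {k: v - lr * g * z[k] for k, v in params.items()}


def quadratic_loss(params):
    """Toy loss with a block-diagonal Hessian: curvature 10 for 'a', 0.1 for 'b'."""
    return 0.5 * (10.0 * np.sum(params["a"] ** 2) + 0.1 * np.sum(params["b"] ** 2))


def run_demo(steps=300, seed=0):
    """Run the sketch on the toy loss and return (initial_loss, final_loss)."""
    rng = np.random.default_rng(seed)
    params = {"a": np.ones(4), "b": np.ones(4)}
    # Assumed choice: smaller perturbations for the high-curvature block,
    # larger ones for the flat block (sigma proportional to 1/sqrt(curvature)).
    sigmas = {"a": 10.0 ** -0.5, "b": 0.1 ** -0.5}
    initial = quadratic_loss(params)
    for _ in range(steps):
        params = zo_step(params, quadratic_loss, sigmas, rng=rng)
    return initial, quadratic_loss(params)


if __name__ == "__main__":
    initial, final = run_demo()
    print(f"initial loss: {initial:.4f}, final loss: {final:.3e}")
```

Each step needs only two loss evaluations and stores no gradients or activations for backpropagation, which is the source of the memory savings the abstract refers to. Scaling each block's perturbation to its curvature is one plausible way to use a block-diagonal Hessian structure; how ZO Fine-tuner actually parameterizes and learns the variances is not specified in this section.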
