CPT: Consistent Proxy Tuning for Black-box Optimization

Yuanyang He; Zitong Huang; Xinxing Xu; Rick Siow Mong Goh; Salman Khan; Wangmeng Zuo; Yong Liu; Chun-Mei Feng

TL;DR

CPT addresses the mismatch between the training objective and the test-time inference form in Proxy-tuning for black-box models. It does so by using a frozen large black-box model and a frozen small white-box proxy while training a tunable white-box proxy. The method introduces a logit-offset scheme with a train-time factor $\alpha_{train}$ and a test-time factor $\alpha_{test}$; the training objective and the inference form coincide when $\alpha_{train}=\alpha_{test}$. Empirically, CPT improves mean accuracy over Proxy-tuning on both LLM and VLM tasks across multiple datasets and model scales, and it remains robust under ablations such as different white-box tuning strategies and model sizes. The approach is model-agnostic and plug-and-play for logit-based tuning, offering a practical way to improve black-box models without access to their internal parameters.
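The logit-offset scheme can be illustrated with a small sketch. It assumes the combined logits take the additive form `tunable_small + alpha * (frozen_large - frozen_small)`, which is consistent with the degenerate case stated in the Figure 1 caption but is not spelled out on this page. The function names `cpt_logits` and `cross_entropy` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def cpt_logits(tunable_small, frozen_large, frozen_small, alpha):
    """Combine logits as tunable proxy + alpha * (large - frozen proxy).

    The additive form is an assumption made for this sketch.
    """
    return [
        s + alpha * (l - f)
        for s, l, f in zip(tunable_small, frozen_large, frozen_small)
    ]


def cross_entropy(logits, target):
    """Numerically stable cross-entropy of one example from raw logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target]


# Hypothetical per-class logits for a single 3-class example.
tunable = [0.5, 1.5, 0.0]   # small white-box proxy being trained
large = [2.0, 1.0, 0.0]     # frozen large black-box model
frozen = [1.0, 1.0, 0.0]    # frozen small white-box proxy

# Consistent setting: the same factor is used for training and testing,
# so the loss is computed on exactly the logits used for prediction.
alpha_train = alpha_test = 1.0
train_logits = cpt_logits(tunable, large, frozen, alpha_train)
test_logits = cpt_logits(tunable, large, frozen, alpha_test)
loss = cross_entropy(train_logits, target=1)

# Inconsistent setting: with alpha = 0 the loss sees only the tunable proxy.
proxy_only_logits = cpt_logits(tunable, large, frozen, 0.0)
```

In the consistent setting, `train_logits` and `test_logits` are identical by construction, so the quantity being optimized is the quantity used at inference. With a factor of 0 at training time, the offset vanishes and the loss depends on the tunable proxy alone.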

Abstract

Black-box tuning has attracted recent attention because the structure and internal parameters of advanced proprietary models are not accessible. Proxy-tuning provides a test-time output adjustment for tuning black-box language models: it takes the difference between the output logits of a smaller white-box "proxy" model before and after tuning and applies that difference to the black-box model. However, this technique serves only as a decoding-time algorithm, which leads to an inconsistency between training and testing that potentially limits overall performance. To address this problem, we introduce Consistent Proxy Tuning (CPT), a simple yet effective black-box tuning method. Unlike Proxy-tuning, CPT additionally exploits the frozen large black-box model and another frozen small white-box model during training, ensuring consistency between the training-stage optimization objective and the form of the proxy used at test time. This consistency benefits Proxy-tuning and enhances model performance. Note that our method operates solely on logits, which makes it model-agnostic and applicable to any task involving logit-based classification. Extensive experimental results demonstrate the superiority of CPT in black-box tuning of both Large Language Models (LLMs) and Vision-Language Models (VLMs) across various datasets. The code is available at https://github.com/chunmeifeng/CPT.
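The decoding-time adjustment of Proxy-tuning described above can be sketched as plain logit arithmetic. This is a minimal illustration, not the authors' implementation: the function name `proxy_tuned_logits` and the numeric logits are hypothetical, and a single classification step over three classes stands in for a full decoding loop.

```python
def proxy_tuned_logits(large, tuned_small, untuned_small):
    """Shift the large model's logits by the proxy's tuned-minus-untuned difference."""
    return [l + (t - u) for l, t, u in zip(large, tuned_small, untuned_small)]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical per-class logits for one input.
large_logits = [2.0, 1.0, 0.0]      # black-box model, outputs only
untuned_logits = [1.0, 1.0, 0.0]    # small proxy before tuning
tuned_logits = [0.0, 2.0, 0.0]      # small proxy after tuning

adjusted = proxy_tuned_logits(large_logits, tuned_logits, untuned_logits)

prediction_before = argmax(large_logits)
prediction_after = argmax(adjusted)
```

Only the output logits of the large model are needed, which is why the adjustment works without access to its parameters. The proxy in this scheme is tuned on its own, without reference to the large model, which is the train/test inconsistency the abstract points to.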

Paper Structure (33 sections, 5 equations, 3 figures, 5 tables)

Introduction
Related Work
  Efficient Fine-tuning.
  Black-Box Tuning.
  Logits Arithmetic.
Proposed Method
  Revisiting Proxy-tuning
  Consistent Proxy Tuning (CPT)
  Extending CPT to Vision-Language Model
Experiments
  Experimental Setup
    Datasets.
    Baselines.
    Implementation Details.
  Experimental Results
...and 18 more sections

Figures (3)

Figure 1: Comparison of our Consistent Proxy Tuning (CPT) with vanilla Proxy-tuning (liu2024tuning). (a) and (b) illustrate the training and inference stages of Proxy-tuning, respectively. Note that the optimization objective used in training and the formula of the proxy used during inference are inconsistent. In contrast, CPT makes the two consistent, as shown in (c). In particular, when $\alpha_{train} = 0$ and $\alpha_{test} = 1$, CPT reduces to the "inconsistent" Proxy-tuning.
Figure 2: Accuracy as a function of $\alpha_{train}$ and $\alpha_{test}$ on (b) ARC-challenge, (c) Stanford Cars, and (d) Oxford-IIIT Pets, together with (a) their average.
Figure 3: Accuracy as a function of $\alpha_{train}$ and $\alpha_{test}$ over a wider range of values on (a) Oxford-IIIT Pets and (b) Stanford Cars.
