ELPO: Ensemble Learning Based Prompt Optimization for Large Language Models

Qing Zhang, Bing Xu, Xudong Zhang, Yifan Shi, Yang Li, Chen Zhang, Yik Chung Wu, Ngai Wong, Yijie Chen, Hong Dai, Xiansen Chen, Mian Zhang

TL;DR

ELPO (Ensemble Learning Based Prompt Optimization) is a prompt optimization framework for large language models. It addresses the fragility of manual prompt engineering and single-method Automatic Prompt Optimization (APO) by integrating multiple generation models, Bayesian search, and multi-armed bandit (MAB) search with ensemble voting. The framework aims to produce more accurate and robust prompts under black-box API constraints, leveraging diverse generation strategies such as Hard-Case Tracking. Empirical results on six tasks show that ELPO outperforms state-of-the-art prompt optimizers, including sizable gains on ArSarcasm, BBH, WSC, and GSM8K. This work demonstrates the effectiveness of ensemble strategies in APO and highlights directions for richer prompt generation and more efficient search.

Abstract

The remarkable performance of Large Language Models (LLMs) relies heavily on well-crafted prompts. However, manual prompt engineering is laborious, creating a core bottleneck for the practical application of LLMs. This has led to the emergence of a new research area known as Automatic Prompt Optimization (APO), which has developed rapidly in recent years. Existing APO methods, such as those based on evolutionary algorithms or trial-and-error approaches, achieve efficient and accurate prompt optimization to some extent. However, these studies rely on a single model or algorithm for the generation strategy and optimization process, which limits their performance on complex tasks. To address this, we propose a novel framework called Ensemble Learning based Prompt Optimization (ELPO) to achieve more accurate and robust results. Motivated by the idea of ensemble learning, ELPO applies a voting mechanism and introduces shared generation strategies along with different search methods to find superior prompts. Moreover, ELPO presents more efficient algorithms for the prompt generation and search process. Experimental results demonstrate that ELPO outperforms state-of-the-art prompt optimization methods across different tasks, e.g., improving the F1 score by 7.6 on the ArSarcasm dataset.
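
The voting idea mentioned in the abstract can be illustrated with a small sketch. This is not ELPO's actual procedure, which the abstract does not specify; it only shows plain majority voting over the predictions that several candidate prompts produce for one input. The names `majority_vote`, `ensemble_predict`, and `toy_model`, the example prompts, and the tie-breaking rule are all assumptions made for illustration, and `toy_model` is a hypothetical stand-in for a black-box LLM API call.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one label to vote")
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(
    prompts: Sequence[str],
    query_model: Callable[[str, str], str],
    text: str,
) -> str:
    """Query the model once per candidate prompt and vote on the answers."""
    votes = [query_model(prompt, text) for prompt in prompts]
    return majority_vote(votes)


def toy_model(prompt: str, text: str) -> str:
    """Hypothetical stand-in for an LLM call (assumption, not a real API)."""
    if "strict" in prompt:
        return "sarcastic" if "yeah right" in text.lower() else "literal"
    if "lenient" in prompt:
        return "sarcastic" if "!" in text else "literal"
    return "literal"


# Three illustrative candidate prompts (assumed, not taken from the paper).
prompts = [
    "Be strict: label the text as sarcastic or literal.",
    "Be lenient: label the text as sarcastic or literal.",
    "Label the text as sarcastic or literal.",
]

print(ensemble_predict(prompts, toy_model, "Yeah right, great job!"))
print(ensemble_predict(prompts, toy_model, "Nice weather today."))
```

In this sketch a single weak prompt (the third one) is outvoted by the other two, which is the robustness argument behind combining several prompts rather than trusting one.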

Paper Structure

This paper contains 19 sections, 2 figures, 4 tables, 6 algorithms.

Figures (2)

  • Figure 1: Pipeline of ELPO.
  • Figure 2: Efficiency of search.