LOLA: LLM-Assisted Online Learning Algorithm for Content Experiments

Zikun Ye; Hema Yoganarasimhan; Yufeng Zheng

TL;DR

LOLA is a novel framework that integrates Large Language Models (LLMs) with adaptive experimentation to optimize content delivery. It outperforms standard A/B testing, pure bandit algorithms, and pure-LLM approaches, particularly when experimental traffic is limited.

Abstract

Modern media firms require automated and efficient methods to identify content that is most engaging and appealing to users. Leveraging a large-scale dataset from Upworthy (a news publisher), which includes 17,681 headline A/B tests, we first investigate the ability of three pure-LLM approaches to identify the catchiest headline: prompt-based methods, embedding-based methods, and fine-tuned open-source LLMs. Prompt-based approaches perform poorly, while both OpenAI-embedding-based models and the fine-tuned Llama-3-8B achieve marginally higher accuracy than random predictions. In sum, none of the pure-LLM-based methods can predict the best-performing headline with high accuracy. We then introduce the LLM-Assisted Online Learning Algorithm (LOLA), a novel framework that integrates Large Language Models (LLMs) with adaptive experimentation to optimize content delivery. LOLA combines the best pure-LLM approach with the Upper Confidence Bound algorithm to allocate traffic and maximize clicks adaptively. Our numerical experiments on Upworthy data show that LOLA outperforms the standard A/B test method (the current status quo at Upworthy), pure bandit algorithms, and pure-LLM approaches, particularly in scenarios with limited experimental traffic. Our approach is scalable and applicable to content experiments across various settings where firms seek to optimize user engagement, including digital advertising and social media recommendations.
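To make the idea of combining an LLM-based predictor with the Upper Confidence Bound algorithm concrete, here is a minimal sketch under stated assumptions. It is not the paper's implementation: the abstract does not spell out how the two are combined, so the sketch assumes the predicted click-through rates (CTRs) are injected as pseudo-impressions that warm-start a UCB1-style index. The function name `ucb_with_prior`, the parameter `prior_weight`, and the simulated Bernoulli click model are all hypothetical.

```python
import math
import random


def ucb_with_prior(prior_ctr, true_ctr, horizon, prior_weight=50, seed=0):
    """UCB1-style traffic allocation warm-started with predicted CTRs.

    Assumption (not from the source): each headline starts with
    `prior_weight` pseudo-impressions whose click rate equals its
    predicted CTR. `prior_weight` must be positive.
    """
    rng = random.Random(seed)
    k = len(prior_ctr)
    pulls = [float(prior_weight)] * k                # pseudo + real impressions
    clicks = [p * prior_weight for p in prior_ctr]   # pseudo + real clicks
    real_pulls = [0] * k                             # real impressions only
    total_clicks = 0

    for _ in range(horizon):
        total = sum(pulls)
        # Index = empirical mean + exploration bonus (standard UCB1 form).
        scores = [
            clicks[i] / pulls[i] + math.sqrt(2.0 * math.log(total) / pulls[i])
            for i in range(k)
        ]
        arm = max(range(k), key=scores.__getitem__)
        # Simulated user response: a click with probability true_ctr[arm].
        reward = 1 if rng.random() < true_ctr[arm] else 0
        pulls[arm] += 1
        clicks[arm] += reward
        real_pulls[arm] += 1
        total_clicks += reward

    return real_pulls, total_clicks


if __name__ == "__main__":
    # Hypothetical numbers: two headlines, predicted vs. simulated CTRs.
    allocation, n_clicks = ucb_with_prior([0.02, 0.05], [0.01, 0.06], horizon=2000)
    print(allocation, n_clicks)
```

In this sketch a larger `prior_weight` expresses more trust in the predictions, so early traffic follows them more closely; a smaller value lets observed clicks override the predictions sooner.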

Paper Structure (38 sections, 8 equations, 14 figures, 11 tables, 5 algorithms)

Introduction
Upworthy Data and Experiments
Pure-LLM-Based Methods
Prompt-based Approaches
Zero-shot Prompting
In-context Learning
Performance Analysis of Prompt-based Approaches
CTR Prediction Models with LLM-based Text Embeddings
Performance Analysis of Embedding-based Approaches
Fine-Tuning Open-Source LLMs with LoRA
Performance Analysis of Fine-tuning-based Approaches
Summary of Performance of Pure-LLM Based Methods
Our Approach: LLM-Assisted Online Learning Algorithm
Bandit Framework
LOLA
...and 23 more sections

Figures (14)

Figure 1: Zero-Shot Prompting for Headline Selection.
Figure 2: In-Context Learning Prompt for Headline Selection.
Figure 3: The pipeline of the headline selection using LLM text embeddings. We use an A/B test with three headlines for illustration. Headlines 1, 2, and 3 are natural language sentences, while Embedding 1, 2, and 3 are numerical vectors.
Figure 4: Loss curve and accuracy curve as the number of training epochs increases.
Figure 5: Average clicks per experiment per period under different time horizon multipliers. The Y-axis shows the average clicks per test per period. For instance, if a test has two headlines that receive 1 and 2 clicks, respectively, under $\tau=100$, then the average clicks per period in this test is $(1+2)/100=0.03$. The Y value is the average of this per-test number over all tests. This measure scales well with the platform's total clicks in tests because every test carries the same weight, regardless of how many headlines it contains.
...and 9 more figures
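The Y-axis measure described in the Figure 5 caption can be reproduced with a short calculation. The sketch below follows the caption's worked example and assumes, as the caption does, that each test's total clicks are divided by $\tau$; the function name and the list-of-lists input layout are hypothetical.

```python
def avg_clicks_per_test_per_period(tests, tau):
    """Average over tests of (total clicks in the test) / tau.

    `tests` is a list of tests; each test is a list of per-headline
    click counts. Every test gets equal weight in the final average.
    """
    per_test = [sum(headline_clicks) / tau for headline_clicks in tests]
    return sum(per_test) / len(per_test)


if __name__ == "__main__":
    # The caption's example: one test, two headlines with 1 and 2 clicks.
    print(avg_clicks_per_test_per_period([[1, 2]], tau=100))
```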
