Controlling Large Language Model with Latent Actions

Chengxing Jia, Ziniu Li, Pengyuan Wang, Yi-Chen Li, Zhenyu Hou, Yuxiao Dong, Yang Yu

TL;DR

CoLA introduces a latent-action framework for controlling large language models through a compact action space learned with an inverse dynamics model. The architecture couples a language world model with a discrete latent action codebook and a policy, enabling efficient RL and improved downstream performance while preserving the base model's capabilities. Experiments on math reasoning, agent tasks, and diverse prompts show higher semantic diversity and stronger math performance ($42.4$ on math500 vs. $38.2$ for the baseline, rising to $68.2$ with a Monte Carlo Tree Search variant), along with improved robustness to reward hacking. The results suggest that latent-action control offers a scalable path to more controllable, sample-efficient RL-based adaptation of LLMs for practical applications.

Abstract

Adapting Large Language Models (LLMs) to downstream tasks using Reinforcement Learning (RL) has proven to be an effective approach. However, LLMs do not inherently define the structure of an agent for RL training, particularly the action space. This paper studies learning a compact latent action space to enhance the controllability and exploration of RL for LLMs. We propose Controlling Large Language Models with Latent Actions (CoLA), a framework that integrates a latent action space into pre-trained LLMs. We apply CoLA to the Llama-3.1-8B model. Our experiments demonstrate that, compared to RL with token-level actions, CoLA's latent actions enable greater semantic diversity in text generation. On downstream tasks, CoLA with RL achieves a score of 42.4 on the math500 benchmark, surpassing the baseline score of 38.2, and reaches 68.2 when augmented with a Monte Carlo Tree Search variant. Furthermore, unlike the baseline, CoLA with RL consistently improves performance on agent-based tasks without degrading the pre-trained LLM's capabilities. Finally, CoLA reduces computation time by half in tasks where RL is used to enhance thinking prompts for LLMs. These results highlight CoLA's potential to advance RL-based adaptation of LLMs for downstream applications.
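
To make the idea of a compact latent action space concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a hypothetical codebook of three discrete latent actions, a four-token vocabulary, a stand-in "world model" that maps a latent action to a next-token distribution (ignoring context for brevity), a uniform random policy, and greedy decoding. All names (`NUM_ACTIONS`, `make_toy_world_model`, `policy`, `generate`) are illustrative assumptions. The point is only that the policy chooses among `NUM_ACTIONS` latent actions rather than among every token in the vocabulary.

```python
import math
import random

VOCAB = ["a", "b", "c", "d"]   # toy vocabulary (assumption)
NUM_ACTIONS = 3                # size of the hypothetical latent action codebook


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_world_model(seed=0):
    """Build a stand-in world model: (context, action) -> next-token probabilities.

    A real language world model would condition on the context; this toy
    version only looks up fixed random logits per latent action.
    """
    rng = random.Random(seed)
    table = [[rng.gauss(0.0, 1.0) for _ in VOCAB] for _ in range(NUM_ACTIONS)]

    def world_model(context, action):
        return softmax(table[action])

    return world_model


def policy(context, rng):
    """Placeholder policy: pick a latent action uniformly at random."""
    return rng.randrange(NUM_ACTIONS)


def generate(world_model, steps, seed=0):
    """Roll out `steps` tokens; the policy acts in latent space at every step."""
    rng = random.Random(seed)
    context, actions = [], []
    for _ in range(steps):
        action = policy(context, rng)            # decision over NUM_ACTIONS options
        probs = world_model(context, action)     # action-conditioned token distribution
        best = max(range(len(VOCAB)), key=probs.__getitem__)  # greedy decoding
        context.append(VOCAB[best])
        actions.append(action)
    return context, actions


if __name__ == "__main__":
    tokens, latent_actions = generate(make_toy_world_model(), steps=5)
    print(tokens, latent_actions)
```

In this sketch, an RL algorithm would only need to improve `policy`, whose output space has `NUM_ACTIONS` entries, while the world model is left untouched.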

Paper Structure

This paper contains 41 sections, 10 equations, 14 figures, 2 tables, 4 algorithms.

Figures (14)

  • Figure 1: An illustration of latent action control in CoLA. Left: the naive decoder-only inference pipeline. Right: the CoLA pipeline.
  • Figure 2: Diversity values. The blue line shows the diversity of random latent action sampling, the yellow line that of the base model, and the green line that of random token sampling. The red line shows the diversity of random action sampling as pre-training scales from 1B to 10B tokens.
  • Figure 3: Performance on math reasoning. The blue line is the CoLA model, and the yellow line is the baseline. (a) Performance on reasoning benchmarks. (b) Pass@K performance on math500.
  • Figure 4: Performance on the Countdown Game. The blue line is the CoLA model, and the yellow line is the baseline. (a) Format reward curves. (b) Response length curves.
  • Figure 5: GPT-4 win rate under distinct preferences. ACA denotes academy, BUS business, ENT entertainment, and LIT literature. KL COEF is the KL coefficient, and AVERAGE is the average over the four tasks. A value above 50 indicates better alignment. (a) Win rate of CoLA relative to the baseline. (b) Win rate of CoLA with KL coefficient 0.00 relative to CoLA with KL coefficient 0.01.
  • ...and 9 more figures
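
Figure 3(b) reports pass@K, but this page does not say how the metric is computed. As a hedged illustration, the sketch below implements one commonly used unbiased estimator: given `n` sampled solutions of which `c` are correct, the probability that at least one of `k` randomly chosen samples is correct is $1 - \binom{n-c}{k} / \binom{n}{k}$. The function name `pass_at_k` and its argument names are assumptions made for this example, and the paper may compute the metric differently.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k from n samples with c correct ones.

    Returns the probability that a random size-k subset of the n samples
    contains at least one correct sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset has a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 samples, 5 correct, draw 1: half of the draws succeed.
    print(pass_at_k(10, 5, 1))
```

Averaging this quantity over all problems in a benchmark gives a single pass@K score per value of K.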