Towards Lightweight, Adaptive and Attribute-Aware Multi-Aspect Controllable Text Generation with Large Language Models

Chenyu Zhu, Yefeng Liu, Chenyang Lyu, Xue Yang, Guanhua Chen, Longyue Wang, Weihua Luo, Kaifu Zhang

TL;DR

Experimental results show that the proposed lightweight, adaptive and attribute-aware framework for multi-aspect controllable text generation outperforms other strong baselines, achieves state-of-the-art performance, adapts well to data discrepancies, and is more accurate in attribute perception.

Abstract

Multi-aspect controllable text generation aims to control text generation in attributes from multiple aspects, making it a complex but powerful task in natural language processing. Supervised fine-tuning methods are often employed for this task due to their simplicity and effectiveness. However, they still have some limitations: low rank adaptation (LoRA) only fine-tunes a few parameters and has suboptimal control effects, while full fine-tuning (FFT) requires significant computational resources and is susceptible to overfitting, particularly when data is limited. Moreover, existing works typically train multi-aspect controllable text generation models using only single-aspect annotated data, which results in discrepancies in data distribution; at the same time, accurately generating text with specific attributes is a challenge that requires strong attribute-aware capabilities. To address these limitations, we propose a lightweight, adaptive and attribute-aware framework for multi-aspect controllable text generation. Our framework can dynamically adjust model parameters according to different aspects of data to achieve controllable text generation, aiming to optimize performance across multiple aspects. Experimental results show that our framework outperforms other strong baselines, achieves state-of-the-art performance, adapts well to data discrepancies, and is more accurate in attribute perception.

Paper Structure

This paper contains 37 sections, 11 equations, 9 figures, 10 tables.

Figures (9)

  • Figure 1: Illustration of our proposed framework. Our framework extends the traditional LoRA by integrating multiple LoRA modules and employs a learnable gating function to dynamically combine multiple LoRA modules. We use the aspect identifier as the input of the gating function to learn unique parameters for each aspect. $X_{a^t_{\mu}}$ represents the input sequence containing attribute $a^t_{\mu}$ and $H_{a^t_{\mu}}$ is the output hidden state. Only the parameters of LoRAs and the gating function are updated during training.
  • Figure 2: Comparison of LoRA, FFT and our method in alleviating knowledge forgetting. As new knowledge is injected, we measure the performance of the three methods on the multi-aspect task. The performance of our method decreases the least, indicating that it is more robust to knowledge forgetting.
  • Figure 3: The visualization of LoRA weights for various aspects based on Qwen2-7B-Instruct.
  • Figure 4: The visualization of LoRA weights for various aspects based on Qwen2-72B-Instruct.
  • Figure 5: The visualization of LoRA weights for various aspects based on Llama-3.1-8B-Instruct.
  • ...and 4 more figures
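
The mechanism described in the Figure 1 caption (several LoRA modules combined by a learnable gate that takes the aspect identifier as input, with the base weights frozen) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `GatedLoRALinear`, the softmax gate, the per-aspect logit table `gate_logits`, and all shapes and initializations are assumptions chosen for illustration, and training is omitted.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class GatedLoRALinear:
    """Hypothetical sketch: a frozen linear layer plus a gated mix of LoRA modules."""

    def __init__(self, d_in, d_out, num_loras, num_aspects, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen base weight (would come from the pretrained model).
        self.W = rng.normal(0.0, 0.02, (d_in, d_out))
        # One low-rank pair (A_k, B_k) per LoRA module; B starts at zero so the
        # layer initially reproduces the base model, as in standard LoRA.
        self.A = rng.normal(0.0, 0.02, (num_loras, d_in, rank))
        self.B = np.zeros((num_loras, rank, d_out))
        # Assumed gate parameterization: one learnable logit vector per aspect.
        self.gate_logits = np.zeros((num_aspects, num_loras))

    def gate(self, aspect_id):
        """Mixing weights over LoRA modules for the given aspect identifier."""
        return softmax(self.gate_logits[aspect_id])

    def forward(self, x, aspect_id):
        """x has shape (seq_len, d_in); returns hidden states (seq_len, d_out)."""
        w = self.gate(aspect_id)  # (num_loras,)
        base = x @ self.W         # frozen path
        # Low-rank update of every module: (num_loras, seq_len, d_out).
        updates = np.einsum("sd,kdr,kro->kso", x, self.A, self.B)
        # Weighted sum of the module updates, added to the frozen output.
        return base + np.tensordot(w, updates, axes=1)


layer = GatedLoRALinear(d_in=8, d_out=6, num_loras=3, num_aspects=2)
x = np.ones((5, 8))
h = layer.forward(x, aspect_id=0)  # hidden states for aspect 0
```

In this sketch only `A`, `B`, and `gate_logits` would receive gradient updates, mirroring the caption's statement that only the LoRAs and the gating function are trained; the gate lets each aspect settle on its own mixture of the shared LoRA modules.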