Boosting Large Language Models with Continual Learning for Aspect-based Sentiment Analysis

Xuanwen Ding, Jie Zhou, Liang Dou, Qin Chen, Yuanbin Wu, Chengcai Chen, Liang He

TL;DR

This work tackles continual learning for aspect-based sentiment analysis (ABSA) with large language models by decoupling domain knowledge into invariant and variant components using an orthogonal constraint, and by aligning these representations through a domain knowledge warmup. It introduces domain positioning to select the appropriate domain-variant adapter at test time without domain IDs, and implements an instruction-tuned LLM ABSA model with LoRA adapters. Across 19 ABSA datasets, the proposed LLM-CL achieves state-of-the-art results on ABSC, AE, and JOINT tasks, while mitigating catastrophic forgetting compared to existing baselines. The approach demonstrates the viability and benefits of combining LLMs with structured continual-learning strategies for cross-domain ABSA and suggests directions for extending the framework to broader cross-domain continual-learning tasks.

Abstract

Aspect-based sentiment analysis (ABSA) is an important subtask of sentiment analysis that aims to extract aspects and predict their sentiments. Most existing studies focus on improving target-domain performance by fine-tuning domain-specific models (trained on source domains) on the target-domain dataset. Few works study continual learning for ABSA, which aims to learn the target domain while preserving performance on previously learned domains. In this paper, we propose a Large Language Model-based Continual Learning (LLM-CL) model for ABSA. First, we design a domain knowledge decoupling module that learns a domain-invariant adapter and separate domain-variant adapters dependently with an orthogonal constraint. Then, we introduce a domain knowledge warmup strategy to align the representations of domain-invariant and domain-variant knowledge. In the test phase, we index the corresponding domain-variant knowledge via domain positioning, so that a sample's domain ID is not required. Extensive experiments over 19 datasets indicate that our LLM-CL model obtains new state-of-the-art performance.
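
The two mechanisms named in the abstract, the orthogonal constraint and domain positioning, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss or the distance measure, so the squared-dot-product penalty, the Euclidean nearest-prototype rule, and the names `orthogonality_penalty`, `position_domain`, and `prototypes` are all assumptions made for illustration.

```python
import math


def orthogonality_penalty(invariant, variant):
    """Sum of squared dot products between invariant and variant rows.

    Assumed form of an orthogonal constraint: the value is zero exactly
    when every invariant row is orthogonal to every variant row, so
    minimizing it pushes the two adapters toward disjoint directions.
    """
    total = 0.0
    for u in invariant:
        for v in variant:
            dot = sum(a * b for a, b in zip(u, v))
            total += dot * dot
    return total


def position_domain(sample, prototypes):
    """Return the name of the domain prototype closest to `sample`.

    Assumed form of domain positioning: each domain is summarized by one
    prototype vector, and a test sample is routed to the nearest one
    (Euclidean distance), so no domain ID is needed at test time.
    """
    return min(prototypes, key=lambda name: math.dist(sample, prototypes[name]))


# Hypothetical toy adapters, one row each.
shared = [[1.0, 0.0, 0.0]]
laptop_only = [[0.0, 1.0, 0.0]]   # orthogonal to `shared`
overlapping = [[1.0, 1.0, 0.0]]   # shares a direction with `shared`

penalty_disjoint = orthogonality_penalty(shared, laptop_only)
penalty_overlap = orthogonality_penalty(shared, overlapping)

# Hypothetical domain prototypes and a test-time sample representation.
prototypes = {"laptop": [0.9, 0.1], "restaurant": [0.1, 0.8]}
chosen = position_domain([0.8, 0.2], prototypes)
```

In this toy setup the penalty is zero for the disjoint pair and positive for the overlapping pair, and the sample is routed to the "laptop" prototype. In a real system the selected name would pick which domain-variant adapter to apply alongside the domain-invariant one.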

Paper Structure (24 sections, 9 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Continual learning for a sequence of ABSA domains. Blue denotes domain-invariant knowledge; the other colors denote domain-variant knowledge.
  • Figure 2: The framework of our LLM-CL.
  • Figure 3: Catastrophic forgetting of the LLM. The x-axis shows the test domain, and the y-axis shows the order of training domains from bottom to top. The subgraph in the upper right corner shows the gap between methods at each training domain. The color depth of each grid cell indicates how well the LLM performs on the corresponding test set during continual learning.