Boosting Large Language Models with Continual Learning for Aspect-based Sentiment Analysis
Xuanwen Ding, Jie Zhou, Liang Dou, Qin Chen, Yuanbin Wu, Chengcai Chen, Liang He
TL;DR
This work tackles continual learning for aspect-based sentiment analysis (ABSA) with large language models by decoupling domain knowledge into invariant and variant components using an orthogonal constraint, and by aligning these representations through a domain knowledge warmup. It introduces domain positioning to select the appropriate domain-variant adapter at test time without domain IDs, and implements an instruction-tuned LLM ABSA model with LoRA adapters. Across 19 ABSA datasets, the proposed LLM-CL achieves state-of-the-art results on ABSC, AE, and JOINT tasks, while mitigating catastrophic forgetting compared to existing baselines. The approach demonstrates the viability and benefits of combining LLMs with structured continual-learning strategies for cross-domain ABSA and suggests directions for extending the framework to broader cross-domain continual-learning tasks.
Abstract
Aspect-based sentiment analysis (ABSA) is an important subtask of sentiment analysis that aims to extract aspects and predict their sentiments. Most existing studies focus on improving target-domain performance by fine-tuning domain-specific models (trained on source domains) on the target-domain dataset. Few works address continual learning for ABSA, which aims to learn the target domain while retaining performance on previously learned domains. In this paper, we propose a Large Language Model-based Continual Learning (LLM-CL) model for ABSA. First, we design a domain knowledge decoupling module that learns a domain-invariant adapter and separate domain-variant adapters, coupled through an orthogonal constraint. Then, we introduce a domain knowledge warmup strategy to align the representations of domain-invariant and domain-variant knowledge. In the test phase, we index the corresponding domain-variant knowledge via domain positioning, so that a sample's domain ID is not required. Extensive experiments on 19 datasets indicate that our LLM-CL model achieves new state-of-the-art performance.
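
To make the orthogonal constraint concrete, the sketch below shows one common way to penalize overlap between two adapter weight matrices: the sum of squared dot products between their rows, which is the squared Frobenius norm of their cross-product. This is an illustrative assumption, not the loss used in the paper, whose exact form the abstract does not state. The function name `orthogonal_penalty` and the toy matrices are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def orthogonal_penalty(invariant: Matrix, variant: Matrix) -> float:
    """Sum of squared dot products between every row pair of the two matrices.

    The value is 0.0 exactly when each row of `invariant` is orthogonal to
    each row of `variant`, and grows as the two row spaces overlap.
    (Assumed form of the constraint, for illustration only.)
    """
    total = 0.0
    for u in invariant:
        for v in variant:
            dot = sum(a * b for a, b in zip(u, v))
            total += dot * dot
    return total


# Hypothetical rank-2 adapter matrices over a 4-dimensional hidden space.
shared = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]
# Uses only directions that the shared adapter does not touch.
domain_disjoint = [[0.0, 0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]]
# First row reuses the first direction of the shared adapter.
domain_overlap = [[1.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]]

penalty_disjoint = orthogonal_penalty(shared, domain_disjoint)
penalty_overlap = orthogonal_penalty(shared, domain_overlap)
```

Adding a term like this to the training loss would push each domain-variant adapter away from directions already captured by the domain-invariant adapter, which matches the decoupling idea described above.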
