NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations

Meng Luo, Han Zhang, Shengqiong Wu, Bobo Li, Hong Han, Hao Fei

TL;DR

This work tackles Multimodal Emotion-Cause Pair Extraction with Emotion Category (MECPE-Cat) for SemEval-2024 Task 3 by building an LLM-based framework that combines emotion recognition in conversation (ERC) with emotion-cause pair extraction (ECPE). The authors select ChatGLM via a pilot study and enhance it through emotion-cause-aware instruction-tuning using LoRA, alongside multimodal encoding with ImageBind and supplemental video descriptions from GPT-4V. The system decomposes MECPE-Cat into ERC and ECPE stages, iteratively refining training data by reinserting predictions and leveraging emotion labels to boost ECPE accuracy, achieving a weighted F1 of 0.3471 with multimodal inputs. The approach secures 2nd place on the MECPE-Cat leaderboard and is accompanied by code and resources to aid reproducibility and further research.

Abstract

This paper describes the architecture of our system developed for Task 3 of SemEval-2024: Multimodal Emotion-Cause Analysis in Conversations. Our project targets subtask 2, Multimodal Emotion-Cause Pair Extraction with Emotion Category (MECPE-Cat), and constructs a dual-component system tailored to the challenges of this task. We divide the task into two subtasks: emotion recognition in conversation (ERC) and emotion-cause pair extraction (ECPE). To address these subtasks, we capitalize on the abilities of Large Language Models (LLMs), which have consistently demonstrated state-of-the-art performance across various natural language processing tasks and domains. Most importantly, we design an emotion-cause-aware instruction-tuning approach for LLMs to enhance their perception of emotions together with the corresponding causal rationales. Our method enables us to navigate the complexities of MECPE-Cat, achieving a weighted-average F1 score of 34.71% on the task and securing 2nd rank on the leaderboard. The code and metadata to reproduce our experiments are all made publicly available.
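
To make the reported metric concrete, the following is a minimal sketch of a support-weighted F1 over emotion categories. It assumes each prediction is a triple of (emotion utterance index, cause utterance index, emotion label) and that a prediction counts as correct only on an exact match with a gold triple. The function name `weighted_f1` and the triple layout are illustrative assumptions, not the official SemEval scorer.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Support-weighted F1 over emotion categories.

    gold, pred: iterables of (emotion_utt, cause_utt, emotion) triples.
    The triple layout is an assumption made for this sketch.
    """
    gold, pred = set(gold), set(pred)
    # Number of gold pairs per emotion category (the "support").
    support = Counter(emotion for _, _, emotion in gold)
    total = sum(support.values())
    if total == 0:
        return 0.0

    hits = gold & pred  # exact-match triples
    score = 0.0
    for emotion, n_gold in support.items():
        n_pred = sum(1 for t in pred if t[2] == emotion)
        n_hit = sum(1 for t in hits if t[2] == emotion)
        precision = n_hit / n_pred if n_pred else 0.0
        recall = n_hit / n_gold
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each category's F1 by its share of gold pairs.
        score += f1 * n_gold / total
    return score


# Toy usage: two gold "joy" pairs and one gold "anger" pair.
gold = {(3, 1, "joy"), (3, 2, "joy"), (5, 4, "anger")}
pred = {(3, 1, "joy"), (5, 4, "sadness")}
print(round(weighted_f1(gold, pred), 4))
```

Weighting by gold support means frequent emotion categories dominate the score, so a wrong emotion label on an otherwise correct pair (as with the `"sadness"` prediction above) is penalised as a miss for the gold category.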

Paper Structure

This paper contains 16 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: An example of the official task and annotated dataset. Each arc points from a cause utterance to the emotion utterance it triggers. The cause spans are highlighted in yellow. Background: Chandler and his girlfriend Monica walked into the casino (they had quarreled earlier but soon made up), and then started a conversation with Phoebe.
  • Figure 2: Zero-shot test set performance of various instruction-tuned LLMs.
  • Figure 3: Proposed method workflow for the MECPE-Cat task.
  • Figure 4: The construction of the instruction template and the flow of model input and output.