Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition

Bingbing Wang; Bin Liang; Chun-Mei Feng; Wangmeng Zuo; Zhixin Bai; Shijue Huang; Kam-Fai Wong; Xi Zeng; Ruifeng Xu

TL;DR

This paper introduces StickerTAG, the first multi-tag sticker dataset, comprising a collected tag set of 461 tags and 13,571 sticker-tag pairs and designed to support a deeper understanding of stickers.

Abstract

In real-world conversations, the diversity and ambiguity of stickers often lead to varied interpretations depending on context, which calls for a comprehensive understanding of stickers and support for multi-tagging. To address this challenge, we introduce StickerTAG, the first multi-tag sticker dataset, comprising a collected tag set of 461 tags and 13,571 sticker-tag pairs and designed to provide a deeper understanding of stickers. Recognizing multiple tags for a sticker is particularly challenging because sticker tags are usually fine-grained and attribute-aware. Hence, we propose an Attentive Attribute-oriented Prompt Learning method, i.e., Att$^2$PL, to capture informative features of stickers in a fine-grained manner and better differentiate tags. Specifically, we first apply an Attribute-oriented Description Generation (ADG) module to obtain descriptions of stickers from four attributes. Then, a Local Re-attention (LoR) module is designed to perceive the importance of local information. Finally, we use prompt learning to guide the recognition process and adopt confidence penalty optimization to penalize overly confident output distributions. Extensive experiments show that our method achieves encouraging results on all commonly used metrics.
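As a rough illustration of the confidence penalty idea mentioned in the abstract, the sketch below subtracts a scaled entropy term from a multi-label binary cross-entropy loss, so that overly peaked predictions are rewarded less. The exact objective used by Att$^2$PL is not given here; the per-tag sigmoid outputs, the Bernoulli entropy form, the function names, and the weight `beta` are all assumptions made for illustration.

```python
import math

# Hypothetical sketch: multi-label loss with an entropy-based confidence penalty.
# None of these names come from the paper; they are illustrative only.

def sigmoid(z):
    """Map a raw tag score (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def _clip(p, eps=1e-12):
    """Keep probabilities away from 0 and 1 so log() stays finite."""
    return min(max(p, eps), 1.0 - eps)

def binary_cross_entropy(p, y):
    """Cross-entropy for one tag with probability p and 0/1 label y."""
    p = _clip(p)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def bernoulli_entropy(p):
    """Entropy of one tag's predicted distribution; highest at p = 0.5."""
    p = _clip(p)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def confidence_penalized_loss(logits, targets, beta=0.1):
    """Mean BCE over tags minus beta times mean predictive entropy (assumed form)."""
    probs = [sigmoid(z) for z in logits]
    n = len(probs)
    ce = sum(binary_cross_entropy(p, y) for p, y in zip(probs, targets)) / n
    entropy = sum(bernoulli_entropy(p) for p in probs) / n
    return ce - beta * entropy
```

With `beta=0` this reduces to plain binary cross-entropy over the tags; larger values of `beta` favor less peaked per-tag probabilities.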

Paper Structure (22 sections, 4 equations, 5 figures, 3 tables)

This paper contains 22 sections, 4 equations, 5 figures, 3 tables.

Introduction
Related Work
Sticker Dataset
Sticker-based Method
StickerTAG Dataset
Tag Construction
Sticker Collection and Annotation
Characteristics
Method
Task Definition
Attribute-oriented Description Generation
Local Re-attention Module
Prompt-based Classification
Confidence Penalty Optimization
Experiment
...and 7 more sections

Figures (5)

Figure 1: Examples of stickers along with multiple tags.
Figure 2: (a) Word cloud distribution of the sticker tags. Larger text size indicates a higher frequency of occurrence. (b) Number of samples per tag, highlighted by an orange trend line.
Figure 3: Illustration of the proposed Att$^2$PL method comprising (1) Attribute-oriented Description Generation, (2) Local Re-attention Module, (3) Prompt-based Classification, and (4) Confidence Penalty Optimization (blue lines).
Figure 4: Overview of attribute-oriented description generation.
Figure 5: Examples of stickers with ground truth tags and the predicted tags inferred by our Att$^2$PL framework.
