InfoAffect: A Dataset for Affective Analysis of Infographics
Zihang Fu, Yunchao Wang, Chenyu Huang, Guodao Sun, Ronghua Liang
TL;DR
InfoAffect addresses the lack of affective analysis resources for infographics by introducing a 3.5k-sample dual-modal dataset that pairs real-world infographics with their accompanying text and constrains affect labels via an Affect Table. The pipeline collects data from six domains, applies quality controls, and uses five multimodal large language models to extract affects, whose outputs are fused with Reciprocal Rank Fusion. A user study comprising two experiments validates usability and accuracy, reporting a Composite Affect Consistency Index of 0.986, which indicates strong alignment with human judgments. The resource supports robust affective modeling of infographics and can facilitate affect-aware design and downstream fine-tuning of multimodal models.
Abstract
Infographics are widely used to convey complex information, yet their affective dimensions remain underexplored due to the scarcity of data resources. We introduce InfoAffect, a 3.5k-sample affect-annotated dataset that combines textual content with real-world infographics. We first collect raw data from six domains and align the two modalities via preprocessing, the accompanied-text-priority method, and three strategies that guarantee quality and compliance. We then construct an Affect Table and use it to constrain annotation. Five state-of-the-art multimodal large language models (MLLMs) analyze both modalities, and their outputs are fused with the Reciprocal Rank Fusion (RRF) algorithm to yield robust affects and confidences. We conduct a user study with two experiments to validate usability and assess the InfoAffect dataset using the Composite Affect Consistency Index (CACI), achieving an overall score of 0.986, which indicates high accuracy.
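
As an illustration of the fusion step, the following is a minimal sketch of standard Reciprocal Rank Fusion applied to ranked affect lists from several models. It is not the authors' implementation: the function name `rrf_fuse`, the smoothing constant `k = 60` (a common default in the RRF literature), the example affect labels, and the normalization of fused scores into confidences are all assumptions made for this example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked label lists with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered from most to least likely label.
    k: smoothing constant (assumed value; 60 is a common default).
    Returns (label, confidence) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top label of each model adds 1 / (k + 1).
        for rank, label in enumerate(ranking, start=1):
            scores[label] += 1.0 / (k + rank)

    # Assumption: confidences are the fused scores normalized to sum to 1.
    total = sum(scores.values())
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(label, score / total) for label, score in ordered]


# Hypothetical ranked outputs from five models for one infographic.
# The labels are placeholders, not entries from the paper's Affect Table.
rankings = [
    ["joy", "trust", "surprise"],
    ["joy", "surprise", "trust"],
    ["trust", "joy", "anticipation"],
    ["joy", "trust", "anticipation"],
    ["surprise", "joy", "trust"],
]

fused = rrf_fuse(rankings)
for label, confidence in fused:
    print(f"{label}: {confidence:.3f}")
```

Because RRF depends only on rank positions, it needs no calibration of the raw scores that individual models produce, which is one plausible reason to prefer it when combining heterogeneous MLLM outputs.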
