Keyword-Oriented Multimodal Modeling for Euphemism Identification
Yuxue Hu, Junsong Li, Meixuan Chen, Dongyu Su, Tongguan Wang, Ying Sha
TL;DR
This paper tackles euphemism identification by moving beyond text-only approaches to a keyword-oriented multimodal framework. It introduces KOM-Euph, the first large-scale text-image-speech euphemism corpus across Drug, Weapon, and Sexuality domains, and KOM-EI, a model that aligns and fuses textual, visual, and audio signals through cross-modal contrastive learning, cross-attention, and gating. The method achieves state-of-the-art performance and superior efficiency compared with large language models, demonstrating the value of multimodal data for disambiguating euphemisms in dangerous or illicit content. This work advances content moderation capabilities and provides a foundation for more robust analyses of evolving euphemisms in multimedia contexts.
Abstract
Euphemism identification deciphers the true meaning of euphemisms, such as linking "weed" (euphemism) to "marijuana" (target keyword) in illicit texts, which aids content moderation and helps combat underground markets. Existing methods are primarily text-based, but the rise of social media highlights the need for multimodal analysis that incorporates text, images, and audio. However, the lack of multimodal datasets for euphemisms limits further research. To address this, we treat both euphemisms and their corresponding target keywords as keywords and first introduce a keyword-oriented multimodal corpus of euphemisms (KOM-Euph), which comprises three datasets (Drug, Weapon, and Sexuality), each containing text, images, and speech. We further propose a keyword-oriented multimodal euphemism identification method (KOM-EI), which uses cross-modal feature alignment and dynamic fusion modules to explicitly exploit the visual and audio features of the keywords for efficient euphemism identification. Extensive experiments demonstrate that KOM-EI outperforms state-of-the-art models and large language models, and show the importance of our multimodal datasets.
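The alignment and fusion ideas named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a symmetric InfoNCE objective as a stand-in for the cross-modal contrastive alignment, and a simple sigmoid gate as a stand-in for the dynamic fusion module; cross-attention is omitted. All function names, shapes, and the temperature value are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def info_nce(text_emb, other_emb, temperature=0.07):
    """Symmetric InfoNCE loss (assumed objective, not taken from the paper).

    Row i of text_emb and row i of other_emb are treated as a matched
    keyword pair; every other row in the batch acts as a negative.
    """
    t = l2_normalize(text_emb)
    o = l2_normalize(other_emb)
    logits = t @ o.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy_on_diagonal(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the text-to-other and other-to-text directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(text_emb, image_emb, audio_emb, w_gate, b_gate):
    """Hypothetical gate: mixes text features with averaged image/audio features.

    w_gate has shape (2 * dim, dim) and b_gate has shape (dim,).
    """
    aux = 0.5 * (image_emb + audio_emb)
    gate = sigmoid(np.concatenate([text_emb, aux], axis=-1) @ w_gate + b_gate)
    return gate * text_emb + (1.0 - gate) * aux
```

In this sketch, the contrastive loss is small when each keyword's text embedding lies close to its own image or speech embedding and far from those of other keywords, and the gate decides per dimension how much of the non-text signal enters the fused representation.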
