Generative Technology for Human Emotion Recognition: A Scope Review
Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni
TL;DR
This work addresses the lack of a systematic review at the intersection of generative modeling and emotion recognition. It surveys four families of generative techniques (autoencoders, GANs, diffusion models, and large language models) across speech emotion recognition (SER), facial emotion recognition (FER), textual emotion recognition (TER), physiological-signal-based emotion recognition, and multimodal emotion recognition (MER). Findings are organized into a taxonomy covering data augmentation, feature extraction, semi-supervised learning, cross-domain transfer, and adversarial robustness. The paper makes four core contributions: (i) the first systematic review of generative technology for emotion recognition, (ii) an analysis of over 320 papers with a modality-aware taxonomy and dataset benchmarking, (iii) a synthesis of practical insights and performance trends, and (iv) forward-looking guidance on combining diffusion models with transformers, integrating reinforcement learning and federated learning, VR/AR applications, and content synthesis. The findings indicate that FER currently benefits most from generative methods, that diffusion models and LLMs are still emerging in this field, and that cross-modal fusion and privacy are central practical concerns for real-world deployment.
Abstract
Affective computing stands at the forefront of artificial intelligence (AI), seeking to give machines the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which aims to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progress has been made in generative models, including Autoencoders, Generative Adversarial Networks, Diffusion Models, and Large Language Models. With their powerful data generation capabilities, these models have emerged as pivotal tools for advancing emotion recognition. However, few systematic efforts have so far reviewed generative technology for emotion recognition. This survey aims to bridge that gap in the existing literature through a comprehensive analysis of over 320 research papers published up to June 2024. Specifically, the survey first introduces the mathematical principles of the different generative models and the commonly used datasets. Then, through a taxonomy, it provides an in-depth analysis of how generative techniques address emotion recognition for different modalities from several aspects, including data augmentation, feature extraction, semi-supervised learning, and cross-domain transfer. Finally, the review outlines future research directions, emphasizing the potential of generative models to advance the field of emotion recognition and enhance the emotional intelligence of AI systems.
