A Comparative Study of Transfer Learning for Emotion Recognition using CNN and Modified VGG16 Models

Samay Nathani

TL;DR

The paper tackles cross-dataset generalization in emotion recognition by comparing a CNN-based VGG16 and a modified, deeper VGG16 under transfer learning from FER2013 to AffectNet. The modified architecture adds convolutional depth, larger fully connected layers, and regularization, with fine-tuning guided by a ReduceLROnPlateau scheduler and evaluated via accuracy, precision, recall, F1, and predictive entropy. Results show the Modified VGG16 yields modest gains on FER2013 and maintains an edge on AffectNet, though overall performance declines on the more diverse AffectNet dataset, highlighting generalization challenges. The work emphasizes dataset diversity and suggests future directions toward multi-modal approaches and richer datasets to enhance robustness in emotion-recognition systems.
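The evaluation metrics named above can be illustrated with a small standard-library sketch. It is not the authors' evaluation code. The function names `predictive_entropy` and `macro_scores` are hypothetical, the use of macro averaging is an assumption (the summary does not say how per-class scores are aggregated), and entropy is reported in nats.

```python
import math
from typing import Sequence, Tuple


def predictive_entropy(probs: Sequence[float], eps: float = 1e-12) -> float:
    """Shannon entropy (in nats) of one predicted class distribution.

    Low entropy means a confident prediction; the maximum, log(K) for
    K classes, is reached by the uniform distribution.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalize defensively in case the softmax output has rounding error.
    normalized = [p / total for p in probs]
    # Terms with p == 0 contribute nothing (the limit of p*log(p) is 0).
    return -sum(p * math.log(p) for p in normalized if p > eps)


def macro_scores(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> Tuple[float, float, float]:
    """Macro-averaged precision, recall, and F1 (assumed averaging scheme)."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = num_classes
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
```

Under these assumptions, a model that transfers poorly would be expected to show lower accuracy and F1 on the target dataset. Mean predictive entropy gives a separate view of how confident the predictions are, independent of whether they are correct.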

Abstract

Emotion recognition is a critical aspect of human interaction and has garnered significant attention in the field of artificial intelligence. In this study, we investigate the performance of a convolutional neural network (CNN) and a Modified VGG16 model on emotion recognition tasks across two datasets: FER2013 and AffectNet. Our aim is to measure how effectively these models identify emotions and how well they generalize to different and broader datasets. Our findings reveal that both models achieve reasonable performance on the FER2013 dataset, with the Modified VGG16 model demonstrating slightly higher accuracy. When evaluated on the AffectNet dataset, performance declines for both models, with the Modified VGG16 model continuing to outperform the CNN. Our study emphasizes the importance of dataset diversity in emotion recognition and discusses open problems and future research directions, including the exploration of multi-modal approaches and the development of more comprehensive datasets.

Paper Structure

This paper contains 4 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: CNN Accuracy on the FER2013 dataset
  • Figure 2: CNN Loss on the FER2013 dataset
  • Figure 3: Modified VGG16 Loss on the FER2013 dataset
  • Figure 4: Modified VGG16 Accuracy on the FER2013 dataset