Face to Cartoon Incremental Super-Resolution using Knowledge Distillation

Trinetra Devkatte, Shiv Ram Dubey, Satish Kumar Singh, Abdenour Hadid

TL;DR

The paper tackles the problem of adapting GAN-based facial super-resolution models to unseen cross-domain data by introducing ISR-KD, an incremental learning framework that uses knowledge distillation to retain source-domain performance while expanding to cartoon faces. A pre-trained CelebA SR generator is incrementally trained on iCartoonFace data, with a frozen teacher G_S guiding a student G_T via output and bottleneck losses to prevent forgetting. The approach combines an edge-enhanced generator/discriminator architecture and a multi-term objective (adversarial, edge, luminance-chrominance, identity, and reconstruction losses) to achieve improved cartoon SR without sacrificing real-face SR, as demonstrated on CelebA and iCartoonFace with several dataset splits and ablations. Cross-dataset and extended-network analyses show the method’s robustness and the benefit of deeper incremental components, suggesting practical applicability for dynamically evolving facial data distributions. The work offers a practical pathway for deploying SR systems in real-world pipelines where new data types (e.g., cartoons) emerge over time while preserving prior capabilities.
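The teacher/student arrangement described above can be illustrated with a small framework-free sketch. It is not the authors' implementation: the use of a mean absolute difference, the weights `w_out` and `w_bottleneck`, and the function names are all assumptions made for illustration. The summary only states that a frozen teacher G_S guides the student G_T through an output loss and a bottleneck loss.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized flat vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def distillation_loss(
    teacher: Tuple[Vector, Vector],
    student: Tuple[Vector, Vector],
    w_out: float = 1.0,          # hypothetical weight on the output term
    w_bottleneck: float = 1.0,   # hypothetical weight on the bottleneck term
) -> float:
    """Weighted sum of output and bottleneck discrepancies.

    Each tuple is (bottleneck_features, output_pixels), flattened to 1-D.
    The teacher values are treated as constants; in training, only the
    student would receive gradient updates.
    """
    t_bott, t_out = teacher
    s_bott, s_out = student
    out_term = mean_abs_diff(t_out, s_out)
    bott_term = mean_abs_diff(t_bott, s_bott)
    return w_out * out_term + w_bottleneck * bott_term


# Toy values standing in for G_S and G_T activations on one source-domain image.
teacher_feats = ([0.5, 1.0], [0.0, 0.25, 0.5, 1.0])
student_feats = ([0.5, 0.5], [0.0, 0.25, 0.0, 1.0])
loss = distillation_loss(teacher_feats, student_feats, w_out=1.0, w_bottleneck=0.5)
print(loss)
```

The loss is zero when the student reproduces the teacher exactly and grows as the student drifts from it, which is the sense in which such a term can discourage forgetting of the source domain while other losses fit the new cartoon data.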

Abstract

Facial super-resolution/hallucination is an important area of research that seeks to enhance low-resolution facial images for a variety of applications. While Generative Adversarial Networks (GANs) have shown promise in this area, their ability to adapt to new, unseen data remains a challenge. This paper addresses this problem by proposing incremental super-resolution using GANs with knowledge distillation (ISR-KD) for the face-to-cartoon setting. Previous research in this area has not investigated incremental learning, which is critical for real-world applications where new data is continually being generated. The proposed ISR-KD aims to provide a unified framework for facial super-resolution that can handle different settings, including different types of faces, such as cartoon faces, and various levels of detail. To achieve this, a GAN-based super-resolution network is pre-trained on the CelebA dataset and then incrementally trained on the iCartoonFace dataset, using knowledge distillation to retain performance on the CelebA test set while improving performance on the iCartoonFace test set. Our experiments demonstrate the effectiveness of knowledge distillation in incrementally adding cartoon face super-resolution capability to the model while retaining the learned knowledge for facial hallucination tasks in GANs.

Paper Structure

This paper contains 25 sections, 10 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Proposed face to cartoon incremental super-resolution method using knowledge distillation. Conv, ReLU, and T_conv denote the convolution layer, ReLU layer, and transpose convolution layer, respectively. The pre-trained FSR generator is trained on the CelebA dataset. The incremental FSR generator is initialized with the weights of the pre-trained FSR generator and trained on combined CelebA and iCartoonFace images using the proposed method.
  • Figure 2: Edge block with an edge extraction layer. Here, $B$ is the batch size, $H$ is the height of the tensor, $W$ is the width of the tensor, $c$ is the number of channels in the tensor, $r$ is the scaling factor, and $s$ is the stride.
  • Figure 3: A schematic diagram of the discriminator architecture. Here, $s$ indicates the stride, and the list of numbers adjacent to $s$ gives the strides of the convolution layers grouped by the same number of output channels.
  • Figure 4: Generated samples showing the visual effect of incremental learning combined with knowledge distillation on the facial super-resolution task. The left half of the image contains the results for the CelebA dataset (source domain). The right half shows the results after incrementally training on the iCartoonFace dataset (target domain).
  • Figure 5: Additional visual results. Generated samples showing the visual effect of incremental learning combined with knowledge distillation on the facial super-resolution task. The left half of the image contains the results for the CelebA dataset (source domain). The right half shows the results after incrementally training on the iCartoonFace dataset (target domain).