SelfAge: Personalized Facial Age Transformation Using Self-reference Images

Taishi Ito, Yuki Endo, Yoshihiro Kanamori

TL;DR

SelfAge presents the first diffusion-model-based approach for personalized facial age transformation, preserving identity by leveraging 3–5 self-reference images of the same person at different ages. The method fine-tunes a pretrained latent diffusion model with LoRA, learns age dynamics from a refined regularization set labeled with integer ages, and employs Null-text Inversion plus Prompt-to-Prompt with carefully designed prompts that encode both an identity token and a precise age. Key contributions include integer-age supervision via a re-labeled CelebA-Dialog set, identity-preserving adaptation through a learned identity token, and targeted prompt design (including "$\alpha$-year-old" age representations and extreme-age token replacements). Experiments show competitive age-editing accuracy with strong identity preservation: the method outperforms several baselines in identity fidelity, and ablations support the contributions of regularization refinement, LoRA, and prompt design. Overall, SelfAge produces realistic, personalized age edits of existing images, with precise age progression and regression that respects an individual's life history and appearance.

Abstract

Age transformation of facial images is a technique that edits a person's age-related appearance while preserving identity. Existing deep learning-based methods can reproduce natural age transformations; however, they only reproduce averaged transitions and fail to account for individual-specific appearances influenced by a person's life history. In this paper, we propose the first diffusion model-based method for personalized age transformation. Our diffusion model takes a facial image and a target age as input and generates an age-edited face image as output. To reflect individual-specific features, we incorporate additional supervision using self-reference images, which are facial images of the same person at different ages. Specifically, we fine-tune a pretrained diffusion model for personalized adaptation using approximately 3 to 5 self-reference images. Additionally, we design an effective prompt to enhance the performance of age editing and identity preservation. Experiments demonstrate that our method achieves superior performance both quantitatively and qualitatively compared to existing methods. The code and the pretrained model are available at https://github.com/shiiiijp/SelfAge.

Paper Structure

This paper contains 29 sections, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Our method applies personalized age transformation to the input facial image (top left) using a few (3-5) self-reference images (left). The number on each image in the left column is the age estimated by an age estimator.
  • Figure 2: Overview of our method. In the training phase, we fine-tune a pretrained diffusion model \cite{rombach2022high} using a refined regularization set (see Section \ref{sec:int_age}) and self-reference images labeled with integer ages. We employ LoRA \cite{hu2021lora} to avoid overfitting on these images (see Section \ref{sec:lora}). In the inference phase, from input image $x$, we first obtain a latent representation $x_T$ using Null-text Inversion \cite{mokady2023null} and apply Prompt-to-Prompt \cite{hertz2022prompt} with original age $\alpha_\mathit{in}$ and target age $\alpha_\mathit{tar}$ to generate age-edited image $y$. We carefully design the text prompts $\mathcal{P}_\mathit{ref}$, $\mathcal{P}_\mathit{reg}$, $\mathcal{P}_\mathit{in}$, and $\mathcal{P}_\mathit{tar}$ for more accurate age transformation (see Section \ref{sec:prompt_design} and Table \ref{tab:prompt}).
  • Figure 3: Difference in cross-attention value replacement between our method and FADING \cite{chen2023face}. Our method represents age information as "$\alpha$-year-old" and replaces cross-attention values corresponding to the person-describing noun (e.g., "man" or "boy") as well as the tokens "$\alpha$", "-", "year", "-", and "old". In contrast, FADING represents age information as "$\alpha$ year old" and replaces cross-attention values only for the person-describing noun and the token "$\alpha$".
  • Figure 4: Qualitative comparison between our method and the existing methods \cite{alaluf2021only, gomez2022custom, chen2023face}. The upper right numbers on the input and self-reference images show the ages estimated by the age estimator.
  • Figure 5: Qualitative comparison of our method with and without LoRA \cite{hu2021lora}.
  • ...and 4 more figures
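
The token-selection difference described in the Figure 3 caption can be illustrated with a small sketch. This is not the authors' implementation: the prompt template "photo of a ...", the helper names, and the toy tokenizer (which splits words and hyphens, unlike the BPE tokenizer a real text encoder uses) are all assumptions made for illustration. The sketch only shows which token positions would be marked for cross-attention value replacement under each age representation.

```python
import re


def build_prompt(age: int, noun: str, hyphenated: bool = True) -> str:
    """Build a hypothetical prompt; the template itself is an assumption."""
    phrase = f"{age}-year-old" if hyphenated else f"{age} year old"
    return f"photo of a {phrase} {noun}"


def tokenize(prompt: str) -> list[str]:
    """Toy tokenizer: words and hyphens become separate tokens (assumption)."""
    return re.findall(r"\w+|-", prompt)


def replaced_indices(age: int, noun: str, hyphenated: bool = True) -> list[int]:
    """Return token positions whose cross-attention values would be replaced.

    hyphenated=True  : age token, both hyphens, "year", "old", and the noun.
    hyphenated=False : only the age token and the noun (FADING-style).
    """
    tokens = tokenize(build_prompt(age, noun, hyphenated))
    if hyphenated:
        targets = {str(age), "-", "year", "old", noun}
    else:
        targets = {str(age), noun}
    return [i for i, tok in enumerate(tokens) if tok in targets]


if __name__ == "__main__":
    for style in (True, False):
        prompt = build_prompt(30, "man", hyphenated=style)
        print(prompt, "->", tokenize(prompt), replaced_indices(30, "man", style))
```

With the hyphenated form, every token of the age phrase is selected together with the noun; with the space-separated form, "year" and "old" are left untouched.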