MeInTime: Bridging Age Gap in Identity-Preserving Face Restoration

Teer Song, Yue Zhang, Yu Tian, Ziyang Wang, Xianlin Zhang, Guixuan Zhang, Xuan Liu, Xueming Li, Yasen Zhang

Abstract

To better preserve an individual's identity, face restoration has evolved from reference-free to reference-based approaches, which leverage high-quality reference images of the same identity to enhance identity fidelity in the restored outputs. However, most existing methods implicitly assume that the reference and the degraded input are age-aligned, which limits their effectiveness in real-world scenarios where only cross-age references are available, such as historical photo restoration. This paper proposes MeInTime, a diffusion-based face restoration method that extends reference-based restoration from same-age to cross-age settings. Given one or a few reference images along with an age prompt corresponding to the degraded input, MeInTime achieves faithful restoration with both identity fidelity and age consistency. Specifically, we decouple the modeling of the identity and age conditions. During training, we focus solely on injecting identity features through a newly introduced attention mechanism, and we add Gated Residual Fusion modules to ease the integration of degraded features with identity representations. At inference, we propose Age-Aware Gradient Guidance, a training-free sampling strategy that uses an age-driven direction to iteratively nudge the identity-aware denoising latent toward the desired age semantic manifold. Extensive experiments demonstrate that MeInTime outperforms existing face restoration methods in both identity preservation and age consistency. Our code is available at: https://github.com/teer4/MeInTime
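To make the idea of gated fusion concrete, the following is a minimal sketch and not the authors' module: the abstract does not specify the internals of Gated Residual Fusion, so the elementwise form `out = h + sigmoid(g) * r`, the function name, and the argument names are all assumptions chosen for illustration.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a gate logit to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_residual_fusion(
    decoder_feat: List[float],
    identity_feat: List[float],
    gate_logits: List[float],
) -> List[float]:
    """Hypothetical gated residual update (assumed form, not from the paper).

    Each output element is the decoder feature plus the identity feature
    scaled by a gate in (0, 1): out = h + sigmoid(g) * r.
    """
    if not (len(decoder_feat) == len(identity_feat) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    return [
        h + sigmoid(g) * r
        for h, r, g in zip(decoder_feat, identity_feat, gate_logits)
    ]


# A gate logit of 0 passes half of the identity feature through.
half_open = gated_residual_fusion([1.0, 2.0], [4.0, 4.0], [0.0, 0.0])

# A strongly negative gate logit leaves the decoder feature unchanged.
closed = gated_residual_fusion([1.0, 2.0], [4.0, 4.0], [-50.0, -50.0])
```

In a sketch like this, the gate lets the network decide per element how much identity information to mix into the degraded-input features, while the residual path keeps the original features intact when the gate is closed.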

Paper Structure (25 sections, 17 equations, 22 figures, 7 tables, 1 algorithm)

Figures (22)

  • Figure 1: Overview of MeInTime. (a) During training, identity features are extracted from the reference images by InsightFace ArcFace, projected into face embeddings, and injected into the UNet via decoupled cross-attention with a unified prompt "photo of a person". Gated Residual Fusion (GRF) modules are added to each decoder block to facilitate feature fusion. The degraded input is processed following DiffBIR, which combines a restoration model (SwinIR) with its IRControlNet for structural guidance. We adopt the standard diffusion loss ($\mathcal{L}_{\text{Diff}}$) as the training objective. (b) During inference, given the target age, the framework generates an age prompt (e.g., "photo of a 24-year-old person") and performs two forward passes (with and without the age condition) to compute an age-aware gradient that iteratively refines the latent along the denoising process, enabling identity-preserving, age-controllable restoration.
  • Figure 2: Structure of the Gated Residual Fusion module.
  • Figure 2: Ablation study on varying age gaps.
  • Figure 3: Visual comparison of identity-preserving restoration under different age control strategies.
  • Figure 4: Visual comparison of different optimization steps.
  • ...and 17 more figures
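
The inference procedure described in the Figure 1 caption (two forward passes, with and without the age condition, followed by a latent update) can be sketched as below. This is a toy illustration under stated assumptions, not the released implementation: the denoiser signature, the update rule `z <- z - step_size * (eps_age - eps_base)`, the parameter names `step_size` and `num_steps`, and the `toy_denoiser` stand-in are all hypothetical.

```python
from typing import Callable, List

Latent = List[float]
# Hypothetical denoiser signature: (latent, timestep, prompt) -> predicted noise.
Denoiser = Callable[[Latent, int, str], Latent]


def age_direction(
    denoiser: Denoiser, latent: Latent, t: int, base_prompt: str, age_prompt: str
) -> Latent:
    """Two forward passes: noise predicted with the age prompt minus without it."""
    eps_base = denoiser(latent, t, base_prompt)
    eps_age = denoiser(latent, t, age_prompt)
    return [a - b for a, b in zip(eps_age, eps_base)]


def refine_latent(
    denoiser: Denoiser,
    latent: Latent,
    t: int,
    base_prompt: str,
    age_prompt: str,
    step_size: float = 0.1,
    num_steps: int = 1,
) -> Latent:
    """Nudge the latent against the noise difference (assumed update rule)."""
    for _ in range(num_steps):
        direction = age_direction(denoiser, latent, t, base_prompt, age_prompt)
        latent = [z - step_size * d for z, d in zip(latent, direction)]
    return latent


def toy_denoiser(latent: Latent, t: int, prompt: str) -> Latent:
    """Stand-in for a UNet: predicts noise as the offset from a prompt-dependent target."""
    target = 1.0 if "year-old" in prompt else 0.0
    return [z - target for z in latent]


refined = refine_latent(
    toy_denoiser,
    latent=[0.0, 0.0],
    t=10,
    base_prompt="photo of a person",
    age_prompt="photo of a 24-year-old person",
    step_size=0.1,
    num_steps=3,
)
```

With the toy denoiser, each refinement step moves the latent toward the age-conditioned target by `step_size`. In a real sampler this update would sit inside the denoising loop and be applied at each timestep.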