Deep Manifold Traversal: Changing Labels with Convolutional Features

Jacob R. Gardner, Paul Upchurch, Matt J. Kusner, Yixuan Li, Kilian Q. Weinberger, Kavita Bala, John E. Hopcroft

TL;DR

This paper tackles general label changing in images by introducing Deep Manifold Traversal (DMT), a data-driven method that traverses the natural image manifold in a deep CNN feature space, guided by Maximum Mean Discrepancy (MMD). The approach maps images to high-level features, performs a budgeted linear traversal toward a target class via an MMD-based objective, and reconstructs the output image from the modified features. It demonstrates semantic edits (aging, hair color changes, and outdoor scene transformations) at high resolution and compares favorably to baselines and single-target morphing methods. The work presents a scalable, task-agnostic framework with potential use as a data augmentation and pre-processing tool for vision systems.
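
To make the MMD guidance concrete, here is a minimal sketch of the standard biased empirical estimate of squared MMD between two sets of feature vectors. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian RBF kernel, the bandwidth `sigma`, the helper names `rbf_kernel` and `mmd_squared`, and the synthetic 16-dimensional "features" are all hypothetical choices.

```python
import numpy as np


def rbf_kernel(a, b, sigma=4.0):
    """Gaussian RBF kernel matrix between rows of a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd_squared(source, target, sigma=4.0):
    """Biased empirical estimate of squared MMD between two sample sets."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st


# Synthetic stand-ins for deep features of source- and target-class images.
rng = np.random.default_rng(0)
source_feats = rng.normal(0.0, 1.0, size=(200, 16))
target_feats = rng.normal(1.0, 1.0, size=(200, 16))

same = mmd_squared(source_feats, source_feats)  # identical sets
diff = mmd_squared(source_feats, target_feats)  # shifted distribution
```

A set compared with itself gives an estimate of zero, while sets drawn from different distributions give a larger value. A traversal objective can use this kind of quantity to measure how far a modified feature vector has moved from the source set toward the target set.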

Abstract

Many tasks in computer vision can be cast as a "label changing" problem, where the goal is to make a semantic change to the appearance of an image or some subject in an image in order to alter the class membership. Although successful task-specific methods have been developed for some label changing applications, to date no general-purpose method exists. Motivated by this, we propose deep manifold traversal, a method that addresses the problem in its most general form: it first approximates the manifold of natural images, then morphs a test image along a traversal path away from a source class and towards a target class while staying near the manifold throughout. The resulting algorithm is surprisingly effective and versatile. It is completely data driven, requiring only an example set of images from the desired source and target domains. We demonstrate deep manifold traversal on highly diverse label changing tasks: changing an individual's appearance (age and hair color), changing the season of an outdoor image, and transforming a city skyline towards nighttime.

Paper Structure

This paper contains 19 sections, 11 equations, 8 figures.

Figures (8)

  • Figure 1: Top: Input image $\bar{\mathbf{x}}^{s}$ is transformed by a ConvNet to deep features (orange). Middle: The manifold is traversed (black arrow) from source, $\bar{\mathbf{z}}^{s}$, to target, $\bar{\mathbf{z}}^{t}$, in feature space. Bottom: $\bar{\mathbf{z}}^{t}$ is inverted to recover $\bar{\mathbf{x}}^{t}$, subject to a total variation regularizer $R_{V^{\beta}}$.
  • Figure 2: (Zoom in for details.) Face aging via manifold traversal on random (except Aaron Eckhart) 250x250 test images from LFW. All aging results shown were run with the same value of $\lambda$.
  • Figure 3: (Zoom in for details.) Several methods used to change the age of an input image of Harrison Ford.
  • Figure 4: (Zoom in for details.) Left: "Aging" images generated using the method of Szegedy et al. (2013). Right: "Aging" images generated by deep manifold traversal. The image progression towards the right was generated by gradually decreasing the value of $\lambda$. Numbers below each image show the Platt-scaled probabilities from an SVM trained on VGG features to distinguish old age; lower values indicate a more "senior" appearance.
  • Figure 5: (Zoom in for details.) Changing hair color of random (except Aaron Eckhart) 250x250 images from LFW with manifold traversal. Top: Manifold traversal to lighter hair. Middle: Original image. Bottom: Manifold traversal to darker hair. All traversals were performed with the same value of $\lambda$.
  • ...and 3 more figures
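
The Figure 1 caption mentions a total variation regularizer $R_{V^{\beta}}$ applied during inversion. As a rough illustration, the sketch below computes a common finite-difference form of such a regularizer on a single-channel image. It assumes the form $\sum_{i,j} \big((x_{i,j+1}-x_{i,j})^2 + (x_{i+1,j}-x_{i,j})^2\big)^{\beta/2}$; the function name `total_variation`, the default `beta`, and the test images are hypothetical, and the paper's exact definition may differ.

```python
import numpy as np


def total_variation(image, beta=2.0):
    """Finite-difference total variation of a 2-D (grayscale) image.

    Assumed form: sum over pixels of (dx^2 + dy^2)^(beta / 2), where dx and
    dy are forward differences along columns and rows respectively.
    """
    dx = image[:-1, 1:] - image[:-1, :-1]  # horizontal forward difference
    dy = image[1:, :-1] - image[:-1, :-1]  # vertical forward difference
    return float(np.sum((dx**2 + dy**2) ** (beta / 2.0)))


# A constant image has no variation; added noise increases the penalty.
flat = np.full((8, 8), 0.5)
noisy = flat + np.random.default_rng(0).normal(0.0, 0.1, size=(8, 8))

# A horizontal ramp with unit steps: each of the 3x3 interior differences
# contributes exactly 1 when beta = 2.
ramp = np.tile(np.arange(4.0), (4, 1))

tv_flat = total_variation(flat)
tv_noisy = total_variation(noisy)
tv_ramp = total_variation(ramp, beta=2.0)
```

Penalizing this quantity while inverting features favors piecewise-smooth reconstructions, which is the usual motivation for adding a total variation term to an image reconstruction objective.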