Retargeting Visual Data with Deformation Fields

Tim Elsner, Julia Berger, Tong Wu, Victor Czech, Lin Gao, Leif Kobbelt

TL;DR

The paper reframes content-aware retargeting as a global, continuous deformation problem learned by neural fields, extending seam carving beyond images to 3D data such as Neural Radiance Fields (NeRFs) and polygon meshes. A scalar deformation field D(p) along a fixed axis v induces a coordinate shift p' = p + D(p)·v, with I(p) = I'(p') providing the retargeted content. The method combines energy-based content awareness with sanity constraints via the losses L_e, L_s, L_b, and L_m, plus domain-specific energy and cumulative-energy networks E and Σ, so that the deformation concentrates in low-information regions while high-information detail is preserved; an inverse deformation U facilitates NeRF surface manipulation. The paper reports better content-aware retargeting than prior seam-carving methods, supported by quantitative metrics (e.g., FID) and user studies across images and 3D scenes, and the framework also supports editing operations such as object removal and movement. Overall, the approach offers a domain-agnostic backbone for seam-carving-like retargeting that generalizes to 3D representations with controllable distortion and plausible outputs, albeit at a higher computational cost than traditional seam carving.
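To make the coordinate shift p' = p + D(p)·v concrete, here is a minimal sketch in plain Python. It is an illustration, not the paper's implementation. It assumes that v is the horizontal axis (1, 0), that the output pixel at p reads the source image at p' (one possible reading of I(p) = I'(p')), and that D is a hand-written function rather than a trained network. The names `sample_linear`, `retarget`, and `seam_like` are hypothetical.

```python
from typing import Callable, List

Image = List[List[float]]  # rows of grey values, indexed as image[y][x]


def sample_linear(row: List[float], x: float) -> float:
    """Linearly interpolate `row` at continuous position x, clamped to the row."""
    x = min(max(x, 0.0), len(row) - 1.0)
    i = int(x)
    j = min(i + 1, len(row) - 1)
    t = x - i
    return (1.0 - t) * row[i] + t * row[j]


def retarget(image: Image, out_width: int,
             D: Callable[[float, float], float]) -> Image:
    """Assumed convention: output pixel p = (x, y) reads the source at
    p' = p + D(p) * v with v = (1, 0), so only the x coordinate is shifted."""
    return [
        [sample_linear(row, x + D(float(x), float(y))) for x in range(out_width)]
        for y, row in enumerate(image)
    ]


image = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0],
]


def seam_like(x: float, y: float) -> float:
    """Hypothetical field: leave x < 4 untouched, then jump over two source columns."""
    return 0.0 if x < 4 else 2.0


narrow = retarget(image, 6, seam_like)
print(narrow[0])  # [0.0, 1.0, 2.0, 3.0, 6.0, 7.0]

# Uniform scaling is the special case D(x) = x * s - x with s = 7 / 5 here.
uniform = retarget(image, 6, lambda x, y: x * (7 / 5) - x)
```

The step-shaped field removes source columns 4 and 5 in one place, which mimics deleting two vertical seams, while the linear field spreads the same reduction evenly over the row. A learned, smooth D can sit anywhere between these two extremes.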

Abstract

Seam carving is an image editing method that enables content-aware resizing, including operations like removing objects. However, the seam-finding strategy based on dynamic programming or graph-cut limits its applications to broader visual data formats and degrees of freedom for editing. Our observation is that describing the editing and retargeting of images more generally by a displacement field yields a generalisation of content-aware deformations. We propose to learn a deformation with a neural network that keeps the output plausible while trying to deform it only in places with low information content. This technique applies to different kinds of visual data, including images, 3D scenes given as neural radiance fields, or even polygon meshes. Experiments conducted on different visual data show that our method achieves better content-aware retargeting compared to previous methods.

Paper Structure (33 sections, 19 equations, 18 figures, 10 tables)

Figures (18)

  • Figure 1: Different objectives applied to different visual domains with our approach: We demonstrate retargeting images, NeRFs, and meshes. 'Surfer' image from retargetme.
  • Figure 2: The family of image resizing methods. Our approach formulates the problem in a more general way, but can utilise e.g. new energy formulations as well.
  • Figure 3: Our proposed pipeline for retargeting visual data: For a given input (left), we train two simple networks that learn the energy and cumulative energy along the deformation axis of the input (centre-left). We initialise a network that stretches samples to the desired position (centre-right), then optimise this deformation to distribute the distortion to low information content regions (right). Balloon image from retargetme.
  • Figure 4: Our pipeline applied to neural radiance fields: We first obtain a point cloud, then estimate the energy values. We then learn a continuous energy function and a cumulative energy function that we use to optimise our deformation network. The resulting deformation field is visualised on the bottom right.
  • Figure 4: User study measuring preference, pairwise comparing to our approach using salience maps of srinivas2019full.
  • ...and 13 more figures
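
The captions of Figures 3 and 4 mention an energy and a cumulative energy along the deformation axis. The sketch below shows a discrete analogue of that pair for a single image row. It is only an illustration under stated assumptions: the absolute forward difference is a stand-in energy, not the energy used in the paper, and the paper learns continuous networks E and Σ rather than building arrays. The names `row_energy` and `cumulative` are hypothetical.

```python
from itertools import accumulate
from typing import List


def row_energy(row: List[float]) -> List[float]:
    """Stand-in energy: absolute forward difference between horizontal neighbours.
    The last entry is padded with 0 so the result has the same length as `row`."""
    return [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)] + [0.0]


def cumulative(energy: List[float]) -> List[float]:
    """Running sum with a leading 0, so cum[b] - cum[a] is the energy in [a, b)."""
    return [0.0] + list(accumulate(energy))


row = [1.0, 1.0, 1.0, 5.0, 2.0, 2.0]
energy = row_energy(row)
cum = cumulative(energy)

print(energy)            # [0.0, 0.0, 4.0, 3.0, 0.0, 0.0]
print(cum[2] - cum[0])   # 0.0, a flat region that is cheap to deform
print(cum[4] - cum[2])   # 7.0, an edge region that should be preserved
```

The useful property of the running sum is that the total energy inside any interval along the axis becomes a single subtraction, which is the reason a cumulative quantity is convenient when the interval endpoints move during optimisation.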