MiraGe: Editable 2D Images using Gaussian Splatting

Joanna Waczyńska, Tomasz Szczepanik, Piotr Borycki, Sławomir Tadeja, Thomas Bohné, Przemysław Spurek

TL;DR

This work tackles 2D image editing by representing images with parameterized flat 3D Gaussians (a triangle soup) rather than neural networks. It introduces MiraGe, which places Gaussians on the $XZ$ plane, supports three manipulation modes (Amorphous, 2D, Graphite), and uses a mirror-camera setup to constrain spatial regions and produce reliable 3D-like edits with a physics engine. Compared to GaussianImage and other INR baselines, MiraGe achieves state-of-the-art reconstruction quality on Kodak and DIV2K while enabling intuitive 3D-aware edits and animations. Limitations include non-generative behavior, potential artifacts when edits are applied incorrectly, and higher parameter counts with longer training times, which motivate future work such as inpainting and efficiency improvements. The approach has practical impact for 2.5D editing, animation, and AR/VR workflows by providing precise, controllable image modifications that blend 2D and 3D representations.

Abstract

Implicit Neural Representations (INRs) approximate discrete data through continuous functions and are commonly used for encoding 2D images. Traditional image-based INRs employ neural networks to map pixel coordinates to RGB values, capturing shapes, colors, and textures within the network's weights. Recently, GaussianImage has been proposed as an alternative that uses Gaussian functions instead of neural networks. It achieves quality and compression ratios similar to classical INR models but does not allow image modification. In contrast, our work introduces a novel method, MiraGe, which uses mirror reflections to perceive 2D images in 3D space and employs flat-controlled Gaussians for precise 2D image editing. Our approach improves the rendering quality and allows realistic image modifications, including human-inspired perception of photos in the 3D world. Thanks to modeling images in 3D space, we obtain the illusion of 3D-based modification in 2D images. We also show that our Gaussian representation can be easily combined with a physics engine to produce physics-based modification of 2D images. Consequently, MiraGe allows for better quality than the standard approach and natural modification of 2D images.

Paper Structure

This paper contains 11 sections, 9 equations, 21 figures, and 4 tables.

Figures (21)

  • Figure 1: MiraGe encodes 2D images with parameterized Gaussians, enabling high-quality reconstruction and real-life-like modifications. The selected parts of the image can be transformed in 3D space, creating a 3D effect with a physics engine controlling movement and interactions.
  • Figure 2: MiraGe employs 3D flat parameterized Gaussians in 3D space to encode 2D images, representing each flat Gaussian as three points, forming a cloud of triangles called a triangle soup. This representation enables real-time manipulation of the 3D triangle/point clouds, allowing for flexible, real-world modifications. The model seamlessly integrates with a physics engine, enhancing its applicability in dynamic environments.
  • Figure 3: Parameterized flat 3D Gaussians provide a powerful representation of 2D images, enabling flexible editing in 3D space. The triangle soup can be animated using tools such as Blender. The colored lines depict the motion paths of 10 randomly selected points during the simulation.
  • Figure 4: Two images were encoded using the MiraGe model on distinct planes within a 3D space. This setup allows for seamless integration of the encoded images, resulting in a collage-like composition. Moreover, the model facilitates editing capabilities, as illustrated here, with modifications to the background image (the rear plane).
  • Figure 5: Visual comparison of two Gaussian-based methods for 2D image reconstruction. From left to right, the columns display the ground truth image, the GaussianImage reconstruction, and the MiraGe reconstruction. The bottom row illustrates the differences between the ground truth image and the results of each method.
  • ...and 16 more figures
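
The Figure 2 caption describes each flat Gaussian as three points, so that the whole image becomes a triangle soup that can be edited as ordinary geometry. The sketch below illustrates one way such an encoding could work; it is not taken from the paper. It assumes a flat Gaussian is given by a mean, two orthonormal in-plane axes, and two scales, and that the three points are the mean plus the two scaled axes. The function names `gaussian_to_triangle` and `triangle_to_gaussian` are hypothetical, and the actual MiraGe parameterization may differ.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gaussian_to_triangle(mean, axis1, axis2, scale1, scale2):
    """Encode a flat Gaussian as three 3D points (assumed convention)."""
    v1 = tuple(mean)                                            # center
    v2 = tuple(m + scale1 * a for m, a in zip(mean, axis1))     # first axis
    v3 = tuple(m + scale2 * a for m, a in zip(mean, axis2))     # second axis
    return v1, v2, v3

def triangle_to_gaussian(v1, v2, v3):
    """Recover mean, two orthonormal in-plane axes, and two scales."""
    e1 = _sub(v2, v1)
    scale1 = math.sqrt(_dot(e1, e1))
    if scale1 == 0.0:
        raise ValueError("degenerate triangle: v1 and v2 coincide")
    axis1 = tuple(x / scale1 for x in e1)

    # Gram-Schmidt step: keep only the part of the second edge that is
    # orthogonal to axis1, so the axes stay orthonormal after an edit.
    e2 = _sub(v3, v1)
    along = _dot(e2, axis1)
    e2_perp = tuple(x - along * a for x, a in zip(e2, axis1))
    scale2 = math.sqrt(_dot(e2_perp, e2_perp))
    if scale2 == 0.0:
        raise ValueError("degenerate triangle: points are collinear")
    axis2 = tuple(x / scale2 for x in e2_perp)
    return tuple(v1), axis1, axis2, scale1, scale2

# A flat Gaussian lying in the XZ plane (y = 0).
tri = gaussian_to_triangle(
    (0.5, 0.0, 0.25), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.2, 0.1
)

# An "edit": move the triangle out of the plane by translating its vertices,
# as a mesh editor or physics engine might do.
offset = (0.0, 0.3, 0.0)
moved = tuple(tuple(c + d for c, d in zip(v, offset)) for v in tri)

# The edited triangle maps back to a Gaussian with a shifted mean
# and unchanged axes and scales.
mean, axis1, axis2, scale1, scale2 = triangle_to_gaussian(*moved)
```

Under this convention an edit never touches Gaussian parameters directly: any tool that moves vertices can deform the image, and the Gaussians are recomputed from the moved points afterwards.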