Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part I: Homogeneous Diffusion Inpainting

Niklas Kämper; Vassillen Chizhov; Joachim Weickert

TL;DR

This paper tackles the computational bottleneck of 4K inpainting-based image compression by developing a fast solver for homogeneous diffusion inpainting. It introduces a full multigrid solver that embeds an optimized restricted additive Schwarz domain decomposition to exploit GPU parallelism, together with a novel downsampling strategy for the Dirichlet boundary mask. The method runs in real time at 4K resolution, delivering more than 60 frames per second even for sparse data, and is reported to be substantially faster than CG-based solvers across mask densities and resolutions at comparable or better reconstruction quality. The approach is directly relevant for decoding in inpainting-based codecs, and the authors point to more advanced inpainting operators and encoding optimizations as future work.

Abstract

In recent years, inpainting-based compression methods have been shown to be a viable alternative to classical codecs such as JPEG and JPEG2000. Unlike transform-based codecs, which store coefficients in the transform domain, inpainting-based approaches store a small subset of the original image pixels and reconstruct the image from those by using a suitable inpainting operator. A good candidate for such an inpainting operator is homogeneous diffusion inpainting, as it is simple, theoretically well-motivated, and can achieve good reconstruction quality for optimized data. However, a major challenge has been to design fast solvers for homogeneous diffusion inpainting that scale to 4K image resolution ($3840 \times 2160$ pixels) and are real-time capable. We overcome this with a careful adaptation and fusion of two of the most efficient concepts from numerical analysis: multigrid and domain decomposition. Our domain decomposition algorithm efficiently utilizes GPU parallelism by solving inpainting problems on small overlapping blocks. Unlike simple block decomposition strategies such as the ones in JPEG, our approach yields reconstructions that are free of block artifacts. Furthermore, embedding domain decomposition in a full multigrid scheme provides global interactions and allows us to achieve optimal convergence by reducing both low- and high-frequency errors at the same rate. We are able to achieve 4K color image reconstruction at more than $60$ frames per second even from very sparse data, something that has previously been infeasible.
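To make the inpainting operator concrete, the following is a minimal sketch of homogeneous diffusion inpainting on a single-channel image. It is not the solver proposed in the paper: it uses a plain Jacobi iteration on the 5-point discrete Laplace equation, which converges far too slowly for 4K images and only serves as an illustration. The function name `homogeneous_diffusion_inpaint`, the iteration count, and the initialization with the mean of the known pixels are assumptions made for this example.

```python
import numpy as np

def homogeneous_diffusion_inpaint(f, mask, iters=2000):
    """Illustrative Jacobi solver for homogeneous diffusion inpainting.

    f    : 2-D array with the stored pixel values (only read where mask is True)
    mask : 2-D boolean array, True at known (Dirichlet) pixels; must contain
           at least one True entry
    """
    f = np.asarray(f, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    # Start from the known data, filling unknown pixels with the mean known value.
    u = np.where(mask, f, f[mask].mean())
    for _ in range(iters):
        # Edge padding mirrors the border pixel, which realizes reflecting
        # (homogeneous Neumann) conditions at the image boundary.
        p = np.pad(u, 1, mode="edge")
        avg = 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])
        # Known pixels stay fixed; unknown pixels take the neighbour average.
        u = np.where(mask, f, avg)
    return u

# Usage: reconstruct a horizontal ramp from its left and right columns only.
ramp = np.tile(np.arange(8, dtype=np.float64), (8, 1))
known = np.zeros_like(ramp, dtype=bool)
known[:, 0] = True
known[:, -1] = True
reconstruction = homogeneous_diffusion_inpaint(ramp, known)
print("max abs error:", np.abs(reconstruction - ramp).max())
```

The ramp is harmonic, so the steady state of the iteration reproduces it from the two known columns. For realistic images and sparse masks, the slow convergence of such simple relaxation schemes is exactly the bottleneck that motivates the multigrid and domain decomposition approach described in the abstract.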

Paper Structure (31 sections, 16 equations, 10 figures, 3 algorithms)

Introduction
Our Contribution
Related Work
Domain Decomposition in Image Processing
Multigrid Methods
Green's Functions
Finite Elements
Video Coding
Paper Structure
Homogeneous Diffusion Inpainting
Continuous Formulation
Discrete Formulation
Practical Considerations
Optimized Restricted Additive Schwarz
Discrete Optimized Restricted Additive Schwarz
...and 16 more sections

Figures (10)

Figure 1: Domain Decomposition Example. The domain is divided into four overlapping subdomains. The subdomain $\Omega_1$ is highlighted in blue.
Figure 1: Full Multigrid Scheme. Example with four resolution layers and a single V-cycle for each level. The doubled lines represent the FMG prolongations to initialize the V-cycle for the next finer level.
Figure 1: Sparse Inpainting with 5% Known Data. Images 1 to 6 of our test dataset of size $3840 \times 2160$ with a 5% optimized inpainting mask and the corresponding inpainting. Photos by J. Weickert.
Figure 2: Visualization of the Overlap Weights. The weights $\bm{D}_1$ and $\bm{D}_2$ in the overlap of two blocks $\Omega_1$ and $\Omega_2$ are chosen, such that they always add up to 1 at each pixel.
Figure 2: Reduced Full Multigrid Scheme. The initial guess is constructed in a coarse-to-fine manner, also known as one-way or cascadic multigrid. Then we continue with additional V-cycle correction steps (a single V-cycle is visualized above).
...and 5 more figures
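
The caption on overlap weights states only that the weights of overlapping blocks add up to 1 at each pixel. The following one-dimensional sketch shows one way to construct such weights. The hat-shaped profile, the function name `partition_of_unity`, and the block layout are assumptions for illustration; the weights actually used in the paper are not given in this excerpt.

```python
import numpy as np

def partition_of_unity(n, blocks):
    """Per-block weights on n pixels that sum to 1 at every pixel.

    blocks : list of half-open index ranges (start, stop); together they
             must cover every pixel 0..n-1, otherwise the normalization
             divides by zero.
    """
    raw = np.zeros((len(blocks), n))
    for k, (a, b) in enumerate(blocks):
        x = np.arange(a, b)
        # Hat profile (assumed): positive inside the block, decaying
        # linearly towards both block ends, zero outside the block.
        raw[k, a:b] = np.minimum(x - a + 1, b - x)
    # Normalize so that the weights of all blocks add up to 1 per pixel.
    return raw / raw.sum(axis=0)

# Usage: two blocks on 10 pixels that overlap at pixels 4 and 5.
weights = partition_of_unity(10, [(0, 6), (4, 10)])
print(weights.sum(axis=0))
```

Pixels covered by a single block receive weight 1 from that block, while pixels in the overlap blend the two block solutions smoothly, which is the property the caption describes.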
