Color Mismatches in Stereoscopic Video: Real-World Dataset and Deep Correction Method

Egor Chistov, Nikita Alutis, Dmitriy Vatolin

TL;DR

This paper addresses color mismatches between stereoscopic views, which can cause viewer discomfort, by introducing a real-world beam-splitter dataset and a deep multiscale network that leverages stereo correspondences for color transfer. The method uses an optical-flow-based correspondence mechanism, Efficient feature extraction, and a four-layer U-Net to fuse matched and neighboring information, with training under both deterministic and probabilistic distortion models. Experiments show strong performance on artificial distortions but reveal a domain shift when evaluated on real-world data, where simple global color transfers can outperform more complex, non-global methods. The work emphasizes the need for more realistic color-distortion models to improve generalization to real-world stereoscopic color mismatches and provides a publicly available dataset to drive future progress.

Abstract

Stereoscopic videos can contain color mismatches between the left and right views due to minor variations in camera settings, lenses, and even object reflections captured from different positions. The presence of color mismatches can lead to viewer discomfort and headaches. This problem can be solved by transferring color between stereoscopic views, but traditional methods often lack quality, while neural-network-based methods can easily overfit on artificial data. The scarcity of stereoscopic videos with real-world color mismatches hinders the evaluation of different methods' performance. Therefore, we filmed a video dataset, which includes both distorted frames with color mismatches and ground-truth data, using a beam-splitter. Our second contribution is a deep multiscale neural network that solves the color-mismatch-correction task by leveraging stereo correspondences. The experimental results demonstrate the effectiveness of the proposed method on a conventional dataset, but there remains room for improvement on challenging real-world data.
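
To make the idea of a simple global color transfer concrete, here is a minimal sketch that matches the per-channel mean and standard deviation of a target view to those of a reference view. It is an illustration only, not the method proposed in the paper or any of the baselines it evaluates; the function name `global_color_transfer`, the RGB working space, and the list-of-tuples image representation are all assumptions made for brevity.

```python
from statistics import fmean, pstdev


def global_color_transfer(target, reference, eps=1e-8):
    """Match per-channel mean and std of `target` to `reference`.

    Both arguments are lists of (r, g, b) tuples with values in [0, 1].
    Hypothetical helper for illustration; works directly in RGB.
    """
    corrected_channels = []
    for c in range(3):
        t = [px[c] for px in target]
        r = [px[c] for px in reference]
        t_mean, r_mean = fmean(t), fmean(r)
        # Population standard deviation of each channel.
        scale = pstdev(r) / (pstdev(t) + eps)
        # Re-center, re-scale, and clip back into the valid range.
        corrected_channels.append(
            [min(1.0, max(0.0, (v - t_mean) * scale + r_mean)) for v in t]
        )
    # Regroup the three channel lists into (r, g, b) tuples.
    return list(zip(*corrected_channels))


# Usage: the target is the reference with a global gain and offset applied.
reference = [(0.2, 0.4, 0.6), (0.4, 0.6, 0.8)]
target = [(0.2, 0.3, 0.4), (0.3, 0.4, 0.5)]
corrected = global_color_transfer(target, reference)
```

Because the same correction is applied to every pixel, a transfer of this kind cannot repair spatially varying mismatches, but it also has few ways to overfit to artificial distortions.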

Paper Structure

This paper contains 6 sections, 8 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Frame #1,200 from the video “VR180 Cameras with Daydream,” taken by Google (https://www.youtube.com/watch?v=TH_MMXinRsA), which contains color mismatches.
  • Figure 2: Our two setups (top row) and the dataset filming and postprocessing pipeline (bottom row). A beam splitter divides incoming light, which is then captured by the left camera and the left ground-truth camera. The right camera captures the right ground-truth view. The final dataset frames have undergone spatial and temporal alignment.
  • Figure 3: Our filming setup (top left corner) and all 14 scenes from our real-world dataset. The dataset contains various objects that produce color mismatches. It is publicly available on the project page.
  • Figure 4: Results of color transfer from the reference image to the target image on a stereopair from InStereo2K (Bao et al., 2020). The hue of the target image was adjusted using the maximum magnitude ($+0.5$). The neural-network-based methods (Croci et al., 2021, and ours), which were trained on such distortions, successfully transferred the colors.
  • Figure 5: Visualization of our method's failures on stereopairs from our real-world dataset. The top row shows a scene that is relatively simple for an optical flow algorithm, and the bottom row shows a scene that is difficult for it.
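
The hue distortion mentioned in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' distortion code: it assumes hue is normalized to [0, 1) and wraps around, so that a shift of $+0.5$ is a half-turn of the hue circle, and the helper name `shift_hue` is hypothetical.

```python
import colorsys


def shift_hue(pixels, delta):
    """Shift the hue of each (r, g, b) pixel by `delta`.

    Assumes channel values in [0, 1] and hue normalized to [0, 1),
    so delta = 0.5 rotates hue by half of the color circle.
    """
    shifted = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # Wrap around so the shifted hue stays inside [0, 1).
        shifted.append(colorsys.hsv_to_rgb((h + delta) % 1.0, s, v))
    return shifted


# Usage: pure red moves to its complementary color, while gray is unchanged.
red_shifted = shift_hue([(1.0, 0.0, 0.0)], 0.5)[0]
gray_shifted = shift_hue([(0.5, 0.5, 0.5)], 0.5)[0]
```

Under this convention, achromatic pixels are unaffected because their saturation is zero, which is one reason such a distortion is easy for a trained network to invert.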