Parallax-tolerant Image Stitching via Segmentation-guided Multi-homography Warping
Tianli Liao, Ce Wang, Lei Li, Guangen Liu, Nan Li
TL;DR
This work addresses large parallax in image stitching with a segmentation-guided multi-homography warping framework. It uses the Segment Anything Model (SAM) to segment the target image into semantic contents and groups the feature matches into multiple homography models via an energy-based fitting process. Each content in the overlapping region is labeled with its best-fitting homography, while contents in the non-overlapping region are extrapolated with a weighted combination of linearized homographies. Warping is performed in a forward-backward manner with an error buffer to handle occlusions, followed by linear blending to produce the final panorama. Quantitative and qualitative results on public parallax datasets show substantial improvements in alignment quality over state-of-the-art methods. Overall, the approach advances parallax-robust stitching by combining semantic segmentation with content-aware multi-homography modeling and occlusion-aware rendering.
Abstract
Large parallax between images is an intractable issue in image stitching. Various warping-based methods have been proposed to address it, yet their results remain unsatisfactory. In this paper, we propose a novel image stitching method using multi-homography warping guided by image segmentation. Specifically, we leverage the Segment Anything Model to segment the target image into numerous contents and partition the feature points into multiple subsets via an energy-based multi-homography fitting algorithm. Each subset of feature points is then used to calculate a corresponding homography. For each segmented content in the overlapping region, we select the best-fitting homography, i.e., the one with the lowest photometric error. For each segmented content in the non-overlapping region, we calculate a weighted combination of the linearized homographies. Finally, the target image is warped via the best-fitting homographies to align with the reference image, and the final panorama is generated via linear blending. Comprehensive experimental results on public datasets demonstrate that our method provides the best alignment accuracy by a large margin compared with state-of-the-art methods. The source code is available at https://github.com/tlliao/multi-homo-warp.
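
The per-content selection step in the overlapping region can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names (`warp_points`, `photometric_error`, `select_homographies`) are hypothetical, and the use of grayscale images, nearest-neighbour sampling, and a mean absolute intensity difference as the photometric error are all simplifying assumptions. The segment label map is assumed to come from a segmentation model such as SAM, and the candidate homographies from a multi-homography fitting step.

```python
import numpy as np


def warp_points(H, xy):
    """Apply a 3x3 homography H to an (N, 2) array of (x, y) points."""
    pts = np.hstack([xy, np.ones((xy.shape[0], 1))]) @ H.T
    return pts[:, :2] / pts[:, 2:3]


def photometric_error(target, reference, mask, H):
    """Mean absolute intensity difference (an assumed error measure) over the
    masked target pixels that land inside the reference after warping by H."""
    ys, xs = np.nonzero(mask)
    xy = np.stack([xs, ys], axis=1).astype(float)
    warped = np.rint(warp_points(H, xy)).astype(int)  # nearest-neighbour sampling
    wx, wy = warped[:, 0], warped[:, 1]
    h, w = reference.shape
    inside = (wx >= 0) & (wx < w) & (wy >= 0) & (wy < h)
    if not inside.any():
        return np.inf  # this segment does not overlap the reference under H
    diff = (target[ys[inside], xs[inside]].astype(float)
            - reference[wy[inside], wx[inside]].astype(float))
    return float(np.abs(diff).mean())


def select_homographies(target, reference, labels, homographies):
    """For each segment id in `labels`, return the index of the candidate
    homography with the lowest photometric error."""
    best = {}
    for seg_id in np.unique(labels):
        mask = labels == seg_id
        errors = [photometric_error(target, reference, mask, H)
                  for H in homographies]
        best[int(seg_id)] = int(np.argmin(errors))
    return best
```

In this sketch a segment that falls entirely outside the reference under every candidate receives index 0 by default; such contents belong to the non-overlapping region, which the method handles separately with a weighted combination of linearized homographies.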
