A Deep Single Image Rectification Approach for Pan-Tilt-Zoom Cameras
Teng Xiao, Qi Hu, Qingsong Yan, Wei Liu, Zhiwei Ye, Fei Deng
TL;DR
The paper addresses single-frame rectification of wide-angle PTZ camera images, whose nonlinear distortions degrade downstream visual tasks. It introduces FDBW-Net, a framework that combines a forward distortion-based data synthesis pipeline with a backward warping–driven rectification network: a pyramid context encoder extracts multi-scale features, BWEM predicts backward warping flows with attention, and a multi-scale decoder with a layer-by-layer rectification module (LLRM) progressively corrects distortion while a discriminator enforces realism. The key contributions are the forward distortion-based synthesis, which preserves detail in the synthesized images; the BWEM–LLRM architecture for high-fidelity geometric restoration; and experiments on public, synthetic AirSim PTZ, and real PTZ datasets that show state-of-the-art distortion rectification and strong generalization. In practice, the approach enables reliable, detail-preserving rectification for PTZ camera deployments in diverse real-world scenarios.
Abstract
Pan-Tilt-Zoom (PTZ) cameras with wide-angle lenses are widely used in surveillance, but their inherent nonlinear distortions often make image rectification necessary. Current deep learning approaches typically struggle to preserve fine-grained geometric details, which leads to inaccurate rectification. This paper presents the Forward Distortion and Backward Warping Network (FDBW-Net), a novel framework for wide-angle image rectification. The framework first uses a forward distortion model to synthesize barrel-distorted images, reducing pixel redundancy and preventing blur. The network then employs a pyramid context encoder with attention mechanisms to generate backward warping flows that carry geometric details, and a multi-scale decoder restores the distorted features and outputs rectified images. FDBW-Net is validated on diverse datasets: public benchmarks, AirSim-rendered PTZ camera imagery, and real-scene PTZ camera datasets. The results show that FDBW-Net achieves state-of-the-art (SOTA) performance in distortion rectification, improving the adaptability of PTZ cameras for practical visual applications.
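
To make the notion of a backward warping flow concrete, the following is a minimal NumPy sketch, not the FDBW-Net implementation. It assumes a single-channel image and a flow stored as per-pixel (dx, dy) offsets, so that each output pixel reads the input at its own position plus the offset. The function names `bilinear_sample`, `backward_warp`, and `radial_flow` are hypothetical, and the one-parameter radial polynomial in `radial_flow` is a toy stand-in chosen for illustration; the distortion model used in the paper is not specified in this section.

```python
import numpy as np

def bilinear_sample(image, xs, ys):
    """Sample a 2-D image at float coordinates using bilinear interpolation."""
    h, w = image.shape
    # Clamp coordinates so samples outside the image reuse the border pixels.
    xs = np.clip(xs, 0.0, w - 1.0)
    ys = np.clip(ys, 0.0, h - 1.0)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = xs - x0
    wy = ys - y0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def backward_warp(image, flow):
    """Output pixel (x, y) reads the input at (x + flow[y, x, 0], y + flow[y, x, 1])."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    return bilinear_sample(image, xs + flow[..., 0], ys + flow[..., 1])

def radial_flow(h, w, k):
    """Toy flow from an assumed radial model: source radius = r * (1 + k * r**2).

    Radii are normalized by the larger half-extent of the image. With k < 0,
    pixels far from the center sample from positions closer to the center.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    scale = max(cx, cy)
    xn, yn = (xs - cx) / scale, (ys - cy) / scale
    factor = 1.0 + k * (xn ** 2 + yn ** 2)
    src_x = xn * factor * scale + cx
    src_y = yn * factor * scale + cy
    # Store offsets relative to each output pixel's own position.
    return np.stack([src_x - xs, src_y - ys], axis=-1)

# Usage: warp a small synthetic image with the toy radial flow.
image = np.arange(25, dtype=float).reshape(5, 5)
warped = backward_warp(image, radial_flow(5, 5, k=-0.1))
```

Because every output pixel is filled by sampling the input, backward warping leaves no holes in the result; in a learned setting the flow would be predicted by the network rather than computed from a closed-form model as it is here.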
