Photography Perspective Composition: Towards Aesthetic Perspective Recommendation
Lujian Yao, Siming Zheng, Xinbin Yuan, Zhuoxuan Cai, Pu Wu, Jinwei Chen, Bo Li, Peng-Tao Jiang
TL;DR
This work addresses the limitations of cropping-based photography composition by introducing Photography Perspective Composition (PPC), which uses perspective transformation to reconfigure the projected spatial relationships between subjects without moving the subjects themselves. It proposes an automated PPC dataset construction pipeline, a perspective-transformation video generation framework, and a Perspective Quality Assessment (PQA) model that scores quality along three dimensions (visual quality, motion quality, and composition aesthetics) and guides both data filtering and RLHF optimization. Experiments demonstrate the effectiveness of PPC on single-subject, multi-subject, landscape, and UAV-like scenes; RLHF improves stability and alignment with human preferences, and the PQA model enables scalable evaluation. Overall, PPC offers a practical pathway for ordinary users to achieve professional-style composition and opens the way for further research in perspective-aware, data-driven computational photography, including extensions to AR and high-fidelity video generation.
Abstract
Traditional photography composition approaches are dominated by 2D cropping-based methods. However, these methods fall short when scenes contain poorly arranged subjects. Professional photographers often employ perspective adjustment as a form of 3D recomposition, modifying the projected 2D relationships between subjects while maintaining their actual spatial positions to achieve better compositional balance. Inspired by this artistic practice, we propose photography perspective composition (PPC), which extends beyond traditional cropping-based methods. Implementing PPC nevertheless faces two significant challenges: the scarcity of perspective transformation datasets and the lack of defined assessment criteria for perspective quality. To address these challenges, we present three key contributions: (1) An automated framework for building PPC datasets from expert photographs. (2) A video generation approach that demonstrates the transformation process from less favorable to aesthetically enhanced perspectives. (3) A perspective quality assessment (PQA) model constructed based on human performance. Our approach is concise and requires no additional prompt instructions or camera trajectories, helping ordinary users improve their composition skills.
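To make the idea of 3D recomposition concrete, the following minimal sketch uses an idealized pinhole camera to show how moving the viewpoint changes the projected 2D relationship between two subjects that stay fixed in space. It is an illustration only, not the paper's method: the function names, subject coordinates, and camera positions are hypothetical, and the camera is assumed to look straight along the +z axis with no rotation.

```python
def project(point, camera, focal=1.0):
    """Pinhole projection of a 3D point for a camera at `camera` looking along +z.

    Assumption: no camera rotation and no lens distortion.
    """
    x, y, z = (p - c for p, c in zip(point, camera))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z, focal * y / z)


def horizontal_gap(a, b, camera):
    """Signed horizontal image-plane distance from subject a to subject b."""
    return project(b, camera)[0] - project(a, camera)[0]


# Two subjects at fixed 3D positions (hypothetical values): a is near, b is far.
subject_a = (0.0, 0.0, 2.0)
subject_b = (0.5, 0.0, 6.0)

# Two viewpoints: the original one and one shifted one unit to the left.
original_view = (0.0, 0.0, 0.0)
shifted_view = (-1.0, 0.0, 0.0)

gap_before = horizontal_gap(subject_a, subject_b, original_view)
gap_after = horizontal_gap(subject_a, subject_b, shifted_view)

print(f"gap from original view: {gap_before:+.4f}")
print(f"gap from shifted view:  {gap_after:+.4f}")
```

In this toy setup the far subject appears to the right of the near subject from the original viewpoint and to its left from the shifted one, even though neither subject has moved. A 2D crop can only translate or rescale both projections together, so it cannot produce this kind of change in their relative arrangement.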
