Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces
Evangelos Skartados, Mehmet Kerim Yucel, Bruno Manganelli, Anastasios Drosou, Albert Saà-Garriga
TL;DR
This work defines and formalizes the scene exploration problem for NeRF-based scene representations: discovering camera poses that render views satisfying user-defined criteria. It introduces two baselines, Guided-Random Search (GRS) and Pose Interpolation-based Search (PIBS), and proposes Evolution-Guided Pose Search (EGPS), a criteria-agnostic optimization method that uses a genetic-algorithm framework to balance accuracy and exploration. The framework is evaluated on real-world scenes across criteria such as photo-composition, saliency, and image quality, and EGPS generally outperforms the baselines in generating diverse, high-quality novel views. By enabling efficient exploration of NeRF scene spaces, the approach has practical implications for content creation, multimedia production, and VR/AR applications, and points to future work on robust criteria, multi-criteria optimization, and temporal pose trajectories.
Abstract
Neural Radiance Fields (NeRF) have quickly become the primary approach for 3D reconstruction and novel view synthesis in recent years due to their remarkable performance. Despite the huge interest in NeRF methods, a practical use case of NeRFs has largely been ignored: the exploration of the scene space modelled by a NeRF. In this paper, for the first time in the literature, we propose and formally define the scene exploration framework as the efficient discovery of NeRF model inputs (i.e. coordinates and viewing angles) from which one can render novel views that adhere to user-selected criteria. To remedy the lack of approaches addressing scene exploration, we first propose two baseline methods called Guided-Random Search (GRS) and Pose Interpolation-based Search (PIBS). We then cast scene exploration as an optimization problem, and propose the criteria-agnostic Evolution-Guided Pose Search (EGPS) for efficient exploration. We test all three approaches with various criteria (e.g. saliency maximization, image quality maximization, photo-composition quality improvement) and show that our EGPS performs more favourably than the baselines. We finally highlight key points and limitations, and outline directions for future research in scene exploration.
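To make the optimization view concrete, the following is a minimal sketch of a generic genetic-algorithm search over camera poses. It is not the EGPS algorithm itself, whose details are not given in this abstract. Everything in it is an assumption made for illustration: the 6-value pose encoding, the function names `toy_score` and `evolve_poses`, the hyperparameters, and above all the scoring function, which stands in for "render a view with the NeRF, then score it against the chosen criterion".

```python
import random


def toy_score(pose):
    """Hypothetical stand-in for 'render a view from pose, then score it'.

    A real criterion would score a rendered image (saliency, image quality,
    photo-composition); here the score simply peaks at an arbitrary pose.
    """
    target = (0.5, -0.2, 1.0, 0.3, 0.0, -0.4)
    return -sum((p - t) ** 2 for p, t in zip(pose, target))


def evolve_poses(score, bounds, pop_size=32, generations=40, elite=4,
                 mutation_scale=0.1, seed=0):
    """Generic genetic search; returns the final population, best first."""
    rng = random.Random(seed)

    # Initial population: poses sampled uniformly inside the search bounds.
    population = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
                  for _ in range(pop_size)]

    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:pop_size // 2]   # truncation selection
        children = list(ranked[:elite])    # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover plus Gaussian mutation, clipped to the bounds.
            child = tuple(
                max(lo, min(hi, (x if rng.random() < 0.5 else y)
                            + rng.gauss(0.0, mutation_scale * (hi - lo))))
                for x, y, (lo, hi) in zip(a, b, bounds)
            )
            children.append(child)
        population = children

    return sorted(population, key=score, reverse=True)


if __name__ == "__main__":
    bounds = [(-1.0, 1.0)] * 6  # assumed pose encoding: 3 position + 3 angle values
    best = evolve_poses(toy_score, bounds)[0]
    print("best pose:", best, "score:", toy_score(best))
```

In this sketch the criterion enters only through the `score` callable, which is one way to read "criteria-agnostic": swapping saliency for image quality would change the scoring function and nothing else. Mutation keeps the population exploring, while elitism guarantees the best pose found so far is never lost.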
