UNIKD: UNcertainty-filtered Incremental Knowledge Distillation for Neural Implicit Representation
Mengqi Guo, Chen Li, Hanlin Chen, Gim Hee Lee
TL;DR
This work tackles incremental learning for neural implicit representations (NIRs), enabling 3D reconstruction and novel view synthesis from streaming data without storing past data. It introduces a self-contained student-teacher framework, augmented with a random inquirer and an uncertainty-based filter, that distills knowledge from past models into the current learner. On NeRF- and MonoSDF-based tasks, the approach improves over the baselines and achieves results competitive with batch-trained upper bounds while using minimal memory. By supporting continual learning across large-scale scenes and different NIRs, the method offers practical gains for real-world streaming perception systems.
Abstract
Recent neural implicit representations (NIRs) have achieved great success in the tasks of 3D reconstruction and novel view synthesis. However, they require the images of a scene from different camera views to be available for one-time training. This is expensive, especially in scenarios with large-scale scenes and limited data storage. In view of this, we explore the task of incremental learning for NIRs in this work. We design a student-teacher framework to mitigate the catastrophic forgetting problem. Specifically, at the end of each time step the student becomes the teacher, and the teacher then guides the training of the student in the next step. As a result, the student network is able to learn new information from the streaming data and retain old knowledge from the teacher network simultaneously. Although intuitive, naively applying the student-teacher pipeline does not work well in our task: not all information from the teacher network is helpful, since the teacher is trained only on the old data. To alleviate this problem, we further introduce a random inquirer and an uncertainty-based filter to select useful information. Our proposed method is general and can thus be adapted to different implicit representations such as neural radiance field (NeRF) and neural surface field. Extensive experimental results for both 3D reconstruction and novel view synthesis demonstrate the effectiveness of our approach compared to different baselines.
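
The loop described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: it replaces the neural implicit representation with a small 1D radial-basis regressor, and every name in it (`ToyField`, `random_inquirer`, `uncertainty_filter`, `incremental_step`) is hypothetical. The uncertainty used here, the distance to the nearest input the teacher was fitted on, is an assumed stand-in for whatever uncertainty estimate the actual method uses. The sketch only shows the control flow: the teacher answers random queries, uncertain answers are filtered out, the student fits the new data together with the remaining teacher answers, and the student then becomes the next teacher.

```python
import copy
import numpy as np


class ToyField:
    """Hypothetical stand-in for an implicit network on a 1D domain."""

    def __init__(self, centers, width=0.5):
        self.centers = np.asarray(centers, dtype=float)
        self.width = width
        self.weights = np.zeros(len(self.centers))
        self.seen = np.empty(0)  # inputs this model was fitted on

    def _features(self, x):
        d = np.asarray(x, dtype=float)[:, None] - self.centers[None, :]
        return np.exp(-0.5 * (d / self.width) ** 2)

    def predict(self, x):
        return self._features(x) @ self.weights

    def uncertainty(self, x):
        # Assumption: distance to the nearest fitted input is used as an
        # uncertainty proxy; the real method may estimate it differently.
        if self.seen.size == 0:
            return np.full(len(x), np.inf)
        x = np.asarray(x, dtype=float)
        return np.abs(x[:, None] - self.seen[None, :]).min(axis=1)

    def fit(self, x, y, ridge=1e-3):
        # Closed-form ridge regression on the radial-basis features.
        phi = self._features(x)
        a = phi.T @ phi + ridge * np.eye(phi.shape[1])
        self.weights = np.linalg.solve(a, phi.T @ np.asarray(y, dtype=float))
        self.seen = np.asarray(x, dtype=float)


def random_inquirer(rng, n, low, high):
    """Draw random query inputs over the whole domain."""
    return rng.uniform(low, high, size=n)


def uncertainty_filter(unc, threshold):
    """Keep only the queries the teacher is confident about."""
    return unc < threshold


def incremental_step(student, teacher, x_new, y_new, rng,
                     n_queries=200, threshold=0.2, domain=(0.0, 6.0)):
    x_train = np.asarray(x_new, dtype=float)
    y_train = np.asarray(y_new, dtype=float)
    if teacher is not None:
        queries = random_inquirer(rng, n_queries, *domain)
        keep = uncertainty_filter(teacher.uncertainty(queries), threshold)
        # Distillation targets: teacher predictions on the kept queries.
        x_train = np.concatenate([x_train, queries[keep]])
        y_train = np.concatenate([y_train, teacher.predict(queries[keep])])
    student.fit(x_train, y_train)
    return copy.deepcopy(student)  # the student becomes the next teacher


rng = np.random.default_rng(0)
student = ToyField(centers=np.linspace(0.0, 6.0, 25))
teacher = None
# Stream the target function sin(x) in three chunks; old chunks are discarded.
for lo, hi in [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]:
    x = np.linspace(lo, hi, 40)
    teacher = incremental_step(student, teacher, x, np.sin(x), rng)
```

In this toy setting, only the current chunk and one frozen copy of the model are held at any time, which mirrors the memory argument made above. Queries that land far from anything the teacher has fitted are dropped, so the student is not distilled on regions where the teacher's output is unreliable.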
