Exploring selective image matching methods for zero-shot and few-sample unsupervised domain adaptation of urban canopy prediction
John Francis, Stephen Law
TL;DR
The paper tackles domain shift in remote-sensing canopy estimation by adapting a multi-task UNet trained on Chicago imagery to London imagery, without training a dedicated domain-adaptation module. It evaluates data-based unsupervised domain adaptation methods applied to selectively matched image pairs: histogram matching in RGB and LAB colour space, Fourier domain adaptation (FDA), and pixel distribution adaptation (PDA), compared against CycleGAN and an untransformed baseline. Each method transforms a target image toward the source domain, $I_{tgt \rightarrow src}=T_{tgt \rightarrow src}(I_{tgt})$, and the multi-task network is optimized with the weighted loss $L_{total}=\gamma L_{canopy}+\lambda L_{height}$. In zero-shot and small-sample fine-tuning experiments on 812 London and 812 Chicago 1 m RGB images, PDA performs best for canopy-cover prediction (IoU up to 0.5131 zero-shot; 0.7014 with fine-tuning) and FDA performs best for canopy-height prediction (MAE down to 0.5864 zero-shot; 0.5547 with fine-tuning). Overall, the findings show that simple, low-resource image matching can meaningfully improve predictions in a new domain, sometimes outperforming CycleGAN, which is valuable when retraining is costly.
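To make the transform $T_{tgt \rightarrow src}$ concrete, the following is a minimal NumPy sketch of per-channel RGB histogram matching, one of the methods listed above. It is an illustration under stated assumptions, not the authors' implementation: the function names `match_histogram_channel` and `match_histograms_rgb` are hypothetical, and the image pair is assumed to have already been chosen by the selective matching step.

```python
import numpy as np

def match_histogram_channel(tgt, src):
    """Remap the values of `tgt` so its empirical CDF follows that of `src`.

    Both inputs are 2-D arrays holding a single image channel.
    """
    # Unique intensity values, plus the index of each pixel's value.
    tgt_vals, tgt_idx, tgt_counts = np.unique(
        tgt.ravel(), return_inverse=True, return_counts=True
    )
    src_vals, src_counts = np.unique(src.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both channels.
    tgt_cdf = np.cumsum(tgt_counts) / tgt.size
    src_cdf = np.cumsum(src_counts) / src.size

    # For each target quantile, look up the source value at that quantile.
    mapped_vals = np.interp(tgt_cdf, src_cdf, src_vals)
    return mapped_vals[tgt_idx].reshape(tgt.shape)

def match_histograms_rgb(tgt_img, src_img):
    """Apply histogram matching independently to each channel of an H x W x C image.

    Hypothetical helper: `tgt_img` plays the role of I_tgt and the return
    value plays the role of I_{tgt -> src}.
    """
    channels = [
        match_histogram_channel(tgt_img[..., c], src_img[..., c])
        for c in range(tgt_img.shape[-1])
    ]
    return np.stack(channels, axis=-1)
```

Calling `match_histograms_rgb(london_tile, chicago_tile)` on two arrays of shape `(H, W, 3)` returns a float array with the London tile's spatial content and an intensity distribution close to the Chicago tile's. A LAB variant would follow the same pattern after a colour-space conversion.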
Abstract
We explore simple methods for adapting a trained multi-task UNet, which predicts canopy cover and height from remotely sensed data, to a new geographic setting without the need to train a domain-adaptive classifier or to perform extensive fine-tuning. Extending previous research, we followed a selective alignment process to identify similar images in the two geographical domains and then tested an array of data-based unsupervised domain adaptation approaches in a zero-shot setting as well as with a small amount of fine-tuning. We find that the selectively aligned, data-based image matching methods produce promising results in a zero-shot setting, and even more so with a small amount of fine-tuning. These methods outperform both an untransformed baseline and a popular data-based image-to-image translation model. The best performing methods were pixel distribution adaptation and Fourier domain adaptation on the canopy cover and height tasks, respectively.
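As an illustration of Fourier domain adaptation, the sketch below follows the idea as it is generally described: replace the low-frequency amplitude spectrum of the image being adapted with that of a reference image, while keeping the original phase. This is a minimal NumPy sketch, not the authors' code. The function name `fda_transfer`, the parameter `beta` (the fraction of the spectrum swapped), and its default value are assumptions, and both images are assumed to share the same shape and a 0 to 255 value range.

```python
import numpy as np

def fda_transfer(content, style, beta=0.01):
    """Swap the low-frequency amplitude of `content` with that of `style`.

    content, style: H x W x C arrays of identical shape (assumed 0-255 range).
    beta: assumed hyperparameter controlling the size of the swapped window.
    """
    content = content.astype(np.float64)
    style = style.astype(np.float64)

    # 2-D FFT over the spatial axes, computed per channel.
    fft_c = np.fft.fft2(content, axes=(0, 1))
    fft_s = np.fft.fft2(style, axes=(0, 1))
    amp_c, phase_c = np.abs(fft_c), np.angle(fft_c)
    amp_s = np.abs(fft_s)

    # Shift so the zero-frequency component sits at the centre.
    amp_c = np.fft.fftshift(amp_c, axes=(0, 1))
    amp_s = np.fft.fftshift(amp_s, axes=(0, 1))

    # Square window around the centre, symmetric about the DC term.
    h, w = content.shape[:2]
    b = max(1, int(np.floor(min(h, w) * beta)))
    ch, cw = h // 2, w // 2
    rows = slice(ch - b, ch + b + 1)
    cols = slice(cw - b, cw + b + 1)
    amp_c[rows, cols] = amp_s[rows, cols]

    # Undo the shift, recombine with the original phase, and invert.
    amp_c = np.fft.ifftshift(amp_c, axes=(0, 1))
    mixed = amp_c * np.exp(1j * phase_c)
    out = np.fft.ifft2(mixed, axes=(0, 1)).real
    return np.clip(out, 0, 255)
```

In the setting described here, `content` would be a London (target) image and `style` its selectively matched Chicago (source) image, so the output keeps the London scene structure while taking on the Chicago image's low-frequency appearance. Larger `beta` values transfer more of the reference spectrum, with a greater chance of visible artefacts.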
