Evaluation of Deformable Image Registration under Alignment-Regularity Trade-off
Vasiliki Sideri-Lampretsa, Daniel Rueckert, Huaqi Qiu
TL;DR
Deformable image registration (DIR) must balance accurate alignment against deformation regularity, a trade-off that evaluation often ignores. The authors propose alignment regularity characteristic (ARC) curves, which plot an alignment metric against a regularity metric across a continuum of regularization weights, enabling continuous, holistic comparisons. They further adopt HyperMorph, a HyperNetwork-based amortization scheme, to interpolate across the full regularization range and accelerate ARC construction across architectures and transformation models. Experiments on Learn2Reg datasets (OASIS brain MRI and NLST lung CT) show that maximal alignment can be reached with markedly different deformations and that ARC curves expose nuances that single-point metrics miss. The work provides practical guidelines for practitioners and researchers to evaluate and select DIR methods using the ARC framework, and discusses future directions such as an area-under-the-curve summary (AUC-ARC) and evaluation on broader datasets.
Abstract
Evaluating deformable image registration (DIR) is challenging due to the inherent trade-off between achieving high alignment accuracy and maintaining deformation regularity. However, most existing DIR works either address this trade-off inadequately or overlook it altogether. In this paper, we highlight the issues with existing practices and propose an evaluation scheme that captures the trade-off continuously to holistically evaluate DIR methods. We first introduce the alignment regularity characteristic (ARC) curves, which describe the performance of a given registration method as a spectrum under various degrees of regularity. Using experiments on representative deep learning DIR methods with various network architectures and transformation models, we demonstrate that the ARC curves reveal unique insights that are not evident from existing evaluation practices. We further adopt a HyperNetwork-based approach that learns to continuously interpolate across the full regularization range, accelerating the construction and improving the sample density of ARC curves. Finally, we provide general guidelines for nuanced model evaluation and selection using our evaluation scheme, for both practitioners and registration researchers.
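To make the idea of an ARC curve concrete, the following minimal Python sketch assembles one from per-weight evaluation results and summarizes it with a trapezoidal area. It is an illustration under stated assumptions, not the authors' implementation: the function names `arc_curve` and `area_under_curve` are hypothetical, the numbers are invented, alignment is assumed to be a Dice-like score (higher is better), and regularity is assumed to be a folding fraction (lower is more regular). The area summary is only one possible reading of the AUC-ARC idea mentioned as a future direction.

```python
from typing import List, Tuple

# One evaluation result per regularization weight:
# (weight, alignment score, regularity score).
# ASSUMPTION: alignment is Dice-like, regularity is a folding fraction.
Result = Tuple[float, float, float]
Point = Tuple[float, float]


def arc_curve(results: List[Result]) -> List[Point]:
    """Return (regularity, alignment) points sorted by regularity.

    The regularization weight only indexes the sweep; the curve itself
    lives in the (regularity, alignment) plane.
    """
    points = [(regularity, alignment) for _, alignment, regularity in results]
    return sorted(points)


def area_under_curve(points: List[Point]) -> float:
    """Trapezoidal area under alignment as a function of regularity.

    ASSUMPTION: a hypothetical scalar summary, not a definition from the paper.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


# Invented values for three regularization weights (illustration only).
results = [
    (0.01, 0.80, 0.20),  # weak regularization: best alignment, most folding
    (0.10, 0.78, 0.05),
    (1.00, 0.70, 0.00),  # strong regularization: no folding, lower alignment
]

curve = arc_curve(results)
auc = area_under_curve(curve)
print(curve)
print(auc)
```

A denser sweep of regularization weights gives a smoother curve, which is where a HyperNetwork-based model helps: one trained network can be queried at many weights instead of training a separate model per weight.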
