Decoder-Only Image Registration
Xi Jia, Wenqi Lu, Xinxing Cheng, Jinming Duan
TL;DR
This work argues that encoder learning is often unnecessary for unsupervised 3D medical image registration and introduces LessNet, a decoder-only network that predicts dense displacement fields from image pairs using handcrafted multi-scale pooling features. By replacing the learnable encoder with these features and keeping only a four-block decoder, LessNet achieves competitive Dice scores on two brain MRI datasets (OASIS-1 and IXI) while substantially reducing parameters, memory, and compute. It also supports diffeomorphic variants, obtained by exponentiating a velocity field with scaling and squaring. The results show that a compact, decoder-centric design can match state-of-the-art performance at much lower resource cost, though the benefit of the diffeomorphic variants may be dataset-dependent. The paper highlights the potential of substituting learned encoders with handcrafted or pre-trained features to enable efficient, large-scale registration in practice.
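To make the scaling-and-squaring step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 1D stationary velocity field on a unit-spaced grid and linear interpolation through `np.interp`, whereas the paper works on 3D volumes. The function name `exp_velocity_1d` and the default of 7 squaring steps are illustrative assumptions.

```python
import numpy as np

def exp_velocity_1d(v, steps=7):
    """Approximate the displacement of exp(v) for a stationary 1D velocity field.

    Assumptions (not from the paper): 1D grid with unit spacing, linear
    interpolation, and displacements held constant outside the grid.
    """
    x = np.arange(v.shape[0], dtype=np.float64)
    # Scaling: start from a small displacement u = v / 2^steps.
    u = np.asarray(v, dtype=np.float64) / (2 ** steps)
    for _ in range(steps):
        # Squaring: (id + u) o (id + u)(x) = x + u(x) + u(x + u(x)).
        u = u + np.interp(x + u, x, u)
    return u

# Usage: a smooth velocity field yields an order-preserving (invertible) map.
grid = np.arange(64, dtype=np.float64)
velocity = 2.0 * np.sin(2.0 * np.pi * grid / 64.0)
displacement = exp_velocity_1d(velocity)
deformed = grid + displacement
```

Each squaring step composes the current map with itself, so `steps` iterations integrate the velocity field over unit time; a 3D version would replace `np.interp` with trilinear resampling of the displacement field.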
Abstract
In unsupervised medical image registration, the predominant approaches use an encoder-decoder network architecture, which allows precise prediction of dense, full-resolution displacement fields from given paired images. Despite the widespread use of this architecture in the literature, we question the necessity of making both the encoder and decoder learnable. To this end, we propose a novel network architecture, termed LessNet, which contains only a learnable decoder and omits the learnable encoder entirely. LessNet substitutes the learnable encoder with simple, handcrafted features, eliminating the need to learn (optimize) network parameters in the encoder altogether. This leads to a compact, efficient, decoder-only architecture for 3D medical image registration. Evaluating on two publicly available brain MRI datasets, we demonstrate that our decoder-only LessNet can effectively and efficiently learn both dense displacement and diffeomorphic deformation fields in 3D. Furthermore, LessNet achieves registration performance comparable to state-of-the-art methods such as VoxelMorph and TransMorph while requiring significantly fewer computational resources. Our code and pre-trained models are available at https://github.com/xi-jia/LessNet.
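As an illustration of what parameter-free, handcrafted features could look like, the sketch below builds a multi-scale pooling pyramid from an image pair. It is a sketch under stated assumptions, not the released code: the use of average pooling, the pooling factors (2, 4, 8), and the names `avg_pool3d` and `pooling_pyramid` are hypothetical choices made for this example.

```python
import numpy as np

def avg_pool3d(vol, k):
    """Non-overlapping k x k x k average pooling of a 3D volume."""
    d, h, w = (s // k for s in vol.shape)
    vol = vol[:d * k, :h * k, :w * k]  # crop so every axis is divisible by k
    return vol.reshape(d, k, h, k, w, k).mean(axis=(1, 3, 5))

def pooling_pyramid(moving, fixed, factors=(2, 4, 8)):
    """Return one 2-channel pooled feature map per scale (no learnable weights).

    Assumption: average pooling with these factors stands in for the
    handcrafted features; the paper's exact choice may differ.
    """
    feats = []
    for k in factors:
        pooled_pair = np.stack([avg_pool3d(moving, k), avg_pool3d(fixed, k)])
        feats.append(pooled_pair)  # shape: (2, D // k, H // k, W // k)
    return feats

# Usage: two toy 16^3 volumes give feature maps at 8^3, 4^3, and 2^3.
moving = np.arange(16 ** 3, dtype=np.float64).reshape(16, 16, 16)
fixed = np.ones((16, 16, 16))
features = pooling_pyramid(moving, fixed)
```

In a decoder-only design, feature maps like these would be fed to the decoder blocks at the matching resolutions in place of learned encoder activations, so the only trainable parameters are in the decoder.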
