Table of Contents

Overview

Paper: Meng et al., Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration (CVPR 2024 open access; also on arXiv).


(Figures and tables in this post are from the original paper)

Novelties of the Paper

  • Multi-layer perceptrons (MLPs) without self-attention are efficient in terms of computational cost and memory usage, but they lack an inductive bias.
  • To address this, the authors proposed a registration method called the correlation-aware MLP-based (CorrMLP) network.
  • Its two key components are a CNN-based hierarchical feature-extraction encoder and a correlation-aware coarse-to-fine registration decoder.
  • Correlation-aware multi-window MLP (CMW-MLP) blocks in the decoder process the multi-scale features extracted by the encoder, which enables CorrMLP to capture fine-grained long-range dependencies at full image resolution.
  • This lets CorrMLP handle both locally non-linear and large-range deformations in image registration.
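To make the "correlation-aware" idea concrete, here is a minimal 2D sketch of a local correlation volume between fixed and moving feature maps: each fixed-image feature vector is correlated (dot product) with moving-image features in a small neighborhood. This is my own simplified illustration; the paper's 3D implementation and how the CMW-MLP blocks consume these features may differ.

```python
import numpy as np

def local_correlation(feat_fixed, feat_moving, radius=1):
    """Local correlation volume between two (C, H, W) feature maps.

    For each spatial position, the fixed feature vector is dotted with
    moving features at every offset in a (2*radius+1)^2 neighborhood.
    Returns an ((2*radius+1)^2, H, W) array of correlation features,
    which a coarse-to-fine decoder can consume to estimate displacements.
    """
    C, H, W = feat_fixed.shape
    pad = np.pad(feat_moving, ((0, 0), (radius, radius), (radius, radius)))
    out = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = pad[:, dy:dy + H, dx:dx + W]
            out.append((feat_fixed * shifted).sum(axis=0))  # per-pixel dot product
    return np.stack(out)

corr = local_correlation(np.random.rand(8, 16, 16), np.random.rand(8, 16, 16))
print(corr.shape)  # (9, 16, 16)
```

In a coarse-to-fine scheme this correlation would be computed at each decoder level, with the moving features warped by the displacement estimated at the coarser level.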

Performance Evaluation Methods

  • They compared CorrMLP with other methods including SyN, NiftyReg, VoxelMorph, Swin-VoxelMorph, TransMorph, TransMatch, LapIRN, ULAE-net, Dual-PRNet++, SDHNet, NICE-Net and NICE-Trans.
  • Datasets: ADNI, ABIDE, ADHD, IXI, Mindboggle and Buckner for 3D inter-patient brain image registration; ACDC for 4D intra-patient cardiac image registration.


Discussions

  • CorrMLP outperformed the other methods (transformer-, CNN- and MLP-based) on the Dice similarity coefficient (DSC) and the number of negative Jacobian determinants (NJD).
  • Both image-level and step-level correlations are necessary for CorrMLP's best performance.
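For reference, the two metrics above can be sketched in a few lines: DSC measures segmentation overlap after warping, and NJD counts positions where the deformation folds (non-positive Jacobian determinant). This is a 2D toy version of my own; brain registration uses 3D displacement fields.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def negative_jacobian_fraction(disp):
    """Fraction of positions where phi(x) = x + u(x) folds, i.e. det(J) <= 0.

    disp: (2, H, W) displacement field, disp[i] = displacement along axis i.
    The Jacobian is J = I + grad(u), estimated with finite differences.
    """
    u0_0, u0_1 = np.gradient(disp[0])  # d u0 / d axis0, d u0 / d axis1
    u1_0, u1_1 = np.gradient(disp[1])
    det = (1 + u0_0) * (1 + u1_1) - u0_1 * u1_0
    return float((det <= 0).mean())

mask = np.zeros((8, 8)); mask[2:6, 2:6] = 1
print(dice(mask, mask))                                 # 1.0
print(negative_jacobian_fraction(np.zeros((2, 8, 8))))  # 0.0 (identity doesn't fold)
```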

What I learned

  • My guess is that the inductive bias comes from the combination of the CNN (in the encoder) and global average pooling (in the decoder).
  • Plain MLPs don't incorporate an inductive bias.
  • Transformers can capture wide-range features in an image, but they may struggle with fine-grained long-range dependencies at the input image size, since downsampled features are commonly used to reduce computation and memory consumption.
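A rough back-of-envelope calculation (my own numbers, not from the paper) shows why transformers downsample: the self-attention matrix grows with the square of the token count, while an MLP that mixes fixed local windows grows only linearly with resolution.

```python
# Self-attention stores a score for every token pair, so its cost is
# quadratic in the number of tokens; MLP window mixing is linear.
def attention_entries(n_tokens: int) -> int:
    return n_tokens ** 2  # pairwise attention scores

full_res = 96 ** 3       # tokens for an illustrative 96^3 volume
coarse = (96 // 4) ** 3  # tokens after 4x downsampling per axis
print(attention_entries(full_res) // attention_entries(coarse))  # 4096
```

So full-resolution attention needs 4096x more pairwise scores than attention on a 4x-downsampled volume, which is why fine detail is usually traded away.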