Abstract
3D image deformation is generally a time-consuming step in medical image registration, and it limits overall registration speed. The Thin-Plate Spline (TPS) is a commonly used interpolation technique for deforming images that minimizes bending energy. This paper proposes a parallel implementation of 3D image deformation using Thin-Plate Splines and tri-linear interpolation on a heterogeneous CPU + GPU platform. We describe a computation model that accounts for thread partitioning, memory allocation, and coalesced memory accesses, and use it to analyze the performance of the parallel algorithm on the GPU. Using CUDA C on an NVIDIA Tesla C2050 with 448 CUDA cores, we achieve an approximately 70-fold speedup over the CPU in 3D medical image deformation. Experiments show that this GPU computation model is a practical way to accelerate image deformation.