
Respiratory Motion Correction on PET Images Based on 3D Convolutional Neural Network

  • Hou, Yibo (School of Information Engineering and Automation, Kunming University of Science and Technology) ;
  • He, Jianfeng (School of Information Engineering and Automation, Kunming University of Science and Technology) ;
  • She, Bo (PET/CT Center, First People's Hospital of Yunnan)
  • Received : 2021.10.13
  • Accepted : 2022.06.30
  • Published : 2022.07.31

Abstract

Motion blur induced by respiratory motion reduces the quality of PET (positron emission tomography) imaging. Although existing methods perform well for respiratory motion correction in medical practice, there are still many aspects that can be improved. In this paper, an improved 3D unsupervised framework, Res-Voxel, based on the U-Net network is proposed for motion correction. Res-Voxel uses multiple residual structures to improve the ability to predict the deformation field, and uses smaller convolution kernels to reduce the number of model parameters and the amount of computation required. The proposed method is tested on simulated PET imaging data and on clinical data. Experimental results demonstrate that the proposed method achieved Dice indices of 93.81%, 81.75% and 75.10% on the simulated geometric phantom data, the voxel phantom data and the clinical data, respectively. This demonstrates that the proposed method can improve the registration and correction performance of PET images.

Keywords

1. Introduction

Lung cancer is one of the malignant tumors with the fastest increasing morbidity, and it seriously threatens human life and health [1]. Positron emission tomography (PET) is an important tool for the early diagnosis of malignant tumors [2]. Patients' respiratory motion causes artifacts in PET imaging, which degrade image quality. Therefore, correcting these artifacts is important for clinical diagnosis.

To solve this problem, He et al. [3] observed that the number of photons collected by the PET detector ring changes with the position of the respiratory motion, and proposed a system-sensitivity gating method for motion correction. Büther et al. [4] used a list-mode-data-based gated centroid method and segmented tumors spatially on the sinogram to correct PET images of liver and lung tumors. Although the above approaches can effectively correct respiratory motion artifacts, some of them require significant computation time or do not make efficient use of the information contained in the data.

With the development of deep learning, convolutional neural networks (CNN) [5] have demonstrated powerful feature extraction capabilities in many medical image processing tasks; however, there are still many aspects that can be improved. For instance, He et al. [6] and Ibtehaz et al. [7] proposed the residual module and the multiple residual module, which extract features better in deep network models and effectively alleviate the vanishing gradient problem. The spatial transformer network (STN) for image registration was proposed by Jaderberg et al. [8]. The STN enables spatial operations on data within the network and performs the spatial transformation according to the image to be registered. Wu et al. [9] proposed a stacked autoencoder (SAE) to identify the inherent features in the fixed image to register the image. Sokooti et al. [10] proposed a multi-scale registration framework that integrates more global information. Subsequently, Vos et al. [11] and Li et al. [12] successively proposed self-supervised fully convolutional network (FCN) registration methods, but these methods handle only small-range transformations without focusing on the entire image. Balakrishnan et al. [13] proposed a fast unsupervised medical image registration method, VoxelMorph, which learns a parameterized registration function using a convolutional neural network; however, this method has some shortcomings in the registration of complex medical images. A commonly used registration method is AntsSyNQuick [14], which uses geometric or variational methods for registration, but it is very computationally intensive. Kuang et al. [15] proposed FAIM, which extends VoxelMorph by improving accuracy with a penalty loss on negative Jacobian determinants, but only from the aspect of image folding. Hu et al. [16] designed a two-stream architecture, Dual-PRNet, which extends VoxelMorph by computing multi-scale registration fields from convolutional feature pyramids, but the correlation between images is ignored.

Although the existing methods mentioned above bring obvious improvements to motion correction, they still have some limitations. In this paper, we propose a framework, Res-Voxel, for respiratory motion correction of PET imaging that incorporates multiple residual blocks into the VoxelMorph network, which may achieve better motion correction performance.

2. Method

2.1 Structure of the Correction System

In general, most registration algorithms warp and transform the voxels of the moving image so that they correspond to the voxels of the fixed image [17]:

\(\hat{\phi}=\arg \min _{\phi} \mathcal{L}(F, M, \phi)\)       (1)

\(\mathcal{L}(F, M, \phi)=\mathcal{L}_{sim}(F, M(\phi))+\lambda \mathcal{L}_{smooth}(\phi)\)       (2)

where 𝐹 is the fixed image, 𝑀 is the moving image, 𝜙 is the displacement vector field (DVF), i.e., the offset of each voxel in 𝐹 to the corresponding voxel in 𝑀, 𝑀(𝜙) is the moving image 𝑀 after the warping transformation, 𝓛sim(·,·) measures the similarity between 𝐹 and 𝑀(𝜙), 𝓛smooth(·) is the regularization acting on 𝜙, and 𝜆 is the regularization parameter.

The structure of the network for image registration is shown in Fig. 1. This paper uses a convolutional neural network to model the function 𝑔𝜃(𝐹, 𝑀) = 𝜙. In the experiment, 𝑀 and 𝐹, both grayscale images, are used as the input of Res-Voxel. The registration field 𝜙 is learned through the parameters 𝜃. The output 𝜙 of Res-Voxel is used as the input of the STN, whose spatial transformer function transforms 𝑀(𝑝) into 𝑀(𝜙(𝑝)), so that 𝑀(𝜙(𝑝)) has a structural position similar to 𝐹(𝑝). Finally, the difference between 𝑀(𝜙(𝑝)) and 𝐹(𝑝) is calculated by the loss function, and the parameters 𝜃 are updated accordingly.


Fig. 1. Overall registration framework. The region inside the red dashed line is the Res-Voxel model, which predicts the deformation field. The region inside the blue dashed line is the spatial transformer network, which produces the corrected image.

2.2 Res-Voxel

The deep-learning-based registration method learns the deformation field through a network based on U-Net [18][19]; this is the medical image registration framework VoxelMorph, which has achieved good results on three-dimensional brain MR image registration [13]. In PET image artifact correction, we focus on the correspondence between the artifact-affected (moving) image of the lung and the fixed image. In most cases, the artifacts generated by respiratory motion are irregular and complex, and the U-Net predicts their deformation field poorly. Szegedy et al. [20] proposed the Inception architecture. Inception blocks use convolutional layers of varying kernel sizes in parallel to inspect the points of interest in images at different scales. These perceptions, obtained at different scales, are combined and passed deeper into the network, which improves the feature extraction ability of the network. However, the architecture still has a limitation: the 3×3 and 5×5 convolution operations increase the computational burden of the network, as shown in Fig. 2(a).

We therefore use two 3×3 convolution kernels, whose stacked receptive field matches that of a 5×5 kernel, in place of a 5×5 convolution kernel to decrease the computational burden, as shown in Fig. 2(b). Specifically, for a 5×5 input image as shown in Fig. 2(a), a feature map obtained by a 5×5 convolution operation requires (5×5)×channels parameters, whereas a feature map obtained by a 3×3 convolution operation requires (3×3)×channels parameters. As a result, two 3×3 convolution kernels, with 2×(3×3)×channels parameters in total, clearly require less computation than one 5×5 convolution kernel, as shown in Fig. 2(b); the number of parameters is reduced by about 30%. Fig. 3 shows the amount of computation of the two kinds of convolution kernels on images of different sizes. The complexity of both is O(n²). For an image of side length n, one 5×5 convolution kernel performs (n-4)² convolutions, giving a computation amount of 5×5×(n-4)²; two stacked 3×3 convolution kernels perform (n-2)²+(n-4)² convolutions in total, giving a computation amount of (3×3)×[(n-2)²+(n-4)²]. When the image side length is greater than 10, the computational cost of two 3×3 convolutions is smaller than that of one 5×5 convolution.


Fig. 2. Replacing one 5×5 convolution with two 3×3 convolutions decreases the computational burden of the network


Fig. 3. Computation of the different convolution kernels versus image size
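These counts can be reproduced with a short back-of-the-envelope script (a pure-Python sketch; the function names are only illustrative):

def ops_5x5(n):
    # one 5x5 convolution with 'valid' padding on an n x n image
    return 5 * 5 * (n - 4) ** 2

def ops_two_3x3(n):
    # two stacked 3x3 convolutions with 'valid' padding on an n x n image
    return 3 * 3 * ((n - 2) ** 2 + (n - 4) ** 2)

params_5x5, params_two_3x3 = 5 * 5, 2 * 3 * 3   # parameters per channel
print(1 - params_two_3x3 / params_5x5)          # 0.28 -> roughly 30% fewer parameters
for n in (10, 11, 32, 128):
    print(n, ops_5x5(n), ops_two_3x3(n))        # the two 3x3 convolutions cost less for n > 10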

Furthermore, we provide experimental results to support the claim that the proposed model requires fewer parameters. As shown in Table 1, we use the geometric phantom dataset to compare the number of parameters and the training time of our method with those of the baseline network over 1000 iterations. The proposed Res-Voxel reduces the number of parameters by about 7%, so we choose a structure similar to Fig. 2(b) to construct the registration network.

Table 1. Comparison of parameter results under 1000 iterations


We add a multiple residual block at each stage so that the network can extract spatial features at different scales. The structure is shown in Fig. 4. The fixed image (𝐹) and the moving image (𝑀) are concatenated along the channel dimension into a 2-channel 3D image as the input of the network. The input size is 128×128×32×2, and the number of channels in each block is marked in Fig. 4. 3D convolution is used in both the encoding and decoding stages, the size of the convolution kernel is 3×3×3, and each convolution layer is followed by a Leaky ReLU activation function. A convolution stride of 2 is used to halve the spatial dimensions until the smallest layer is reached. In the decoding stage, convolution, 3D upsampling and skip connections are used alternately, and the registration field 𝜙 is finally obtained as the output. In the experiment, the output size is 128×128×32×3.


Fig. 4. The structure of the Res-Voxel network, shown as the region inside the red dashed line in Fig. 1

2.2.1 Multiple Residual Block

Based on the structure and advantages described in Fig. 2, we built this module. By adding a residual connection (path a) and introducing 1×1×1 convolutional layers (path b), we obtain what we call a multiple residual block, as shown in Fig. 5:


Fig. 5. Multiple residual block, shown as the region inside the red dashed line in Fig. 4.

In the block, the input passes through two consecutive 3×3×3 convolutions, the second result is spliced with the previous one, and this is then spliced with the output of the 1×1×1 convolution; finally, dimensionality reduction is performed through a 3×3×3 convolution with a stride of 2. With this structure, image features are extracted through the 3×3×3 convolution operations, and some additional spatial information is obtained by the 1×1×1 convolution operation. The entire block effectively avoids an excessive number of parameters while extracting image features well.
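The following is a minimal tf.keras sketch of the structure described in Section 2.2 and in this subsection. The helper names (conv_lrelu, multi_res_block, build_res_voxel) and the channel widths (16, 32, 32, 32) are assumptions for illustration only; the actual channel numbers are those marked in Fig. 4.

import tensorflow as tf
from tensorflow.keras import layers

def conv_lrelu(x, filters, kernel=3, strides=1):
    # 3D convolution followed by a Leaky ReLU, the basic unit of the network.
    x = layers.Conv3D(filters, kernel, strides=strides, padding="same")(x)
    return layers.LeakyReLU(0.2)(x)

def multi_res_block(x, filters):
    # Path a: two consecutive 3x3x3 convolutions, spliced with the intermediate result.
    c1 = conv_lrelu(x, filters)
    c2 = conv_lrelu(c1, filters)
    a = layers.Concatenate()([c1, c2])
    # Path b: a 1x1x1 convolution that contributes additional spatial information.
    b = conv_lrelu(x, filters, kernel=1)
    merged = layers.Concatenate()([a, b])
    # Dimensionality reduction through a 3x3x3 convolution with stride 2.
    return conv_lrelu(merged, filters, strides=2)

def build_res_voxel(in_shape=(128, 128, 32, 2), enc_filters=(16, 32, 32, 32)):
    # Encoder-decoder mapping the stacked (F, M) pair to a 3-channel DVF phi.
    inputs = layers.Input(shape=in_shape)
    x, skips = inputs, []
    for f in enc_filters:                   # encoding path
        x = multi_res_block(x, f)
        skips.append(x)
    skips.pop()                             # the deepest features are not reused as a skip
    for f, skip in zip(reversed(enc_filters[:-1]), reversed(skips)):  # decoding path
        x = conv_lrelu(x, f)
        x = layers.UpSampling3D(2)(x)
        x = layers.Concatenate()([x, skip])
    x = layers.UpSampling3D(2)(x)           # back to the full 128x128x32 grid
    x = conv_lrelu(x, 16)
    phi = layers.Conv3D(3, 3, padding="same", name="flow")(x)
    return tf.keras.Model(inputs, phi, name="res_voxel")

model = build_res_voxel()                   # output shape: (None, 128, 128, 32, 3)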

2.3 Spatial Transformer Network

The spatial transformer network [8] designed in this paper is used to calculate 𝑀(𝜙). The structure is shown in Fig. 6:


Fig. 6. Spatial Transformer Networks as shown the region of blue dash line in Fig. 1

The network is composed of the input, the grid generator, the sampler and the output. The input consists of the moving image 𝑀 and the registration field 𝜙, and the output is the moved image 𝑀(𝜙). The grid generator calculates the coordinates in 𝑀 according to the coordinates in 𝑀(𝜙) and the parameter 𝜙, because each voxel value in 𝑀(𝜙) must be extracted from 𝑀 and then filled in. The sampler fills 𝑀(𝜙) according to the coordinates obtained by the grid generator and the corresponding voxel values in 𝑀. For each voxel 𝑝, the position 𝜙(𝑝) calculated in 𝑀 may be non-integer, while voxel values are only defined at integer coordinates, so linear interpolation is performed over the eight adjacent voxels in the experiment, as follows:

\(M(\phi(p))=\sum_{q \in Z(\phi(p))} M(q) \prod_{i \in\{x, y, z\}}\left(1-\left|\phi_{i}(p)-q_{i}\right|\right)\)      (3)

where 𝑍(𝜙(𝑝)) is the set of voxels adjacent to 𝜙(𝑝), and 𝜙𝑖(𝑝) denotes the i-th coordinate of the point found in 𝑀 that corresponds to voxel 𝑝 in 𝑀(𝜙).
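For illustration, Eq. (3) can be implemented as the following TensorFlow sketch; the function name warp_trilinear and the clamping of out-of-range neighbours to the volume border are our own choices rather than part of the original implementation.

import tensorflow as tf

def warp_trilinear(moving, flow):
    # Warp a single 3D volume `moving` (D, H, W) with a dense displacement field
    # `flow` (D, H, W, 3) by interpolating over the eight neighbouring voxels, as in Eq. (3).
    moving = tf.cast(moving, tf.float32)
    flow = tf.cast(flow, tf.float32)
    shape = tf.shape(moving)
    d, h, w = shape[0], shape[1], shape[2]
    # Voxel coordinates p and the sampling locations phi(p) = p + flow(p).
    grid = tf.stack(tf.meshgrid(tf.range(d), tf.range(h), tf.range(w), indexing="ij"), axis=-1)
    loc = tf.cast(grid, tf.float32) + flow
    lo = tf.floor(loc)
    frac = loc - lo
    out = tf.zeros_like(moving)
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                off = tf.constant([dz, dy, dx], tf.float32)
                corner = lo + off
                # Clamp neighbours that fall outside the volume to the border voxel.
                q = tf.stack([
                    tf.clip_by_value(tf.cast(corner[..., 0], tf.int32), 0, d - 1),
                    tf.clip_by_value(tf.cast(corner[..., 1], tf.int32), 0, h - 1),
                    tf.clip_by_value(tf.cast(corner[..., 2], tf.int32), 0, w - 1)], axis=-1)
                # Per-axis weight 1 - |phi_i(p) - q_i|, multiplied over x, y, z.
                weight = tf.reduce_prod(off * frac + (1.0 - off) * (1.0 - frac), axis=-1)
                out += weight * tf.gather_nd(moving, q)
    return out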

2.4 Loss Function and Evaluation Indicators

In the experiment, AdaBound [21] is used as the optimization algorithm of the model. The parameters are set as follows: learning rate 0.001, exponential decay rate of the first-order moment estimate 0.9, exponential decay rate of the second-order moment estimate 0.99, and perturbation value 1e-8.

A loss function 𝐿 is required by the AdaBound optimization algorithm, and the mean square error (MSE) is used. The MSE function curve is continuous and easy to differentiate, which is convenient for the AdaBound optimization algorithm. Moreover, as the error decreases, the gradient also decreases, which aids convergence and makes the result more stable. The MSE evaluates the degree of the transformation through the expected value of the squared difference between the estimated and true values; in this paper, it is expressed as the difference between the corresponding voxels of the warped image 𝑀(𝜙) and the fixed image 𝐹.

Since a large deformation may occur after the warping transformation 𝑀(𝜙) with linear interpolation, a bending energy penalty is added to make the field smoother:

\(L_{\text{smooth}}(\phi)=\sum_{p \in \Omega}\|\nabla \phi(p)\|^{2}\)       (4)

The final loss function is:

\(L(F, M, \phi)=MSE(F, M(\phi))+L_{\text{smooth}}(\phi)\)       (5)
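Putting the pieces together, the following sketch shows one unsupervised training step with this loss, reusing the build_res_voxel and warp_trilinear sketches above. Since AdaBound is not part of core TensorFlow, Adam is used here as a stand-in configured with the hyperparameters stated above; an AdaBound implementation can be substituted to match the paper exactly.

import tensorflow as tf

def smoothness_loss(phi):
    # Eq. (4): squared spatial gradients of the displacement field
    # (a mean is used instead of a sum here, which only changes the scale).
    dz = phi[1:, :, :, :] - phi[:-1, :, :, :]
    dy = phi[:, 1:, :, :] - phi[:, :-1, :, :]
    dx = phi[:, :, 1:, :] - phi[:, :, :-1, :]
    return tf.reduce_mean(dz ** 2) + tf.reduce_mean(dy ** 2) + tf.reduce_mean(dx ** 2)

def total_loss(fixed, moving, phi):
    # Eq. (5): MSE between F and M(phi) plus the smoothness penalty.
    warped = warp_trilinear(moving, phi)
    return tf.reduce_mean((fixed - warped) ** 2) + smoothness_loss(phi)

model = build_res_voxel()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.99, epsilon=1e-8)

@tf.function
def train_step(fixed, moving):
    # fixed, moving: single-channel float32 volumes of shape (128, 128, 32, 1).
    with tf.GradientTape() as tape:
        pair = tf.concat([fixed, moving], axis=-1)[tf.newaxis]   # (1, 128, 128, 32, 2)
        phi = model(pair, training=True)[0]                      # (128, 128, 32, 3)
        loss = total_loss(fixed[..., 0], moving[..., 0], phi)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss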

We use the Dice similarity coefficient (Dice) [22], the Intersection over Union (IOU) [23] and the Hausdorff Distance (HD) [24] to evaluate the network performance. Both Dice and IOU measure the similarity of two sets; in the experiments, they are used to evaluate the registration and correction results. The larger the Dice and IOU values, the more similar the corrected image is to the fixed image. The HD measures the boundaries: the smaller the HD value, the smaller the difference between the corrected image and the fixed image.
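As a concrete reference, the three indices can be computed on binary masks as follows (a NumPy/SciPy sketch; extracting the masks from the corrected and fixed images, e.g., by thresholding or segmentation, is assumed to have been done beforehand):

import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice(a, b):
    # Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def iou(a, b):
    # Intersection over union: |A ∩ B| / |A ∪ B|.
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

def hausdorff(a, b):
    # Symmetric Hausdorff distance between the voxel coordinates of two masks.
    pa, pb = np.argwhere(a), np.argwhere(b)
    return max(directed_hausdorff(pa, pb)[0], directed_hausdorff(pb, pa)[0])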

3. Experiments

3.1 Dataset

In this paper, a total of 120 geometric phantoms of small cylinders under different frequencies and motion directions are simulated for simple respiratory motion by GATE (GEANT4 Application for Tomographic Emission) [25]. The respiratory motion amplitude of each simulation is between -3 cm and 3 cm, the motion period is between 4 s and 5 s, the radioactivity is 3000 kBq, the image size is 128×128×16, and the image resolution is 3.125 mm×3.125 mm×4.25 mm.

In addition, 100 voxel phantoms with different respiratory frequencies and motion amplitudes are simulated by NCAT (NURBS [Non-Uniform Rational B-Splines] Cardiac Torso) [26], which simulates a dynamic phantom of the body's organs under respiratory motion. The respiratory motion amplitude of each model is set between 1 cm and 3 cm, the respiratory cycle is 5 s, the radioactivity of the lungs is set to 3 kBq and that of the tumor to 20 kBq, the simulated image size is 128×128×32, and the image resolution is 3.125 mm×3.125 mm×4.25 mm. The specific parameter settings are shown in Table 2.

Table 2. Phantom parameter setting


Fig. 7 shows the experimental process. The raw data generated by the GATE and NCAT simulations are stored in ROOT-format files. In the experiment, STIR [27] and the OSEM algorithm are used to reconstruct the data. Then the obtained data are cropped, zero-centered and normalized so that the voxel values of the image lie in the range 0-1, with the mean of the data becoming 0 and the standard deviation 1, which makes gradient descent more effective and accelerates the convergence of the model. The geometric phantoms are divided into a training set and a test set at a ratio of 8:2 (96 cases for training and 24 for testing). The voxel phantoms are divided into a training set and a test set at a ratio of 9:1 (90 cases for training and 10 for testing).


Fig. 7. Experimental process. The experimental datasets are divided into geometric phantoms, voxel phantoms and clinical data. In the training stage, the corresponding registration model is trained through the Res-Voxel model according to the different datasets. In the test phase, a new image is input to the model, which outputs the corrected image.
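A minimal sketch of this preprocessing and split is given below; the centre-crop, the standardise-then-rescale ordering, and the helper names are our reading of the description above rather than the exact pipeline.

import numpy as np

def preprocess(vol, crop_shape=(128, 128, 32)):
    # Centre-crop, zero-centre, and rescale a reconstructed volume to the range [0, 1].
    starts = [(s - c) // 2 for s, c in zip(vol.shape, crop_shape)]
    vol = vol[tuple(slice(s, s + c) for s, c in zip(starts, crop_shape))]
    vol = (vol - vol.mean()) / (vol.std() + 1e-8)                  # zero mean, unit std
    return (vol - vol.min()) / (vol.max() - vol.min() + 1e-8)      # voxel values in [0, 1]

def split(cases, train_ratio=0.8, seed=0):
    # Random 8:2 (geometric phantoms) or 9:1 (voxel phantoms) train/test split.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(cases))
    n_train = int(round(train_ratio * len(cases)))
    return [cases[i] for i in idx[:n_train]], [cases[i] for i in idx[n_train:]]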

3.2 Computation Environments

This paper uses PyCharm as the development environment. The programming language is Python 3.6, the experimental framework is TensorFlow, and the computer hardware configuration is an Intel(R) Core(TM) i7-9700KF CPU @ 3.60 GHz, 64 GB RAM, and an NVIDIA GeForce RTX 2080 Ti with 11 GB of graphics memory, running 64-bit Windows 10.

4. Results

4.1 Geometric phantom correction

The geometric phantom was used for the simple-respiration registration experiment. In the training phase, we input a fixed image and a series of moving images into the network, as shown in Fig. 1. In the test phase, we input a fixed image and a moving image into the trained registration model for end-to-end registration and correction. The following is the evaluation and analysis of the results on the 2D slices of the transverse, coronal and sagittal views of the geometric phantom.

Fig. 8 shows the artifact correction result for one of the samples. Rows 1 to 3 are the transverse, coronal and sagittal images, respectively. The columns, from left to right, are the moving image with artifacts, the fixed image without artifacts, and the results of the compared methods. The experimental results are compared with the traditional VoxelMorph [13] and AntsSyNQuick [14] methods and the state-of-the-art FAIM [15] and Dual-PRNet [16] methods.


Fig. 8. Correction results of geometry phantom

It can be seen from Fig. 8 that there is a strong artifact effect around the moving image in the first column and the small cylinder becomes blurred, while all methods show almost no artifacts after registration correction. VoxelMorph corrects most artifacts well, and Res-Voxel has a better correction effect on the periphery of the small cylinder, which also shows the effectiveness of the proposed method.

In the experiment, Dice [22], IOU [23] and HD [24] are used to quantitatively analyze the experimental results and to compare the images before and after registration, as shown in Table 3:

Table 3. PET/CT geometry phantom evaluation results


It can be concluded that the evaluation values after registration are greatly improved compared with those before registration, especially in the Dice and IOU similarity indicators. Res-Voxel achieved Dice values of 89.04%, 93.81% and 90.13% in the transverse, coronal and sagittal planes, respectively, and consistently outperforms the other three methods. The difference from the fixed image is smaller, and the Dice value of the coronal plane is increased by about 0.90% compared with VoxelMorph.

4.2 Voxel phantom correction

To further verify the effectiveness of the Res-Voxel framework, lung voxel phantom data are simulated with NCAT on the basis of the internal tissues and organs of real human body data. In the registration experiment on the voxel phantom, we divide a 5 s breathing cycle into 10 frames of 0.5 s each, select one of the images (the sixth frame in this experiment) as the fixed image, input the other frames as moving images into the network model for registration correction, and finally merge the obtained results to realize the artifact correction of the breathing motion. The results of the registration correction are shown in Fig. 9-Fig. 11:


Fig. 9. Transverse correction results of voxel phantom


Fig. 10. Coronal correction results of voxel phantom


Fig. 11. Sagittal correction results of voxel phantom
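The frame-wise procedure described at the beginning of this subsection can be sketched as follows, reusing the warp_trilinear sketch from Section 2.3 and a trained model. The voxel-wise sum used to merge the corrected frames is an assumption, since the exact combination rule is not stated.

import tensorflow as tf

def correct_cycle(frames, model, fixed_idx=5):
    # frames: ten float32 volumes of shape (128, 128, 32, 1), one per gated frame.
    fixed = tf.cast(frames[fixed_idx], tf.float32)
    corrected = [fixed[..., 0]]
    for i, moving in enumerate(frames):
        if i == fixed_idx:
            continue
        moving = tf.cast(moving, tf.float32)
        pair = tf.concat([fixed, moving], axis=-1)[tf.newaxis]
        phi = model(pair, training=False)[0]
        corrected.append(warp_trilinear(moving[..., 0], phi))
    # Merge: a voxel-wise sum of the corrected frames (assumed combination rule).
    return tf.add_n(corrected).numpy()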

It can be seen from Fig. 9-Fig. 11 that, compared with the fixed image, the moving image has artifacts both around the lung and around the tumor, making it difficult to observe their exact position and structure. The image quality after registration correction is significantly improved, and Res-Voxel corrects the tumor artifacts better than the other methods, especially on the coronal and sagittal planes, where it corrects the edges of the organs and the tumor better. The specific evaluation results are shown in Table 4.

Table 4. PET/CT voxel phantom evaluation results


It can be seen from Table 4 that the evaluation results after correction are greatly improved compared with those before correction. The Dice indices of Res-Voxel in the transverse, coronal and sagittal planes are 71.91%, 81.75% and 83.25%, respectively, which are closer to the fixed image and approximately 1.02% higher than Dual-PRNet on the coronal plane, where the improvement is most obvious. This also illustrates the effectiveness of the proposed approach.

4.3 Clinical data correction

In this section, relevant experiments are carried out on clinical data, and the experimental results are evaluated and analyzed, so as to verify the feasibility of the proposed method for the correction of respiratory motion artifacts.

The clinical data were collected from the PET/CT center of the First People's Hospital of Yunnan Province, China. All patients were examined on a Philips Ingenuity TF PET/CT scanner, with Philips IntelliSpace Portal v7.0.4.20175 used for post-processing. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of the First People's Hospital of Yunnan Province (No. 2017YYLH035). Prior informed consent to participate was obtained from all participants.

A total of eight patients' data were used in this experiment; the original data were whole-body PET images of size 144×144×234. The original PET images are cropped to a size of 128×128×60; zero-mean normalization is applied to speed up the training of the network; and the images are then linearly stretched to improve the contrast of the PET images and highlight the important information in them. A preprocessed PET image is shown in Fig. 12.
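A small sketch of the linear stretching step is given below; the percentile-based bounds are assumptions for illustration, as the exact stretch parameters are not specified.

import numpy as np

def linear_stretch(vol, low_pct=1.0, high_pct=99.0):
    # Linearly map the [low, high] percentile range of the PET volume to [0, 1].
    lo, hi = np.percentile(vol, [low_pct, high_pct])
    return np.clip((vol - lo) / (hi - lo + 1e-8), 0.0, 1.0)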


Fig. 12. PET image comparison before and after preprocessing

In this experiment, the PREVIEW image is selected as the fixed image. The PREVIEW image is the predicted image generated by the PET/CT scanner using the gating method during the reconstruction process; therefore, the positions and shapes of the corresponding organ tissues and tumors in the obtained image are relatively fixed, but the data utilization rate is low and the signal-to-noise ratio of the obtained image is low, as shown by the fixed image in Fig. 13.


Fig. 13. Correction results of clinical data

As Fig. 13 shows, compared with the fixed image, the heart region of the moving image has become blurred with no obvious boundary, and the volume of the tumor is larger than in the fixed image because of the respiratory motion. After correction, the tumor is slightly reduced and closer to the fixed image, and organs such as the heart become slightly clearer. To analyze the experimental results more objectively, the experiments are combined with the relevant evaluation indices for quantitative analysis. The specific results are shown in Table 5.

Table 5. Clinical data evaluation results


As can be seen from Table 5, the Dice similarity values obtained by the registration results of the proposed method in the transverse, coronal and sagittal planes are 77.86%, 75.10% and 74.66%, increases of 6.98%, 7.32% and 12.21%, respectively. Combined with the subjective evaluation and the quantitative analysis, this shows that the registration correction method in this paper has a definite correction effect on clinical PET respiratory motion images.

5. Conclusion

For intricate PET images suffering from noise and lacking clear boundaries, we proposed an unsupervised framework, Res-Voxel, for PET image artifact correction. The network integrates multiple residual blocks and uses several smaller convolution operations to improve the model's ability to predict the deformation field while reducing the number of parameters and the amount of computation. We performed experimental verification on simulated geometric phantom data, voxel phantom data and clinical data, which shows that the proposed approach can effectively reduce respiratory motion artifacts in PET images and improve image quality.

References

  1. Siegel R L, Miller K D, Goding Sauer A, et al, "Colorectal cancer statistics, 2020," CA: a cancer journal for clinicians, vol. 70, no. 3, pp. 145-164, 2020. https://doi.org/10.3322/caac.21601
  2. Cherry S R, Dahlbom M, "PET: physics, instrumentation, and scanners," in PET, New York, NY: Springer, 2006, pp. 1-117.
  3. He J, O'Keefe G J, Gong S J, et al, "A novel method for respiratory motion gated with geometric sensitivity of the scanner in 3D PET," IEEE transactions on nuclear science, vol. 55, no. 5, pp. 2557-2565, 2008. https://doi.org/10.1109/TNS.2008.2001187
  4. Florian Büther, Iris Ernst, Mohammad Dawood, Peter Kraxner, Michael Schäfers, Otmar Schober, Klaus P. Schäfers, "Detection of respiratory tumour motion using intrinsic list mode-driven gating in positron emission tomography," European Journal of Nuclear Medicine and Molecular Imaging, vol. 37, pp. 2315-2327, 2010. https://doi.org/10.1007/s00259-010-1533-y
  5. LeCun Y, Bengio Y, Hinton G, "Deep learning," nature, vol. 521, pp. 436-444, 2015. https://doi.org/10.1038/nature14539
  6. He K, Zhang X, Ren S, et al, "Deep residual learning for image recognition," in Proc. of the IEEE conference on computer vision and pattern recognition, pp. 770-778, 2016.
  7. Nabil Ibtehaz, M. Sohel Rahman, "MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation," Neural Networks, vol. 121, pp. 74-87, 2020. https://doi.org/10.1016/j.neunet.2019.08.025
  8. Jaderberg M, Simonyan K, Zisserman A, et al, "Spatial transformer networks," Advances in neural information processing systems, vol. 28, pp. 2017-2025, 2015.
  9. Wu Guorong, Kim Minjeong, Wang Qian, Munsell Brent C, Shen Dinggang, "Scalable HighPerformance Image Registration Framework by Unsupervised Deep Feature Representations Learning," IEEE transactions on biomedical engineering, vol. 63, no. 7, pp. 1505-1516, 2016. https://doi.org/10.1109/TBME.2015.2496253
  10. Sokooti H, De Vos B, Berendsen F, et al, "Nonrigid image registration using multi-scale 3D convolutional neural networks," in Proc. of International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 232-239, 2017.
  11. de Vos B D, Berendsen F F, Viergever M A, et al, "End-to-end unsupervised deformable image registration with a convolutional neural network," in Proc. of International Workshop on Deep learning in medical image analysis and multimodal learning for clinical decision support, pp. 204-212, 2017.
  12. Li H, Fan Y, "Non-rigid image registration using fully convolutional networks with deep self-supervision," arXiv preprint arXiv:1709.00799, 2017.
  13. Balakrishnan G, Zhao A, Sabuncu M R, et al, "An unsupervised learning model for deformable medical image registration," in Proc. of the IEEE conference on computer vision and pattern recognition, pp. 9252-9260, 2018.
  14. Avants B B, Tustison N J, Song G, et al, "A reproducible evaluation of ANTs similarity metric performance in brain image registration," Neuroimage, vol. 54, no. 3, pp. 2033-2044, 2011. https://doi.org/10.1016/j.neuroimage.2010.09.025
  15. Kuang D, Schmah T, "Faim-a convnet method for unsupervised 3d medical image registration," in Proc. of International Workshop on Machine Learning in Medical Imaging, Springer, Cham, pp. 646-654, 2019.
  16. Hu X, Kang M, Huang W, et al, "Dual-stream pyramid registration network," in Proc. of International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Cham, pp. 382-390, 2019.
  17. M. Faisal Beg, Michael I. Miller, Alain Trouve, Laurent Younes, "Computing Large Deformation Metric Mappings via Geodesic Flows of Diffeomorphisms," International Journal of Computer Vision, vol. 61, no. 2, pp. 139-157, 2005. https://doi.org/10.1023/B:VISI.0000043755.93987.aa
  18. Isola P, Zhu J Y, Zhou T, et al, "Image-to-image translation with conditional adversarial networks," in Proc. of the IEEE conference on computer vision and pattern recognition, pp. 1125-1134, 2017.
  19. Ronneberger O, Fischer P, Brox T, "U-net: Convolutional networks for biomedical image segmentation," in Proc. of International Conference on Medical image computing and computer-assisted intervention, pp. 234-241, 2015.
  20. Szegedy C, Liu W, Jia Y, et al, "Going deeper with convolutions," in Proc. of the IEEE conference on computer vision and pattern recognition, pp. 1-9, 2015.
  21. Luo L, Xiong Y, Liu Y, et al, "Adaptive gradient methods with dynamic bound of learning rate," arXiv preprint arXiv:1902.09843, 2019.
  22. Dice L R, "Measures of the amount of ecologic association between species," Ecology, vol. 26, no. 3, pp. 297-302, 1945. https://doi.org/10.2307/1932409
  23. Rahman M A, Wang Y, "Optimizing intersection-over-union in deep neural networks for image segmentation," in Proc. of International symposium on visual computing, pp. 234-244, 2016.
  24. Blumberg H, "Hausdorff's Grundzuge der Mengenlehre," Bulletin of the American Mathematical Society, vol. 27, no. 3, pp. 116-129, 1920. https://doi.org/10.1090/S0002-9904-1920-03378-1
  25. Jan S, Santin G, Strul D, et al, "GATE: a simulation toolkit for PET and SPECT," Physics in Medicine & Biology, vol. 49, no. 19, p. 4543, 2004. https://doi.org/10.1088/0031-9155/49/19/007
  26. Segars W P, "Development and application of the new dynamic NURBS-based cardiac-torso (NCAT) phantom," The University of North Carolina at Chapel Hill, 2001.
  27. Fuster B M, Falcon C, Tsoumpas C, et al, "Integration of advanced 3D SPECT modeling into the open-source STIR framework," Medical physics, vol. 40, no. 9, 2013.