I. INTRODUCTION
Brain metastases most frequently originate from lung cancer, breast cancer, and malignant melanoma [1]. Owing to late detection and incorrect diagnosis, most cases have extremely limited treatment options [2]. Every year, 14,000 patients die because of malignant tumors [3]. According to the World Health Organization, tumors are divided into different grades, and the criterion for this division is tumor size [4], [5]. Surgery for tumor removal is a common treatment approach, and radiation and chemotherapy can be used to slow down the growth of tumors; unfortunately, radiation and chemotherapy cannot physically remove the cancer [6]. Physicians typically use computed tomography or magnetic resonance imaging (MRI) to detect cancer. MRI is a standard scanning technology that provides detailed images of the brain, and it is the most common modality for diagnosing brain diseases [6], [7]. In addition, determining the number, size, and location of lesions in the brain is important for selecting the most appropriate treatment methods for patients [8].
To find extremely small lesions, many researchers have attempted to develop high-performance deep learning algorithms. Wu et al. [9] used axial MR brain images in their study, wherein gray-level MR brain images were transformed into RGB color images that emphasized a brain tumor. The transformed color images were then processed in the CIELab color space, where color segmentation of the highlighted lesions was used to identify features accurately.
Akram et al. [10] proposed a brain tumor segmentation and detection method. In this work, input MRI brain images were pre-processed to remove noise using a median filter, resulting in sharper images. This method masked images and applied a windowing technique to detect tumors. These processes reduced false segmented pixels and helped algorithms to detect tumors more effectively.
Dong et al. [11] used a U-Net-based deep convolutional network to develop a fully automatic brain tumor detection and segmentation method. The performance of their developed model was on par with that of the recent challenger model of multimodal brain tumor image segmentation. In this study, brain tumor detection and segmentation algorithms were able to detect lesions; however, U-Net was found to be slow in clinical tasks.
Dahab et al. [12] segmented brain tumors to detect lesions. They used probabilistic neural networks (PNNs) for model learning. The proposed system achieved 100% accuracy in PNN learning performance; however, this accuracy held only under limited conditions. In addition, their learning vector quantization-based PNN system reduced the processing time compared to the conventional PNN.
In recent years, most researchers have progressed to segmentation. In studies using segmentation, different preprocessed anatomical slices of brain images in the sagittal plane are considered for detection. Brain segmentation performs better than detection, but the segmentation process is slower. In clinical tasks, AI helps doctors to detect and locate tumors. Processing speed is an essential factor in analyzing and finding tumors in many patients. Object detection presents an advantage in finding object locations faster, and its pre-processing is more efficient than segmentation. The most crucial aspect of this technique is locating the tumor. Finding an extremely small tumor is difficult, and sometimes doctors can miss the tumor’s existence. AI detection can help in avoiding such an oversight.
In this study, we compared three differently pre-processed sets of MR images (original, GHE, and CLAHE) to measure the AI performance. We applied the general histogram equalization (GHE) and contrast-limited adaptive histogram equalization (CLAHE) techniques. These techniques enhanced the contrast of the MR images such that lesions could be more easily located. Pre-processing was performed before training the AI to increase the model performance; this was a significant step in the process.
II. METHOD
2.1. Data Acquisition
Brain tumor MR images were collected from Seoul National University Hospital, Seoul, Republic of Korea, in 2016. A total of 11,200 MR images were collected from 64 patients. These MR images were acquired in the sagittal plane; among all the images collected, 7,875 MR images from 45 patients were used as training data and 3,325 MR images from 19 patients were used as test data to validate performance. For this data set, experts annotated the regions of interest (ROIs) for the lesions using ImageJ software (NIH, Bethesda, MD, USA). The ROIs were indicated through free-hand drawing, and the ROI information was considered the ground truth for the training and testing of the model.
2.2. Development Environment
In this study, we utilized Python 3.6 to pre-process the data. For the training of deep learning algorithms, we used a single NVIDIA GeForce RTX 2080Ti GPU (NVIDIA, Santa Clara, CA, USA), TensorFlow 2.0.0, Keras 2.3.1 with the TensorFlow backend, and OpenCV (Intel, Santa Clara, CA, USA).
2.3. Pre-processing
Fig. 1 presents the overall pre-processing and deep learning process. For deep learning training, the convolutional neural network requires input images of a uniform size with identical vertical and horizontal dimensions. We therefore rescaled the collected images to 512 px vertically and horizontally. The MR images were stored as 12-bit digital imaging and communications in medicine (DICOM) files, which we converted into 8-bit JPG files. The data had varying window levels and widths; therefore, we had to harmonize the data when converting the DICOM images into JPG images. Our radiologist adjusted the window level and width to make the MR images clearer; for the conversion, the window level was set to 1100 and the width to 1500. To enhance and compare the training performances, we applied GHE and CLAHE. GHE enhances the contrast of images and flattens the density distribution [13], and CLAHE divides the image into block tiles and applies histogram equalization to each tile [14]. Thus, CLAHE can emphasize local contrast more accurately. These two histogram equalization techniques were used to emphasize the lesions in this study. The resulting contrast-enhanced images improved the model performance. We set the CLAHE clip limit to 2.0 and the tile grid size to 8 × 8. In addition, we measured the maximum and minimum values of the x and y coordinates in each ROI and used them to convert the free-hand drawings into box-shaped ROIs. Fig. 2 shows the converted MR images. Fig. 3 shows the histograms of the MR images.
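The intensity conversion and global equalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: it assumes NumPy is available, the function names `apply_window` and `equalize_global` are hypothetical, and the input is a 2-D array of raw 12-bit intensities. The default level and width are the values reported in this section.

```python
import numpy as np


def apply_window(raw, level=1100.0, width=1500.0):
    """Map raw intensities to 8-bit using a window level and width."""
    lower = level - width / 2.0  # intensities at or below this become 0
    upper = level + width / 2.0  # intensities at or above this become 255
    scaled = (np.asarray(raw, dtype=np.float64) - lower) / (upper - lower)
    return np.round(np.clip(scaled, 0.0, 1.0) * 255.0).astype(np.uint8)


def equalize_global(image_8bit):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(image_8bit.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value at the darkest occupied level
    total = cdf[-1]
    if total == cdf_min:  # constant image: nothing to stretch
        return image_8bit.copy()
    # Look-up table that spreads the occupied gray levels over 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image_8bit]


# Toy 12-bit "slice"; the values are illustrative only.
raw = np.array([[0, 350, 1100], [1475, 1850, 4095]])
windowed = apply_window(raw)
equalized = equalize_global(windowed)
```

In an OpenCV-based pipeline, `cv2.equalizeHist` provides global equalization and `cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))` provides CLAHE with the clip limit and tile grid size reported here; the NumPy version is shown only to make the mapping explicit.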
Fig. 1. Functional block diagram of the overall process.
Fig. 2. Example of training images: (a) Original 8-bit JPG image, (b) CLAHE image, (c) GHE image.
Fig. 3. Histogram of training images: (a) Original image, (b) CLAHE image, (c) GHE image.
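The conversion of a free-hand ROI into a box-shaped ROI, as described in this section, amounts to taking the extreme x and y coordinates of the outline. The sketch below assumes the outline is available as a list of (x, y) pixel coordinates; the function names are hypothetical, and the rescaling helper is an added assumption for the case where the annotation was drawn on an image that was later resized to 512 × 512.

```python
def roi_to_box(points):
    """Return (x_min, y_min, x_max, y_max) enclosing a free-hand outline."""
    if not points:
        raise ValueError("ROI outline must contain at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def rescale_box(box, src_size, dst_size=(512, 512)):
    """Scale a box from the source (width, height) to the target size."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    x_min, y_min, x_max, y_max = box
    return x_min * sx, y_min * sy, x_max * sx, y_max * sy


# Illustrative outline drawn on a hypothetical 256 x 256 image.
outline = [(120, 88), (131, 80), (140, 95), (128, 104)]
box = roi_to_box(outline)
box_512 = rescale_box(box, src_size=(256, 256))
```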
2.4. ResNet-Based Convolutional Network
In this study, we used RetinaNet for learning and testing the detection of brain tumors. RetinaNet is a well-known deep learning detection algorithm with a good training speed. RetinaNet uses a ResNet network as its backbone; we used ResNet-152 for more specific learning. Lin et al. [15] describe the architecture as follows:
RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks. The backbone is responsible for computing a convolutional feature map over an entire input image and is an off-the-shelf convolutional network. The first subnet performs convolutional object classification on the backbone’s output; the second subnet performs convolutional bounding box regression (Fig. 4).
Fig. 4. Structure of the RetinaNet.
Moreover, RetinaNet is a one-stage detector, so it runs as fast as other one-stage networks. One-stage detectors suffer from a class imbalance problem between foreground and background examples; RetinaNet addresses this problem with a dedicated loss function, the focal loss. For training, we used a batch size of 8, 100 epochs, a learning rate of 0.00001, and an image size of 512 × 512. Additionally, transfer learning from ImageNet was applied to the model.
2.5. Evaluation
In this study, we verified the model performance using a confusion matrix for the three types of pre-processed images. The confusion matrix is a method for comparing the predicted values with the actual values. This technique is used to calculate the sensitivity, precision, and false positives per image. We verified the model detection performance using the free-response receiver operating characteristic (FROC) curve. To illustrate the FROC curves, we calculated the sensitivity and false positives per image; for the sensitivity, we used a 95% confidence interval.
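To make the FROC construction concrete, the sketch below computes one (false positives per image, sensitivity) point per confidence threshold. It is an illustration under simplifying assumptions rather than the evaluation code of this study: the function name `froc_points` is hypothetical, each detection is assumed to be already labeled as a true or false positive, and each true-positive detection is assumed to hit a distinct lesion.

```python
def froc_points(detections, n_lesions, n_images, thresholds):
    """Return (false positives per image, sensitivity in %) per threshold.

    detections: list of (confidence score, is_true_positive) pairs pooled
    over the whole test set.
    """
    points = []
    for t in thresholds:
        # Keep only detections whose confidence reaches the threshold.
        kept = [is_tp for score, is_tp in detections if score >= t]
        tp = sum(kept)
        fp = len(kept) - tp
        points.append((fp / n_images, 100.0 * tp / n_lesions))
    return points


# Illustrative detections only; not data from this study.
detections = [(0.95, True), (0.90, True), (0.80, False),
              (0.60, True), (0.40, False), (0.30, False)]
points = froc_points(detections, n_lesions=4, n_images=10,
                     thresholds=[0.9, 0.5, 0.2])
```

Lowering the threshold moves along the curve toward more false positives per image and, at best, higher sensitivity, which is the trade-off the FROC curve visualizes.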
III. RESULT
In the proposed method, we verified the model performance using 3,325 MR images (the test set) from 19 patients. We compared the original images with those pre-processed through GHE and CLAHE. Fig. 5 illustrates the detection results for the three kinds of MR images. On the original image in Fig. 5, the model failed to find the location of the tumor. However, the model found the tumor on the GHE and CLAHE MR images. GHE and CLAHE increased the contrast of the images, emphasizing the area of the lesions.
Fig. 5. Detection result: (a) Original, (b) CLAHE, (c) GHE.
The measurement of model performance involved evaluation of the sensitivity, false positives per image, and precision. To calculate these values, we obtained a confusion matrix. In the confusion matrix, a true positive suggested that the model detected a tumor correctly, a false positive indicated that the model incorrectly detected normal tissue as a tumor, and a false negative indicated that the model failed to find a tumor. The sensitivity, false positives per image, and precision were calculated using equations (1) to (3). The calculated values are presented in Table 1.
\(\text{Sensitivity} = \frac{TP}{TP + FN} \times 100\) (1)
\(\text{False positives per image} = \frac{FP}{\text{Images}}\) (2)
\(\text{Precision} = \frac{TP}{TP + FP} \times 100\) (3)
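Equations (1) to (3) translate directly into code. The following sketch uses a hypothetical function name and made-up counts chosen only to illustrate the arithmetic; they are not the counts obtained in this study.

```python
def detection_metrics(tp, fp, fn, n_images):
    """Return (sensitivity %, false positives per image, precision %)."""
    if tp + fn == 0 or tp + fp == 0 or n_images == 0:
        raise ValueError("metrics are undefined for empty denominators")
    sensitivity = 100.0 * tp / (tp + fn)   # equation (1)
    fp_per_image = fp / n_images           # equation (2)
    precision = 100.0 * tp / (tp + fp)     # equation (3)
    return sensitivity, fp_per_image, precision


# Illustrative counts only.
sensitivity, fp_per_image, precision = detection_metrics(
    tp=80, fp=5, fn=20, n_images=100)
```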
Table 1. Calculated values of the sensitivity, precision, and false positives per image.
For the original (non-equalized) test images, the sensitivity was 80.06%, the precision was 95.85%, and the number of false positives per image was 0.038. For the GHE images, the sensitivity was 80.63%, the precision was 94.58%, and the number of false positives per image was 0.050. For the CLAHE images, the sensitivity was 81.79%, the precision was 94.02%, and the number of false positives per image was 0.057.
From these results, it can be concluded that CLAHE pre-processing yields the highest sensitivity among the three approaches, at the cost of slightly lower precision and slightly more false positives per image. Furthermore, Fig. 6 presents FROC curves to analyze the model performance. The curves compare the three data sets in terms of how the sensitivity varies with the number of false positives per image.
Fig. 6. FROC graph for the original, CLAHE, and GHE images.
IV. DISCUSSION
In this study, we compared three types of pre-processed MR images to evaluate and improve model performance. In addition, we utilized transfer learning from ImageNet to enhance the learning effect. For the detection model, we chose RetinaNet, a one-stage network that retains the speed of one-stage detectors while reaching accuracy comparable to that of two-stage detectors. Brain tumor MR images present an inevitable class imbalance problem: the number of normal MR image slices exceeds that of the abnormal slices. To address this problem, we used the focal loss function contained in RetinaNet. In the definition of focal loss, the importance of easy examples is reduced so that training focuses on hard examples [14]. Therefore, this model is a suitable alternative to segmentation for the detection of brain tumors.
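The way focal loss down-weights easy examples can be seen in a small numerical sketch. The function below follows the binary form given by Lin et al., FL(p_t) = -α_t (1 - p_t)^γ log(p_t), with their commonly cited defaults γ = 2 and α = 0.25; whether these exact values were used in this study is an assumption, and the function name is hypothetical.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction.

    p: predicted probability of the positive class; y: 1 (lesion) or 0.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-7), 1.0)  # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident, correct prediction
hard = focal_loss(0.1, 1)  # confident, wrong prediction
```

With γ = 0 and α = 1 the expression reduces to ordinary cross-entropy, so the modulating factor is the only thing that separates the two losses.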
Detecting brain tumors is challenging because of the varying shapes and sizes of tumors. To provide a better learning solution to the tumor detection model, image pre-processing was a significant step. Pre-processing emphasized the lesions, which the model could then distinguish from the normal area. In gray-level images, histogram equalization techniques provided contrast enhancement, changing the intensity of similar pixels [15]. CLAHE yielded the highest sensitivity for detecting brain tumors. CLAHE is a type of histogram equalization that divides an image into block tiles and limits the contrast within each tile, thus limiting noise amplification [16], [17]. In this respect, it is an improvement over ordinary histogram equalization. As a result, the RetinaNet model was easier to train and demonstrated a high accuracy.
In this study, we compared three differently pre-processed sets of images. The results revealed only a small gap in the model performance. However, the AI found more lesions in the CLAHE pre-processed images than in the other images. The limitation of this study was that only three types of pre-processed images were considered. Compared to other types of tumors, brain tumors have different patterns, sizes, and shapes. To increase the detection performance, certain pre-processing tasks must be implemented before detection, such as normalization and data augmentation. For example, data augmentation increases the variety of the training images, which can improve the learning and accuracy of the model. Furthermore, there are some evolved models that exhibit good performance and speed. Utilizing recent models and well-processed training images can improve the accuracy of tumor detection. To find tumors accurately, further research must be conducted, and irregular patterns of extremely small tumors must be considered.
Moreover, we plan to use the mask region-based convolutional neural network (Mask R-CNN) in future works. Mask R-CNN extends Faster R-CNN, which combines Fast R-CNN with a region proposal network, by adding a mask branch in parallel with the classification and localization branches [18]. Faster R-CNN is designed for object detection, whereas Mask R-CNN targets image segmentation. Mask R-CNN improves the processing speed of segmentation. It thus compensates for the main shortcoming of segmentation and would be helpful in future studies on finding tumors.
V. CONCLUSION
In conclusion, this study was based on deep learning detection algorithms, and we proposed a method to increase the detection performance of such algorithms. In future studies, we will develop and improve detection algorithms for AI used in hospitals. We plan to research the detection of brain tumors using recently developed models and well-processed data. Finally, we plan to gather more varied tumor MR images and develop a decision support system that helps doctors find extremely small tumors and irregular patterns of brain tumors.
Acknowledgements
This research was supported by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2021-2017-0-01630) supervised by the Institute for Information and Communications Technology Promotion (IITP) and by the Gachon Gil Medical Center (FRD2019-16-02).
References
- L. Nayak, E. Q. Lee, and P. Y. Wen, "Epidemiology of Brain Metastases," Curr Oncol Reports, vol. 14, no. 1, pp. 48-54, 2012. https://doi.org/10.1007/s11912-011-0203-y
- P. Ghosal, L. Nandanwar, S. Kanchan, A. Bhadra, J. Chakraborty, and D. Nandi, "Brain tumor classification using ResNet-101 based squeeze and excitation deep neural network," in Proceeding of IEEE Second International Conference on Advanced Computational and Communication Paradigms (ICACCP), pp. 1-6, Feb. 2019.
- S. Sarkar, A. Kumar, S. Chakraborty, S. Aich, J. S. Sim, and H. C. Kim, "A CNN based Approach for the Detection of Brain Tumor Using MRI Scans," Test Engineering and Management, 2020.
- Y. Liu, A. Carpenter, H. Yuan, Z. Zhou, M. Zalutsky, G. Vaidyanathan, H. Yan, and T. Vo-Dinh, "Gold nanostar as theranostic probe for brain tumor sensitive PET-optical imaging and image-guided specific photo-thermal therapy," AACR, 2016.
- J. Amin, M. Sharif, M. Yasmin, and S. L. Fernandes, "A distinctive approach in brain tumor detection and classification using MRI," Pattern Recognition Letters, pp. 118-127, 2017.
- M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio, C. Pal, P. Jodoin, and H. Larochelle, "Brain tumor segmentation with deep neural networks," Medical image analysis, vol. 35, pp. 18-31, 2017. https://doi.org/10.1016/j.media.2016.05.004
- P. M. Shakeel, T. E. E. Tobely, H. Al-Feel, G. Manogaran, and S. Baskar, "Neural network based brain tumor detection using wireless infrared imaging sensor," IEEE Access, vol. 7, pp. 5577-5588, 2019. https://doi.org/10.1109/ACCESS.2018.2883957
- L. Sunwoo, Y. J. Kim, S. H. Choi, K. G. Kim, J. H. Kang, Y. Kang, Y. J. Bae, R. E. Yoo, J. Kim, K. J. Lee, S. H. Lee, B. S. Choi, C. Jung, C. H. Sohn, and J. H. Kim, "Computer-aided detection of brain metastasis on 3D MR imaging: Observer performance study," PLoS One, vol. 12, no. 6, 2017.
- M. N. Wu, C. C. Lin, and C. C. Chang, "Brain tumor detection using color-based k-means clustering segmentation," in Proceeding of IEEE Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing, vol. 2, pp. 245-250, 2017.
- M. U. Akram and A. Usman, "Computer aided system for brain tumor detection and segmentation," in Proceeding of IEEE International conference on Computer networks and information technology, pp. 299-302, 2011.
- H. Dong, G. Yang, F. Liu, Y. Mo, and Y. Guo, "Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks," in Proceeding of annual conference on medical image understanding and analysis, Springer, Cham, pp. 506-517, 2017.
- D. A. Dahab, S. S. Ghoniemy, and G. M. Selim, "Automated brain tumor detection and identification using image processing and probabilistic neural network techniques," International journal of image processing and visual communication, vol. 1, no. 2, pp. 1-8, 2012.
- Y. T. Kim, "Contrast enhancement using brightness preserving bi-histogram equalization," IEEE transactions on Consumer Electronics, vol. 43, no. 1, pp. 1-8, 1997. https://doi.org/10.1109/30.580378
- T. Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, "Focal loss for dense object detection," in Proceedings of the IEEE international conference on computer vision, pp. 2980-2988, 2017.
- A. J. Vyavahare and R. C. Thool, "Segmentation using region growing algorithm based on CLAHE for medical images," in Proceeding of Fourth International Conference on Advances in Recent Technologies in Communication and Computing, pp. 182-185, 2012.
- A. M. Reza, "Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement," Journal of VLSI signal processing systems for signal, image and video technology, vol. 38, no. 1, pp. 35-44, 2004. https://doi.org/10.1023/B:VLSI.0000028532.53893.82
- R. K. Rai, P. Gour, and B. Singh, "Underwater image segmentation using CLAHE enhancement and thresholding," International Journal of Emerging Technology and Advanced Engineering, vol. 2, no. 1, pp. 118-123, 2012.
- K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," in Proceedings of the IEEE international conference on computer vision, pp. 2961-2969, 2017.