1. INTRODUCTION
Prostate cancer (PCa) usually refers to prostate adenocarcinoma, where "adeno" denotes the gland and "carcinoma" denotes the uncontrolled growth of cells. This condition affects only males. A PCa tumor is typically considered malignant (cancerous), meaning that the tumor cells can metastasize, or invade and destroy surrounding tissue [1]. The prostate is a small, walnut-shaped gland that lies below the bladder and in front of the rectum. In general, PCa is detected and diagnosed through a biopsy test, with histological scoring performed using the Gleason grading system. This system identifies the two most common cell patterns within a prostate tissue sample and assigns each a score between one and five. A score of 1 represents normal, well-differentiated cells, and a score of 5 represents abnormal, poorly differentiated cells that barely resemble normal prostate tissue [2]. The biopsy is an invasive procedure that causes considerable discomfort to patients. This clinical examination is performed based on the result of the prostate-specific antigen (PSA) test [3]. Using the resulting biopsy images, however, an automatic computer-aided diagnosis (CAD) technique can potentially help pathologists make decisions more accurately.
To guide adjuvant (immunological agent) treatment decisions, many researchers, pathologists, and radiologists are examining the role of histopathological parameters as new biomarkers. However, successful identification of such biomarkers requires pathologists and radiologists with long-term experience [4].
A whole-slide biopsy image is analyzed after staining with hematoxylin and eosin (H&E) compounds. Tissue fixation, embedding, cutting, and staining vary significantly across laboratories and affect the appearance of histopathology samples. To address this problem, stain normalization (SN) techniques have recently been developed; they are useful as a preprocessing step to adjust intensity levels and reduce random noise in the image [5].
In recent years, deep learning (DL) algorithms have attracted considerable attention for solving many types of problems in the medical imaging field. Many researchers around the world are also using artificial neural network (ANN) techniques to analyze various medical images, such as histopathology, magnetic resonance (MR), radiology, ultrasound, and computed tomography (CT) images [6].
In this paper, we introduce a multichannel convolutional neural network (MCCNN) model for multiclass classification and prediction of the histological grades in a prostate biopsy. The model is designed to be easy to implement, train, test, and use. This paper is organized as follows. We first introduce the materials and methods, including data collection, data preprocessing, and the proposed network architecture with a detailed explanation. We then present the experimental results, which show the performance of the MCCNN model. Finally, we conclude the paper with a brief summary.
2. RELATED WORKS
Litjens et al. [7] examined a fully automated CAD system to detect prostate cancer in MRI. The system detects initial candidates using multi-atlas-based prostate segmentation, voxel feature extraction, classification, and local maxima detection. A total of 347 patients (165 with cancer and 182 without cancer) who underwent MR-guided biopsy were used to evaluate the automated system.
Kim et al. [8] used prostate cancer samples of benign, grade 3, grade 4, and grade 5 to perform texture analysis using the gray level co-occurrence matrix (GLCM) technique. The authors extracted a total of 12 features and selected the 10 best features using the ANOVA statistical test. Finally, they performed machine learning classification using support vector machine (SVM) and k-nearest neighbor (k-NN) algorithms.
Turki et al. [9] performed machine learning classification for cancer detection. They used three types of sample data: colon cancer, liver cancer, and thyroid cancer. The machine learning algorithms they applied were AdaBoost, deep boost, XgBoost, and support vector machine (SVM). A cross-validation technique was carried out to check the generalizability of the models. The authors concluded that, among all the algorithms, SVM performed the best.
Chakraborty et al. [4] used a dual-channel residual convolution neural network (DCRCNN) for the detection of cancerous tissue in histopathological images of lymph node sections. The data samples primarily belonged to two different classes, namely benign and malignant. Stain separation (i.e., hematoxylin and eosin) was performed for binary DCRCNN classification.
Tai et al. [10] proposed a novel method for the classification of histopathology images according to the Gleason grading system. Their method analyzed the fractal dimension features of each subband of the prostate biopsies. An SVM was adopted as the classifier, and leave-one-out cross-validation was used to estimate the error rate.
Doyle et al. [11] presented a CAD system for detecting prostate cancer from digitized images of histological specimens. They extracted nearly 600 texture features to perform pixel-wise Bayesian classification at each image scale and obtain the corresponding likelihood scenes. Their CAD system was tested on 22 studies by comparing the CAD result with a manual cancer segmentation carried out by a pathologist, and it achieved an overall accuracy of 88%.
3. MATERIALS AND METHODS
3.1 Data Collection
The histology images used in this research were collected from the Severance Hospital of Yonsei University, and the biopsy test was performed on 10 patients. The histological grades (benign, grade 3, grade 4, and grade 5) observed in the prostate biopsies are used in this paper for DL classification. The histopathological slides were scanned at 40× optical magnification with a 0.3 NA objective using a digital camera (Olympus C3000) attached to a microscope (Olympus BX-51). The collected images were of size 256 × 256 pixels and were further used to generate patches of size 128 × 128 pixels. The patch generation process was carried out to increase the number of data samples for MCCNN classification, as shown in Table 1.
Table 1. The arrangement of the dataset for multiclass classification
The biopsy tissue extracted from the patients was sectioned, and deparaffinization and rehydration were carried out before staining with H&E compounds using an automated stainer (Leica Autostainer XL). Before staining, removal of the paraffin wax is important to avoid poor staining of the tissue section.
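As a minimal sketch of the patch generation step described above, the snippet below splits a 256 × 256 image into non-overlapping 128 × 128 patches. The paper specifies only the image and patch sizes, so the choice of non-overlapping tiling (and the function name) is an assumption for illustration.

```python
import numpy as np

def extract_patches(image, patch_size=128):
    """Split a 256 x 256 histology image into non-overlapping 128 x 128 patches.
    Sketch only: stride/overlap settings beyond the reported sizes are assumed."""
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.array(patches)  # shape (4, 128, 128, 3) for a 256 x 256 RGB input
```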
3.2 Stain Normalization
Stain normalization is a widely used approach for adjusting the color intensity in histopathology images. Generally, to diagnose tissue samples, colored chemical stains are used to identify and analyze different tissue components under a microscope.
Inconsistency of color in histology sections has been shown to be a problem for segmentation, texture analysis, and classification. Incubation time, temperature, and the exact amounts of the chemical compounds to be mixed are very important for maintaining staining quality and for histological analysis [12]. Therefore, we used the method proposed by Reinhard et al. [13] for color normalization. The color distribution of a source image is matched to that of a target image using a set of linear transforms in the Lab color space. Fig. 1 shows the process of stain normalization, and Fig. 2 shows the normalized histological images of the four classes. Each channel is normalized based on statistics of the source and target images, and the computation can be expressed as:
\(\operatorname{Norm}L_{\text{map}}=\left(\left(L_{\text{src}}-\bar{L}_{\text{src}}\right) \times\left(\frac{\hat{L}_{\text{tar}}}{\hat{L}_{\text{src}}}\right)\right)+\bar{L}_{\text{tar}}\) (1)
\(\operatorname{Norm}A_{\text{map}}=\left(\left(A_{\text{src}}-\bar{A}_{\text{src}}\right) \times\left(\frac{\hat{A}_{\text{tar}}}{\hat{A}_{\text{src}}}\right)\right)+\bar{A}_{\text{tar}}\) (2)
\(\operatorname{Norm}B_{\text{map}}=\left(\left(B_{\text{src}}-\bar{B}_{\text{src}}\right) \times\left(\frac{\hat{B}_{\text{tar}}}{\hat{B}_{\text{src}}}\right)\right)+\bar{B}_{\text{tar}}\) (3)
\(\operatorname{Norm}LAB_{\text{map}}=\operatorname{concatenate}\left(\operatorname{Norm}L_{\text{map}}, \operatorname{Norm}A_{\text{map}}, \operatorname{Norm}B_{\text{map}}\right)\) (4)
where \(\bar{L}, \bar{A},\) and \(\bar{B}\) are the channel means, \(\hat{L}, \hat{A},\) and \(\hat{B}\) are the channel standard deviations, \(\text{src}\) denotes the source image, \(\text{tar}\) denotes the target image, and \(\operatorname{Norm}LAB_{\text{map}}\) is the normalized image, which is further converted to \(\operatorname{Norm}RGB_{\text{map}}\).
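A minimal sketch of Eqs. (1)-(4) is shown below, using OpenCV's CIELAB conversion as a stand-in for the Lab color space; the function name and numerical guard are illustrative and not the authors' released code.

```python
import cv2
import numpy as np

def reinhard_normalize(src_rgb, tar_rgb):
    """Match the Lab channel statistics of a source patch to a target patch.
    Sketch of Eqs. (1)-(4); inputs are uint8 RGB arrays."""
    src = cv2.cvtColor(src_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)
    tar = cv2.cvtColor(tar_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)

    norm_channels = []
    for c in range(3):  # L, A, B channels handled independently
        src_mean, src_std = src[..., c].mean(), src[..., c].std() + 1e-8
        tar_mean, tar_std = tar[..., c].mean(), tar[..., c].std()
        # (x - mean_src) * (std_tar / std_src) + mean_tar
        norm_channels.append((src[..., c] - src_mean) * (tar_std / src_std) + tar_mean)

    # Concatenate the normalized channels (Eq. 4) and convert back to RGB.
    norm_lab = np.clip(np.stack(norm_channels, axis=-1), 0, 255).astype(np.uint8)
    return cv2.cvtColor(norm_lab, cv2.COLOR_LAB2RGB)
```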
Fig. 1. The process of stain normalization. (a) Source image. (b) Target image. (c) Normalized images.
Fig. 2. Stain normalized images of the histological grades
3.3 Convolutional Neural Network
This paper introduces an MCCNN model for PCa classification. For medical images, texture is an important characteristic that provides useful information by extracting groups of features containing positive and negative properties and visualizing the pixel distribution for pattern analysis. It has been shown that texture analysis using computational methods performs better than human visual inspection [14].
Convolutional neural network (CNN) based deep texture features are more effective than handcrafted ones. In this paper, a 2D MCCNN model has been developed for the detection of cancerous and non-cancerous grades from microscopy biopsy images. The model is built on three separate channels (red, green, and blue), and the channels split from the RGB image are passed through the network in parallel, as shown in Fig. 3.
Fig. 3. Multichannel CNN architecture.
The model is divided into four sections: input, convolution, concatenation, and classification. The input section takes the three grayscale channels separated from the original RGB image and passes each one to the next section separately. In the convolution section, three convolutional blocks are used with different numbers of filters (92, 192, and 384). The kernel size is 5 × 5 for convolutional block 1 and 3 × 3 for blocks 2 and 3. Max pooling (2 × 2) is applied after the second and third convolutional blocks to reduce the height and width of the feature maps; it also reduces the computational cost by reducing the number of parameters to learn. The ReLU activation function is applied after each convolutional layer and after the first and second dense layers. In the concatenation section, the feature maps from all channels are concatenated and passed through a global average pooling (GAP) layer to the classification section, where three dense layers of 64, 32, and 4 neurons are used for multiclass classification with a softmax classifier. To reduce overfitting, the GAP layer is used instead of flattening, which reduces the total number of parameters in the model. In addition, a dropout of 0.5 is applied after the second and third convolutional blocks and after the second dense layer. Fig. 3 shows the overall architecture of the developed model. ReLU is used so that not all neurons in the network are activated at the same time; thus, the weights and biases of some neurons are not updated during backpropagation. This activation function can be defined as
\(A_{i, j, n}=\max \left(w_{n} I_{i, j}+b_{n}, 0\right)\) (5)
where \(A_{i, j, n}\) is the activation value of the \(n^{th}\) feature map at location \((i, j)\), \(I_{i, j}\) is the input patch at that location, and \(w_n\) and \(b_n\) are the weight vector and bias term, respectively, of the \(n^{th}\) filter. The model took 4 hours 30 minutes to train on the data samples and less than 1 minute to test.
The model was trained with an Intel Core i7 processor, one NVIDIA GeForce RTX 2080 GPU, and 18 GB of RAM.
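The architecture described above maps onto the following Keras sketch. Details not reported in the paper, such as the number of convolutional layers per block and the use of 'same' padding, are assumptions; the sketch is meant to illustrate the three-channel layout rather than reproduce the authors' exact model.

```python
from tensorflow.keras import layers, models

def channel_branch(inp):
    """One convolutional branch (blocks of 92, 192, and 384 filters) applied to a
    single grayscale channel; one convolutional layer per block is assumed."""
    x = layers.Conv2D(92, (5, 5), activation='relu', padding='same')(inp)   # block 1
    x = layers.Conv2D(192, (3, 3), activation='relu', padding='same')(x)    # block 2
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Conv2D(384, (3, 3), activation='relu', padding='same')(x)    # block 3
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Dropout(0.5)(x)
    return x

# Three single-channel inputs (R, G, B split from each 128 x 128 patch).
inputs = [layers.Input(shape=(128, 128, 1)) for _ in range(3)]
branches = [channel_branch(inp) for inp in inputs]

x = layers.Concatenate()(branches)           # merge feature maps from the three channels
x = layers.GlobalAveragePooling2D()(x)       # GAP instead of flattening
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(32, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(4, activation='softmax')(x)  # benign, grade 3, grade 4, grade 5

model = models.Model(inputs=inputs, outputs=outputs)
```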
3.4 Model Optimization and Loss
To control the training process while building the DL model, the initial learning rate (LR) was set to 0.01, and the Keras function ReduceLROnPlateau (patience = 10 and factor = 0.8) was used to automatically reduce the LR by a factor of 0.8 if the validation loss did not improve for 10 consecutive epochs. The stochastic gradient descent (SGD) optimizer was used to optimize the weights and biases of the network. The loss function is a very important part of a deep learning model; it is used to measure the discrepancy between the actual and predicted values. In general, the loss function of a model can be expressed as
\(L(\theta)=\frac{1}{n} \sum_{i=1}^{n} L\left(y^{(i)}, f\left(x^{(i)}, \theta\right)\right)\) (6)
where \(\theta\) denotes the parameters of the model, \(x\) is the feature matrix, and \(y\) is the actual value of a feature set. In the present study, the loss function used for multiclass classification is categorical cross-entropy [15, 16, 17]. This loss function measures the performance of a classification model whose predicted output is a probability value between 0 and 1. It is computed as
\(L_{\text{categorical}}=-\sum_{i=1}^{N} y_{i} \cdot \log \hat{y}_{i}\) (7)
where \(N\) is the number of output classes in the model, \(y_{i}\) is the target value, and \(\hat{y}_{i}\) is the predicted score for each class \(i\) in \(N\).
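The optimizer, learning rate schedule, and loss described in this subsection could be wired up as in the sketch below, assuming the `model` from the architecture sketch in Section 3.3; SGD arguments not reported in the paper (e.g., momentum) are omitted.

```python
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Compile with SGD (initial LR 0.01) and categorical cross-entropy (Eq. 7).
model.compile(optimizer=SGD(learning_rate=0.01),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Reduce the LR by a factor of 0.8 when the validation loss has not improved
# for 10 consecutive epochs, as described above.
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.8, patience=10)

# Hypothetical training call (data pipeline and epoch count are not specified here):
# history = model.fit(train_data, validation_data=val_data, epochs=100,
#                     callbacks=[lr_schedule])
```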
Table 2. The architecture of 2D MCCNN
4. RESULTS AND DISCUSSION
The MCCNN model proposed in this paper provided promising results compared with our previous research on multiclass classification. The Python programming language was used for stain normalization and for developing the model. In total, 6,000 data samples were used for Gleason pattern (GP) classification. The training, validation, and testing split was performed in two steps: 1) the total set of samples was split into 80% training and 20% testing data, and 2) 25% of the training set was further held out as validation data. The performance metrics used for the classification problem in this paper are accuracy, precision, recall, and F1-score. Fig. 4 shows the confusion matrix [18, 19] generated using the confidence probability values for the four classes, namely benign, grade 3, grade 4, and grade 5. Fig. 5 shows the learning curves of the model, plotted using the accuracy and loss scores of training and validation.
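The two-step split described above (80/20, then 25% of the training portion for validation, i.e., roughly 60/20/20 overall) could be implemented as follows; the array names, stratification, and random seed are illustrative assumptions rather than details reported in the paper.

```python
from sklearn.model_selection import train_test_split

# Step 1: split the 6,000 patches into 80% training+validation and 20% test.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Step 2: hold out 25% of the remaining training data for validation,
# giving approximately a 60/20/20 split overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=42)
```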
Fig. 4. Confusion matrix for multiclass classification.
Fig. 5. Learning graph of MCCNN model.
The test data set was fed to the trained network to cross-check the performance of the proposed model. After testing, we obtained 95.1%, 94.8%, 95.1%, and 94.9% for accuracy, precision, recall, and F1-score, respectively. Table 3 shows the overall performance of the MCCNN model on this unseen data.
Table 3. Performance evaluation of 2D MCCNN using test dataset
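The reported metrics and the confusion matrix in Fig. 4 could be computed from the softmax outputs as sketched below; `y_true` and `y_prob` are hypothetical arrays of ground-truth labels and predicted probabilities, and the macro averaging mode is an assumption since the paper does not state how the per-class scores were aggregated.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# y_prob: softmax outputs of shape (n_samples, 4); y_true: integer class labels.
y_pred = np.argmax(y_prob, axis=1)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average='macro'))
print("Recall   :", recall_score(y_true, y_pred, average='macro'))
print("F1-score :", f1_score(y_true, y_pred, average='macro'))
print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted class
```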
In this paper, we focused mainly on multiclass rather than binary classification. Multiclass classification is performed simultaneously for all classes, so there is a higher chance of misclassification, as the error of one class can affect the results of the other classes; in binary classification, by contrast, there is less chance of misclassification because the classification is performed separately and independently. In our previous study [20], we carried out a morphology analysis of the cell nucleus and lumen tissue components and extracted handcrafted features for binary and multiclass classification using a support vector machine (SVM) and a multi-layer perceptron (MLP). However, the performance was poor for multiclass classification, with average accuracies of 65.5% and 55.5% using the SVM and MLP, respectively, whereas accuracies of 92.0% and 82.5% were obtained for binary classification.
The results presented in this paper were obtained from the computational features extracted using the proposed multichannel 2D CNN model. We performed the classification subject to the following primary hypotheses: 1) the recall of the benign and high-grade cancer classes (grade 4 and grade 5) must exceed 95%, and 2) the recall of the low-grade cancer class (grade 3) must exceed 85%.
Grade 3 biopsy images are quite similar to benign and grade 4 images; therefore, grade 3 is more difficult to classify with high accuracy than the other classes. We can say that CNN-based computational features are more effective than handcrafted ones and are useful for both binary and multiclass classification. The stain normalization technique was used for image preprocessing to equalize the color values in the images. The normalized images were then cropped and converted to grayscale by splitting the RGB channels for MCCNN classification.
Feature extraction from the three channels is performed separately using the customized model, and the extracted features are concatenated to generate a single feature representation, which is passed to the dense layers for classification.
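The channel-splitting step of this pipeline is simple to express in code; the sketch below assumes the `model` from Section 3.3 and a hypothetical batch of normalized RGB patches `X_test`.

```python
import numpy as np

def split_channels(rgb_batch):
    """Split a batch of RGB patches (N, 128, 128, 3) into the three
    single-channel inputs expected by the multichannel model."""
    return [rgb_batch[..., c:c + 1] for c in range(3)]  # [R, G, B], each (N, 128, 128, 1)

# Hypothetical usage: predict grade probabilities for the test patches.
# probs = model.predict(split_channels(X_test))   # shape (N, 4)
# preds = np.argmax(probs, axis=1)
```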
5. CONCLUSION
In this paper, a customized multichannel 2D CNN model was developed and evaluated on histopathological images to classify the benign, grade 3, grade 4, and grade 5 classes. Stain normalization and RGB splitting were effective for multiclass classification. Several performance metrics were used to evaluate the CNN model, and with respect to the primary hypotheses stated in the previous section, the model demonstrated strong results. Because multiclass classification is a challenging task for detecting PCa, we proposed the MCCNN approach for predicting the histological grades accurately.
In conclusion, the proposed model showed promising results, and this research is encouraging. In future work, we will improve the model performance and the recall for the low-grade cancer class by modifying the customized model and using other algorithms for image preprocessing and model optimization.
References
- M.S. Chung, M. Shim, J.S. Cho, W. Bang, S.I. Kim, S.Y. Cho, et al., "Pathological Characteristics of Prostate Cancer in Men Aged < 50 Years Treated with Radical Prostatectomy: A Multi-centre Study in Korea," Journal of Korean Medical Science, Vol. 34, No. 10, pp. 1-10, 2019.
- D. Albashish, S. Sahran, A. Abdullah, A. Adam, and M. Alweshah, "A Hierarchical Classifier for Multiclass Prostate Histopathology Image Gleason Grading," Journal of Information and Communication Technology, Vol. 17, No. 2, pp. 323-346, 2018.
- S. Yoo, I. Gujrathi, M.A. Haider, and F. Khalvati, "Prostate Cancer Detection Using Deep Convolutional Neural Networks," Scientific Reports, Vol. 9, No. 1, pp. 1-10, 2019. https://doi.org/10.1038/s41598-018-37186-2
- S. Chakraborty, S. Aich, A. Kumar, S. Sarkar, J.S. Sim, and H.C. Kim, "Detection of Cancerous Tissue in Histopathological Images Using Dual-channel Residual Convolutional Neural Networks (DCRCNN)," Proceeding of International Conference on Advanced Communication Technology, pp. 197-202, 2020.
- A.M. Khan, N. Rajpoot, D. Treanor, and D. Magee, "A Nonlinear Mapping Approach to Stain Normalization in Digital Histopathology Images Using Image-specific Color Deconvolution," IEEE Transactions on Biomedical Engineering, Vol. 61, No. 6, pp. 1729-1738, 2014. https://doi.org/10.1109/TBME.2014.2303294
- M. Sinecen and M. Makinaci, "Classification of Prostate Cell Nuclei Using Artificial Neural Network Methods," Proceeding of International Enformatika Conference, pp. 170-172, 2005.
- G. Litjens, O. Debats, J. Barentsz, N. Karssemeijer, and H. Huisman, "Computer-aided Detection of Prostate Cancer in MRI," IEEE Transactions on Medical Imaging, Vol. 33, No. 5, pp. 1083-1092, 2014. https://doi.org/10.1109/TMI.2014.2303821
- C.H. Kim, J.H. So, H.G. Park, N. Madusanka, P. Deekshitha, and S. Bhattacharjee, "Analysis of Texture Features and Classifications for the Accurate Diagnosis of Prostate Cancer," Journal of Korea Multimedia Society, Vol. 22, No. 8, pp. 832-843, 2019. https://doi.org/10.9717/KMMS.2019.22.8.832
- T. Turki, "An Empirical Study of Machine Learning Algorithms for Cancer Identification," Proceeding of International Conference on Networking, Sensing and Control, pp. 1-5, 2018.
- S.K. Tai, C.Y. Li, Y.C. Wu, Y.J. Jan, and S.C. Lin, "Classification of Prostatic Biopsy," Proceeding of International Conference on Digital Content, Multimedia Technology and Its Applications, pp. 354-358, 2010.
- S. Doyle, A. Madabhushi, M. Feldman, and J. Tomaszeweski, "A Boosting Cascade for Automated Detection of Prostate Cancer from Digitized Histology," Proceeding of International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 504-511, 2006.
- F. Ciompi, O. Geessink, B.E. Bejnordi, G.S. Souza, A. Baidoshvili, and G. Litjens, "The Importance of Stain Normalization in Colorectal Tissue Classification with Convolutional Networks," Proceeding of International Symposium on Biomedical Imaging, pp. 160-163, 2017.
- E. Reinhard, M. Adhikhmin, B. Gooch, and P. Shirley, "Color Transfer between Images," IEEE Computer Graphics and Applications, Vol. 21, No. 5, pp. 34-41, 2001. https://doi.org/10.1109/38.946629
- M. Lucas, I. Jansen, C.D. Savci-Heijink, S.L. Meijer, O.J. Boer, and T.G. Leeuwen, "Deep Learning for Automatic Gleason Pattern Classification for Grade Group Determination of Prostate Biopsies," Virchows Archiv, Vol. 475, No. 1, pp. 77-83, 2019. https://doi.org/10.1007/s00428-019-02577-x
- R.C. Martin and J.R. Pomerantz, "Visual Discrimination of Texture," Perception and Psychophysics, Vol. 24, No. 5, pp. 420-428, 1978. https://doi.org/10.3758/BF03199739
- F. Xing, Y.H. Su, F. Liu, and L. Yang, "Deep Learning in Microscopy Image Analysis: A Survey," IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 3, pp. 4550-4568, 2017.
- C. Angermueller, T. Parnamaa, L. Parts, and O. Stegle, "Deep Learning for Computational Biology," Molecular System Biology, Vol. 12, No. 7, pp. 1-17, 2016.
- A. Qayyum, S.M. Anwar, M. Awais, and M. Majid, "Medical Image Retrieval Using Deep Convolution Neural Network," Neurocomputing, Vol. 266, No. C, pp. 8-20, 2017. https://doi.org/10.1016/j.neucom.2017.05.025
- T.T. Tang, J.A. Zawaski, K.N. Francis, A.A. Qutub, and M.W. Gaber, "Image-based Classification of Tumor Type and Growth Rate Using Machine Learning: A Preclinical Study," Scientific Reports, Vol. 9, No. 1, pp. 1-10, 2019. https://doi.org/10.1038/s41598-018-37186-2
- S. Bhattacharjee, H.G. Park, C.H. Kim, D. Prakash, N. Madusanka, and J.H. So, "Quantitative Analysis of Benign and Malignant Tumor in Histopathology: Predicting Prostate Cancer Grading Using SVM," Applied Sciences, Vol. 9, No. 15, pp. 1-18, 2019.