An Implementation of an Optimal Multiple Classification System Using Data Mining for Genome Analysis

  • Received : 2018.09.07
  • Accepted : 2018.12.12
  • Published : 2018.12.31

Abstract

This paper proposes the HMSV algorithm, which combines a Hidden Markov Model (HMM) with a Support Vector Machine (SVM) model to classify gene expression data more efficiently. The algorithm simulates the stochastic flow of gene data and clusters it, combining the two independently trained models. To compare it with existing approaches, we measured sensitivity and specificity, the most commonly used classification criteria. K-means scored 71% and SOM scored 68%, while the proposed HMSV algorithm scored 85%, a result that was both stable and high, indicating better classification than the general-purpose classification algorithms. Because the proposed algorithm stochastically models the process that generates the features contained in the signal, it can achieve a good recognition rate with a small amount of computation. By clustering nodes through a simulation of the stochastic flow of gene data, it offers a fast and effective performance improvement for data mining on big data and should be useful for studying the relationship between gene data and diseases.
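The two evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling in which `1` marks the positive class; the function name `sensitivity_specificity` and the sample labels are hypothetical and are not taken from the paper, and the code does not reproduce the paper's data or its HMSV implementation.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary classification.

    sensitivity = TP / (TP + FN): share of actual positives found.
    specificity = TN / (TN + FP): share of actual negatives rejected.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly flagged
            else:
                tn += 1  # negative sample correctly rejected
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

For a multi-class clustering result, one common convention is to compute these two values per class, treating that class as positive and all others as negative; whether the paper does so is not stated in the abstract.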

Fig. 1. The execution results of the HMSV algorithm proposed in this paper

Fig. 2. Sensitivity and specificity of K-means, SOM, and HMSV
