Bio-marker Detector and Parkinson's Disease Diagnosis Approach Based on Samples Balanced Genetic Algorithm and Extreme Learning Machine

  • Sachnev, Vasily (School of Information, Communication and Electronics Engineering, Catholic University)
  • Suresh, Sundaram (School of Computer Science and Engineering, Nanyang Technological University)
  • Choi, YongSoo (Division of Liberal Arts & Teaching, Sungkyul University)
  • Received : 2016.12.10
  • Accepted : 2016.12.30
  • Published : 2016.12.31

Abstract

This paper presents a novel Samples Balanced Genetic Algorithm combined with an Extreme Learning Machine (SBGA-ELM) for Parkinson's disease (PD) diagnosis and bio-marker detection. The proposed approach uses expression data for 22,283 genes from the open-source ParkDB database. SBGA-ELM consists of two major steps: feature (gene) selection and classification. Feature selection is performed by the proposed Samples Balanced Genetic Algorithm (SBGA), which is designed specifically for the gene expression data in ParkDB and searches the 22,283 available genes for a robust subset for further analysis. In the classification step, the chosen set of genes is used to train an Extreme Learning Machine (ELM) classifier for accurate PD diagnosis. The discovered robust subset of genes yields an ELM classifier with stable generalization performance. In this research, the robust subset is also used to identify 24 bio-markers that are probably responsible for Parkinson's disease. The robust subset was verified with existing PD diagnosis approaches such as SVM and PBL-McRBFN. Both tested methods achieved maximum generalization performance.

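This paper proposes a new Samples Balanced Genetic Algorithm combined with an Extreme Learning Machine (SBGA-ELM) for Parkinson's disease diagnosis and bio-marker detection. The approach uses expression data for 22,283 genes from a public Parkinson's disease database for accurate diagnosis and bio-marker detection, and comprises two main steps: (1) feature (gene) selection and (2) classification. The feature selection step is based on the proposed Samples Balanced Genetic Algorithm, which was designed for the gene expression data in the Parkinson's disease database (ParkDB). The proposed SBGA searches for a robust subset among the 22,283 genes available in ParkDB for further analysis. In the classification step, the selected set of genes is used to train the Extreme Learning Machine for accurate Parkinson's disease diagnosis. The discovered robust gene subset produces an ELM classifier that can diagnose Parkinson's disease with stable generalization performance. In this study, the robust gene subset is also used to discover 24 bio-markers predicted to be responsible for Parkinson's disease. The robust gene subset was verified with existing Parkinson's disease diagnosis methods such as SVM and PBL-McRBFN, and both methods (SVM and PBL-McRBFN) showed maximum generalization performance.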
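The two-step structure described in the abstract (score a candidate gene subset, then train an ELM on it) can be illustrated with a minimal sketch. This is not the authors' SBGA-ELM. The abstract does not give the genetic operators, the balancing rule, or the ELM settings, so everything below is an assumption: the names `ELM`, `balanced_indices` and `fitness`, the sigmoid hidden layer, the hidden-layer size, the equal-count subsampling used to stand in for "samples balanced", and the synthetic data used in place of ParkDB. A genetic algorithm would call a function like `fitness` on each candidate gene subset.

```python
import numpy as np


class ELM:
    """Minimal single-hidden-layer Extreme Learning Machine (illustrative only)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and biases are drawn at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = np.eye(2)[y]  # one-hot targets for the two classes
        # Output weights: least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def balanced_indices(y, rng):
    """Draw equally many samples per class (an assumed balancing step)."""
    n = min(np.sum(y == 0), np.sum(y == 1))
    idx = np.concatenate(
        [rng.choice(np.flatnonzero(y == c), n, replace=False) for c in (0, 1)]
    )
    return rng.permutation(idx)


def fitness(X, y, genes, rng, n_hidden=20):
    """Hold-out accuracy of an ELM trained on a balanced sample of `genes`."""
    idx = balanced_indices(y, rng)
    half = len(idx) // 2
    train, test = idx[:half], idx[half:]
    model = ELM(n_hidden).fit(X[np.ix_(train, genes)], y[train])
    return np.mean(model.predict(X[np.ix_(test, genes)]) == y[test])


# Synthetic stand-in for expression data: 60 samples, 50 "genes",
# with an imbalanced 40/20 class split (hypothetical, not ParkDB).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
y = np.array([0] * 40 + [1] * 20)
X[y == 1, :3] += 3.0  # make the first three genes informative
score = fitness(X, y, [0, 1, 2], rng)
print(f"fitness of subset [0, 1, 2]: {score:.2f}")
```

Because the ELM's hidden weights are random and only the output weights are solved in closed form, one fitness evaluation is cheap. That low cost is what makes it practical to evaluate many candidate subsets inside a search loop.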
