DNA Sequence Classification Using a Generalized Regression Neural Network and Random Generator

난수발생기와 일반화된 회귀 신경망을 이용한 DNA 서열 분류

  • Published : 2004.07.01

Abstract

A classifier was constructed using a generalized regression neural network (GRNN) and a random generator (RG), and was applied to classify DNA sequences. The three data sets evaluated are eukaryotic and prokaryotic sequences (Data-I), eukaryotic sequences (Data-II), and prokaryotic sequences (Data-III). For each data set, classifier performance was examined in terms of the total classification sensitivity (TCS), individual classification sensitivity (ICS), total prediction accuracy (TPA), and individual prediction accuracy (IPA). For a given spread, the RG generated a number of sets of spreads for the Gaussian functions in the pattern layer. Compared to the GRNN, the RG-GRNN significantly improved the TCS by more than 50%, 60%, and 40% for Data-I, Data-II, and Data-III, respectively. The RG-GRNN also demonstrated improved TPA for all data types. In conclusion, the proposed RG-GRNN can be used effectively to classify large, multivariable promoter sequences.
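The idea of drawing random sets of spreads for the pattern-layer Gaussians can be illustrated with a minimal Python sketch. The abstract does not give the details of the RG scheme, so everything specific here is an assumption: spreads are sampled uniformly in a band around the given spread, one spread per training pattern, and the best set is chosen by accuracy on a held-out validation set. The names `grnn_predict`, `random_spread_search`, `n_sets`, and `width` are hypothetical, and the sequence encoding is left to the caller.

```python
import numpy as np


def grnn_predict(X_train, Y_train, X_query, spreads):
    """GRNN output: Gaussian-weighted average of the training targets.

    Y_train is one-hot (n_train, n_classes); spreads holds one value
    per training pattern, i.e. one per pattern-layer neuron.
    """
    # Squared Euclidean distance from every query to every training pattern.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Pattern layer: one Gaussian activation per training pattern.
    w = np.exp(-d2 / (2.0 * spreads[None, :] ** 2))
    # Summation and output layers: weighted targets over total weight.
    return (w @ Y_train) / (w.sum(axis=1, keepdims=True) + 1e-12)


def random_spread_search(X_train, Y_train, X_val, y_val, base_spread,
                         n_sets=20, width=0.5, seed=0):
    """Assumed RG step: sample n_sets spread vectors around base_spread
    and keep the one with the highest validation accuracy."""
    rng = np.random.default_rng(seed)
    n_train = X_train.shape[0]
    low, high = base_spread * (1.0 - width), base_spread * (1.0 + width)
    best_acc, best_spreads = -1.0, None
    for _ in range(n_sets):
        spreads = rng.uniform(low, high, size=n_train)
        pred = grnn_predict(X_train, Y_train, X_val, spreads).argmax(axis=1)
        acc = float((pred == y_val).mean())
        if acc > best_acc:
            best_acc, best_spreads = acc, spreads
    return best_spreads, best_acc
```

Setting every entry of `spreads` to the same value recovers a conventional single-spread GRNN, which gives a baseline for comparing against the randomly generated sets.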
