Modular neural network in prediction of protein function

단위 신경망을 이용한 단백질 기능 예측

  • 황두성 (Department of Computer Science, Dankook University)
  • Published : 2006.02.01

Abstract

Protein function prediction typically relies on a protein-protein interaction map, following the guilt-by-association concept. This approach, however, cannot determine the function of a target protein that does not interact directly with any protein of known function. This paper treats protein function prediction as a K-class classification problem and proposes a predictive approach based on a modular neural network. The proposed method uses protein-related attributes in addition to interaction data. Experimental results show that the approach can predict the functional roles of yeast proteins whose interactions are not known, and that it outperforms graph-based models that use protein interaction data.

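Protein function prediction models use protein-protein interaction maps based on the guilt-by-association concept. With this approach, function prediction is impossible when the target protein has no interaction with a protein of known function. This paper redefines protein function prediction as a K-class multi-class classification problem and proposes the design and application of a modular neural network whose learning model uses protein-protein interaction data together with known protein attributes. In predicting the functions of yeast protein data, the proposed model achieved higher classification accuracy than methods that use protein-protein interaction data, and it can also predict the functions of proteins whose interactions have not been identified.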
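
The limitation of guilt-by-association described in the abstract can be illustrated with a minimal sketch. It assumes a simple majority vote over directly interacting proteins of known function; the function name `predict_by_neighbors`, the protein identifiers, and the function labels are all hypothetical and do not come from the paper. This is only the neighbor-based baseline idea, not the proposed modular neural network.

```python
from collections import Counter


def predict_by_neighbors(protein, interactions, known_functions):
    """Majority vote over the annotated interaction partners of `protein`.

    Returns None when no direct partner has a known function, which is
    the failure case the abstract describes.
    """
    votes = Counter()
    for partner in interactions.get(protein, ()):
        function = known_functions.get(partner)
        if function is not None:
            votes[function] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only (assumption, not from the paper).
interactions = {
    "P1": ["P2", "P3", "P4"],  # P1 has annotated partners
    "P5": ["P6"],              # P5's only partner is unannotated
}
known_functions = {"P2": "transport", "P3": "transport", "P4": "metabolism"}

print(predict_by_neighbors("P1", interactions, known_functions))  # transport
print(predict_by_neighbors("P5", interactions, known_functions))  # None
```

The second call returns no prediction because `P5` has no partner of known function. A classifier that also takes protein attributes as input, as the abstract proposes, is not restricted in this way.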
