Proceedings of the Korean Information Science Society Conference (한국정보과학회:학술대회논문집)
- Proceedings of the Korean Information Science Society 2005 Korea Computer Congress, Vol.32 No.1 (B)
- Pages.226-228
- 2005
- 1598-5164(pISSN)
단백질의 세포내 소 기관별 분포 예측을 위한 서열 기반의 특징 추출 방법
Sequence driven features for prediction of subcellular localization of proteins
- Kim, Jong-Kyoung (Department of Computer Science, Pohang University of Science and Technology) ;
- Choi, Seung-Jin (Department of Computer Science, Pohang University of Science and Technology)
- Published: 2005.07.01
Abstract
Predicting the cellular location of an unknown protein gives valuable information for inferring the possible function of the protein. A more accurate prediction system needs a good feature extraction method that transforms the raw sequence data into a numerical feature vector while minimizing information loss. In this paper, we propose new methods of extracting underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached
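As a rough illustration of two ideas named in the abstract, the sketch below computes amino acid composition features and combines classifier outputs with a weighted majority vote. It is not the authors' implementation: the function names, the 20-letter alphabet ordering, the example labels, and the use of a per-class specificity value as the vote weight are all assumptions made for this example, and the abstract does not state the exact weighting formula.

```python
from collections import Counter

# Assumption: the 20 standard amino acids, in an arbitrary fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the relative frequency of each standard amino acid.

    Characters outside AMINO_ACIDS (e.g. 'X') are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_majority_vote(predictions, specificities):
    """Combine one predicted label per classifier into a single label.

    predictions:   list of labels, one per classifier.
    specificities: list of dicts, one per classifier, mapping a label to
                   the weight of that classifier's vote for that label
                   (hypothetical: e.g. a specificity estimated on held-out data).
    """
    scores = {}
    for label, weights in zip(predictions, specificities):
        scores[label] = scores.get(label, 0.0) + weights.get(label, 0.0)
    # Sorting first makes ties break alphabetically, so the result is deterministic.
    return max(sorted(scores), key=lambda label: scores[label])


if __name__ == "__main__":
    features = composition_features("MKTAYIAKQR")
    print(len(features))

    votes = ["nuclear", "cytoplasmic", "cytoplasmic"]
    weights = [{"nuclear": 0.9}, {"cytoplasmic": 0.4}, {"cytoplasmic": 0.3}]
    print(weighted_majority_vote(votes, weights))
```

In this toy run, a single high-weight vote can outweigh two low-weight votes, which is the behavior that distinguishes weighted voting from a plain majority vote.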
Keywords