Proceedings of the Korean Information Science Society Conference (한국정보과학회:학술대회논문집)
- 2005.07b
- Pages 226-228
- 2005
- 1598-5164 (pISSN)
Sequence driven features for prediction of subcellular localization of proteins
단백질의 세포내 소 기관별 분포 예측을 위한 서열 기반의 특징 추출 방법
- Kim, Jong-Kyoung (Department of Computer Science, Pohang University of Science and Technology)
- Choi, Seung-Jin (Department of Computer Science, Pohang University of Science and Technology)
- Published : 2005.07.01
Abstract
Predicting the subcellular location of an unknown protein gives valuable information for inferring the possible function of the protein. For a more accurate prediction system, we need a good feature extraction method that transforms the raw sequence data into a numerical feature vector while minimizing information loss. In this paper, we propose new methods of extracting underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached
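
The abstract does not spell out how the specificity-based weights are computed or combined, so the following is only a minimal sketch of one plausible reading: each classifier's vote for a class is weighted by that classifier's specificity for the class, estimated on held-out data. The function names `specificity` and `weighted_vote`, the class labels, and the weighting rule itself are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def specificity(y_true, y_pred, label):
    """Specificity of one classifier for one class: TN / (TN + FP).

    Assumption: the class is treated one-vs-rest, so every sample whose
    true label differs from `label` counts as a negative.
    """
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != label and p != label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    return tn / (tn + fp) if (tn + fp) else 0.0


def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single label.

    predictions: list with the label predicted by each classifier.
    weights: list of dicts (one per classifier) mapping label -> weight,
             here assumed to be the classifier's specificity for that label.
    """
    scores = defaultdict(float)
    for label, per_class_weight in zip(predictions, weights):
        # A vote counts in proportion to how rarely this classifier
        # raises a false alarm for the class it voted for.
        scores[label] += per_class_weight.get(label, 0.0)
    return max(scores, key=scores.get)


# Hypothetical example: three classifiers, two vote "cyt", one votes "nuc".
votes = ["nuc", "cyt", "cyt"]
spec = [{"nuc": 0.9}, {"cyt": 0.3}, {"cyt": 0.4}]
winner = weighted_vote(votes, spec)
```

In this hypothetical example a plain majority vote would pick "cyt", while the weighted vote picks "nuc" because the single classifier voting for it carries a larger weight than the other two combined.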