
Comparison of Sentence-level Event Classification Algorithms  

Yun, Bo-Hyun (Department of Computer Education, Mokwon University)
Abstract
Conventional classification research has focused mostly on document-level classification. However, because a single document can contain several topics, sentence-level classification is needed to extract only the relevant topics and events. This paper compares the SVM (Support Vector Machine), BN (Bayesian Network), and DT (Decision Tree) algorithms and presents a sentence-level event classification method that identifies event sentences in news articles that include timeline information. The features are nouns, noun phrases, verbs, and named entities; indexing uses Boolean weighting, and feature filtering uses information gain. In the experiments, the linear kernel performed better than the other SVM kernel functions, and SVM achieved the best accuracy among the SVM, BN, and DT algorithms.
Keywords
event classification; machine learning; SVM
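
The indexing and filtering steps named in the abstract (Boolean weighting followed by information gain) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy sentences, the labels "event" and "other", and the function names boolean_index, information_gain, select_features, and to_vectors are all assumptions made for illustration, and the tokens stand in for the nouns, noun phrases, verbs, and named entities that a real system would extract.

```python
import math
from collections import Counter


def boolean_index(sentences):
    """Boolean weighting: a feature is either present in a sentence or not."""
    return [set(tokens) for tokens in sentences]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature, docs, labels):
    """Reduction in label entropy from knowing whether `feature` is present."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k features with the highest information gain (ties: alphabetical)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda f: (-information_gain(f, docs, labels), f))
    return ranked[:k]


def to_vectors(docs, features):
    """Turn each sentence into a 0/1 vector over the selected features."""
    return [[1 if f in d else 0 for f in features] for d in docs]


# Hypothetical toy data: two event sentences and two non-event sentences.
sentences = [
    ["earthquake", "struck", "city"],
    ["storm", "struck", "coast"],
    ["city", "has", "museum"],
    ["coast", "has", "beach"],
]
labels = ["event", "event", "other", "other"]

docs = boolean_index(sentences)
selected = select_features(docs, labels, k=2)
vectors = to_vectors(docs, selected)

for sentence, vector in zip(sentences, vectors):
    print(" ".join(sentence), "->", vector)
```

The resulting 0/1 vectors are the kind of input that a classifier such as an SVM, Bayesian network, or decision tree would then be trained on; the classifier itself is left out of this sketch.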