http://dx.doi.org/10.9717/kmms.2020.24.1.106

Research on the Hybrid Paragraph Detection System Using Syntactic-Semantic Analysis  

Kang, Won Seog (Dept. of Computer Education, Andong National University)
Abstract
Paragraph detection is needed to improve the quality of systems for grading subjective-type questions and for classifying documents. The task is difficult because it requires semantic analysis. Many existing studies approach paragraph detection with word-based clustering. However, a word-based method cannot exploit word order or the dependency relations between words. This paper proposes a paragraph detection system that uses the syntactic-semantic relations between words, obtained from Korean syntactic-semantic analysis. The system is a hybrid of word-based, concept-based, and syntactic-semantic tree-based detection. Experimental results show that it outperforms the word-based system. The system is intended for use in grading Korean subjective-type questions and in document classification.
Keywords
Paragraph Detection; Syntactic-Semantic Analysis; Clustering; Similarity
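The hybrid idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the feature names (`words`, `concepts`, `relations`), the weights, the threshold, and the use of Jaccard overlap on dependency triples as a stand-in for syntactic-semantic tree comparison are all assumptions made for illustration, and the English tokens stand in for the output of a Korean analyzer.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-sentence features. A real system would obtain these from a
# Korean syntactic-semantic analyzer; here they are written by hand.
Sentence = Dict[str, Set]


def jaccard(a: Set, b: Set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_similarity(
    s1: Sentence,
    s2: Sentence,
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # assumed weights
) -> float:
    """Weighted mix of word, concept, and dependency-relation overlap."""
    w_word, w_concept, w_rel = weights
    return (
        w_word * jaccard(s1["words"], s2["words"])
        + w_concept * jaccard(s1["concepts"], s2["concepts"])
        + w_rel * jaccard(s1["relations"], s2["relations"])
    )


def detect_boundaries(sentences: List[Sentence], threshold: float = 0.2) -> List[int]:
    """Return each index i where sentence i is assumed to start a new paragraph,
    i.e. where its similarity to the preceding sentence falls below threshold."""
    return [
        i
        for i in range(1, len(sentences))
        if hybrid_similarity(sentences[i - 1], sentences[i]) < threshold
    ]


# Toy input: relations are (head, dependent, role) triples.
sentences: List[Sentence] = [
    {"words": {"cat", "chase", "mouse"},
     "concepts": {"animal", "motion"},
     "relations": {("chase", "cat", "agent"), ("chase", "mouse", "theme")}},
    {"words": {"cat", "catch", "mouse"},
     "concepts": {"animal", "motion"},
     "relations": {("catch", "cat", "agent"), ("catch", "mouse", "theme")}},
    {"words": {"bank", "raise", "rate"},
     "concepts": {"finance"},
     "relations": {("raise", "bank", "agent"), ("raise", "rate", "theme")}},
]

boundaries = detect_boundaries(sentences)
print(boundaries)
```

In this toy input the first two sentences share words and concepts, so they stay together, while the third shares nothing with the second and is marked as the start of a new paragraph. A purely word-based variant would correspond to setting the weights to `(1.0, 0.0, 0.0)`.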
Citations & Related Records
Times Cited By KSCI: 1