[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.12812/ksms.2015.17.4.335

Text-mining Based Graph Model for Keyword Extraction from Patent Documents

Lee, Soon Geun (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University)
Leem, Young Moon (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University)
Um, Wan Sup (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University)

Publication Information

Journal of the Korea Safety Management & Science / v.17, no.4, 2015, pp. 335-342

Abstract

Growing interest in patents has led many individuals and companies to file patent applications in a wide range of fields. These applications are stored as electronic documents, and searching and categorizing them is a major topic in data mining. Keyword extraction, which retrieves the keywords that best represent a document, is especially important. Most keyword extraction techniques are based on the vector space model, which weights terms by their frequency in documents and selects keywords in order of weight. This model is limited in that it cannot reflect the relations between keywords. This paper proposes an improved method that overcomes this limitation to extract more representative keywords. The proposed model first builds a candidate set using the vector space model, then constructs a graph representing the relation between each pair of candidate keywords in the set, and finally selects keywords based on this relationship graph.
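
The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact term weighting, the definition of the relation between keywords, or the graph-based selection rule, so everything concrete here is an assumption made for illustration: TF-IDF stands in for the vector space weighting, sentence-level co-occurrence stands in for the relation between candidate keywords, and candidates are ranked by the sum of their edge weights. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_candidates(docs, target, k):
    """Stage 1 (assumed weighting): top-k terms of docs[target] by TF-IDF."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[target])
    scores = {t: tf[t] * math.log(n / df[t]) for t in tf}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def cooccurrence_graph(sentences, candidates):
    """Stage 2 (assumed relation): edge weight = number of sentences
    in which both candidate keywords appear."""
    cand = set(candidates)
    edges = Counter()
    for sent in sentences:
        present = sorted(cand & set(sent))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def rank_by_graph(candidates, edges):
    """Stage 3 (assumed rule): rank candidates by total edge weight."""
    strength = {c: 0 for c in candidates}
    for (a, b), w in edges.items():
        strength[a] += w
        strength[b] += w
    return sorted(candidates, key=lambda c: (-strength[c], c))


# Toy data (hypothetical): one tokenized target document, split into
# sentences, plus two background documents.
sentences = [
    ["electrode", "coating", "battery"],
    ["electrode", "coating", "layer"],
    ["electrode", "battery"],
    ["clamp"],
    ["clamp"],
    ["clamp"],
]
docs = [
    [tok for sent in sentences for tok in sent],
    ["battery", "layer", "method"],
    ["layer", "method", "device"],
]

candidates = tfidf_candidates(docs, target=0, k=4)
edges = cooccurrence_graph(sentences, candidates)
ranked = rank_by_graph(candidates, edges)
print(candidates)
print(ranked)
```

In this toy input, "clamp" is frequent and therefore scores well in the frequency-based stage, but it never co-occurs with another candidate, so the graph stage moves it to the bottom of the ranking. This is the kind of relation between keywords that a purely frequency-based weighting cannot capture.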

Keywords

Relationship graph model; Patents; Keyword extraction; Text mining

