Text-mining Based Graph Model for Keyword Extraction from Patent Documents

Lee, Soon Geun; Leem, Young Moon; Um, Wan Sup

doi:10.12812/ksms.2015.17.4.335

Journal of the Korea Safety Management & Science (대한안전경영과학회지)

Volume 17, Issue 4 / Pages 335-342 / 2015 / 1229-6783 (pISSN) / 2288-1484 (eISSN)

Korea Safety Management & Science (대한안전경영과학회)



Lee, Soon Geun (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University) ;
Leem, Young Moon (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University) ;
Um, Wan Sup (Dept. of Industrial & Management Engineering, Gangneung-Wonju National University)


Received : 2015.10.20
Accepted : 2015.12.04
Published : 2015.12.31

https://doi.org/10.12812/ksms.2015.17.4.335

Abstract

Growing interest in patents has led many individuals and companies to file patent applications in a wide range of fields. Filed patents are stored as electronic documents, and searching and categorizing these documents is a major topic in data mining. Keyword extraction, which retrieves the keywords most representative of a document, is especially important. Most keyword extraction techniques are based on the vector space model. This model relies only on term frequency: it assigns each term a weight based on its frequency in the documents and selects keywords in descending order of weight. As a result, it cannot reflect the relations between keywords. This paper proposes an improved method that overcomes this limitation and extracts more representative keywords. The proposed model first builds a candidate set using the vector space model, then constructs a graph representing the relation between each pair of candidate keywords in the set, and finally selects keywords based on this relation graph.
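The three steps named in the abstract (candidate set from the vector space model, relation graph over candidate pairs, graph-based selection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the relation between two candidates is measured or how the graph is used for selection, so the sketch assumes TF-IDF weights for step 1, sentence-level co-occurrence counts as edge weights for step 2, and weighted degree as the selection score for step 3. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_candidates(docs, doc_index, top_n):
    """Step 1: rank the terms of one document by TF-IDF and keep the top_n as candidates."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tf = Counter(tokenized[doc_index])
    n_docs = len(docs)
    weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(weights, key=lambda t: (-weights[t], t))
    return ranked[:top_n]


def cooccurrence_graph(text, candidates):
    """Step 2 (assumed relation): count sentences in which two candidates co-occur."""
    cand = set(candidates)
    edges = Counter()
    for sentence in re.split(r"[.!?]+", text):
        present = sorted(cand & set(tokenize(sentence)))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def select_keywords(candidates, edges, k):
    """Step 3 (assumed rule): score each candidate by weighted degree, keep the top k."""
    score = {c: 0 for c in candidates}
    for (a, b), weight in edges.items():
        score[a] += weight
        score[b] += weight
    return sorted(candidates, key=lambda c: (-score[c], c))[:k]
```

To use the sketch, call `tfidf_candidates(docs, i, top_n)` for the document of interest, pass that document's text and the candidates to `cooccurrence_graph`, and give the resulting edges to `select_keywords`. A candidate that is frequent but never appears alongside other candidates receives a low graph score, which is the kind of relational signal a frequency-only ranking cannot capture.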

