
A Language Model and Clue based Machine Learning Method for Discovering Technology Trends from Patent Text  

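Tian, Yingshi (Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology)
Kim, Young-Ho (Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology)
Jeong, Yoon-Jae (Department of Computer Science, Korea Advanced Institute of Science and Technology)
Ryu, Ji-Hee (Department of Computer Science, Korea Advanced Institute of Science and Technology)
Myaeng, Sung-Hyon (Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology)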
Abstract
Patent text is a rich source for discovering technological trends. To automate this discovery process, we identify phrases that describe a problem and its solution method, which together form a technology. An SVM classifier identifies problem and solution phrases using features that combine a language modeling approach with linguistic clues. From the occurrence statistics of these phrases, we determine the time span of each problem and solution and then generate a trend. Our experiments show that the proposed semantic phrase identification method is promising, achieving an R-precision of 77%. We also show that the unsupervised method for discovering technological trends produces meaningful results.
Keywords
Patent; textual-data mining; technological trend discovery; semantic keyphrase extraction
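
The abstract reports phrase identification quality as R-precision. As a reminder of what that metric measures, here is a minimal Python sketch of the standard definition: precision over the top R ranked items, where R is the number of relevant items. The function name `r_precision` and the example phrases are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence, Set


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return the fraction of relevant items among the top R ranked items,
    where R is the total number of relevant items."""
    r = len(relevant)
    if r == 0:
        raise ValueError("R-precision is undefined when there are no relevant items")
    top_r = ranked[:r]  # only the first R candidates are scored
    hits = sum(1 for item in top_r if item in relevant)
    return hits / r


# Hypothetical candidate phrases, ordered by classifier confidence (assumed data).
ranked_phrases = [
    "reduce power consumption",
    "the present invention",
    "improve signal quality",
    "a plurality of",
    "prevent data loss",
]
# Hypothetical gold-standard problem phrases (assumed data).
gold_problem_phrases = {
    "reduce power consumption",
    "improve signal quality",
    "prevent data loss",
}

score = r_precision(ranked_phrases, gold_problem_phrases)
print(f"R-precision: {score:.3f}")
```

In this toy case R is 3, so only the three highest-ranked candidates are scored; the third gold phrase sits at rank 5 and does not count.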