http://dx.doi.org/10.3743/KOSIM.2018.35.1.157

Generating and Controlling an Interlinking Network of Technical Terms to Enhance Data Utilization  

Jeong, Do-Heon (Department of Library and Information Science, Duksung Women's University)
Publication Information
Journal of the Korean Society for Information Management, v.35, no.1, 2018, pp. 157-182
Abstract
As data management and processing techniques have developed rapidly in the era of big data, many companies and researchers have become interested in long tail data that was previously ignored. This study proposes methods for generating and controlling a network of technical terms, based on text mining techniques, to enhance the utilization of data in the long tail of the distribution. In particular, an edit distance technique from text mining provides an efficient way to automatically create an interlinking network of technical terms in the scholarly field. We also used a linked open data (LOD) system to gather experimental data, and we propose methods for using LOD data together with an algorithm that recognizes patterns of terms. Finally, a performance evaluation of the network of technical terms showed that the proposed methods were useful for increasing the rate of data utilization.
Keywords
long tail theory; linked open data; language resources; DBLP; edit distance algorithm
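
The abstract names an edit distance technique as the basis for linking technical terms but does not give its details. The following is only a minimal sketch of the general idea, assuming plain Levenshtein distance and a fixed threshold: terms whose spellings differ by a few character edits are linked to each other. The function names, the `max_distance` parameter, and the sample terms are hypothetical illustrations, not the author's implementation or data.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def link_terms(terms, max_distance=2):
    """Build an adjacency map linking terms whose lowercased forms
    are within max_distance edits (threshold is an assumption)."""
    network = {term: set() for term in terms}
    for t1, t2 in combinations(terms, 2):
        if edit_distance(t1.lower(), t2.lower()) <= max_distance:
            network[t1].add(t2)
            network[t2].add(t1)
    return network


# Hypothetical sample: spelling variants of one term plus an unrelated term.
terms = [
    "X-ray diffraction",
    "X ray diffraction",
    "Xray diffraction",
    "liquid chromatography",
]
network = link_terms(terms)
for term, neighbors in network.items():
    print(term, "->", sorted(neighbors))
```

In this sketch the three spelling variants end up linked to one another, while the unrelated term stays isolated. Comparing every pair is quadratic in the number of terms, so a real term collection would likely need blocking or indexing before pairwise comparison.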