http://dx.doi.org/10.5859/KAIS.2020.29.3.237

Research on Designing Korean Emotional Dictionary using Intelligent Natural Language Crawling System in SNS  

Lee, Jong-Hwa (Department of e-Business, Dong-Eui University)
Publication Information
The Journal of Information Systems / v.29, no.3, 2020, pp. 237-251
Abstract
Purpose: This study develops a hierarchical Hangul emotion index by organizing the emotions that SNS users express. In a preliminary study, the author reinterpreted Plutchik's (1980) English-based emotion model in Korean and examined hashtags that carry implicit meaning on SNS. To build a multidimensional emotion dictionary and classify emotions in three dimensions, an emotion seed was selected for each of seven emotion sets, and an emotion word dictionary was constructed by collecting the SNS hashtags derived from each seed. The study also explores the priority of each entry in the Hangul emotion index.

Design/methodology/approach: The words that make up each sentence were vectorized into a matrix, with term weights computed using TF-IDF (Term Frequency-Inverse Document Frequency). The matrix for each emotion set was reduced in dimension with the NMF (Non-negative Matrix Factorization) algorithm, and the emotional dimension was derived from the feature values of the emotion words. Cosine distance was used to measure the similarity between the vectors of emotion words within an emotion set.

Findings: Analyzing customer needs requires the ability to read changes in emotion, and research on Korean emotion words serves that need. The ranking of emotion words within an emotion set can also serve as a criterion for reading the depth of an emotion. The author expects that the emotion index will give companies effective information for emotional marketing and thereby expand new business opportunities. In addition, if the emotion dictionary is eventually linked to products, it will be possible to define a product's "emotional DNA", the set of emotions that the product should have.
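The abstract names three techniques (TF-IDF weighting, NMF dimension reduction, and cosine similarity between emotion word vectors) but gives no formulas or code. The following is a minimal Python sketch of how such a pipeline could fit together. It is not the author's implementation. The toy English hashtags, the smoothed IDF formula, the multiplicative-update NMF, the rank of 2, and the choice to read word vectors from the columns of H are all assumptions made for illustration.

```python
import math
import random

def tfidf_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF weight of term j in doc i."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    matrix = []
    for doc in docs:
        row = []
        for t in vocab:
            tf = doc.count(t) / len(doc)
            # Smoothed IDF (an assumption; the abstract does not give the formula).
            idf = math.log((1 + n_docs) / (1 + df[t])) + 1
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate V (n x m) by W (n x rank) times H (rank x m), all non-negative,
    using multiplicative updates for the squared Frobenius error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy data: each "document" is the hashtag list of one SNS post.
posts = [
    ["happy", "joy", "smile"],
    ["joy", "smile", "love"],
    ["sad", "tears", "lonely"],
    ["lonely", "tears", "gloomy"],
]
vocab, v = tfidf_matrix(posts)
w, h = nmf(v, rank=2)
word_vectors = transpose(h)  # one rank-dimensional vector per vocabulary word

def word_similarity(a, b):
    return cosine_similarity(word_vectors[vocab.index(a)],
                             word_vectors[vocab.index(b)])

print(word_similarity("joy", "smile"), word_similarity("joy", "tears"))
```

Ranking the words of one emotion set by their similarity to the seed word would then give an ordering of the kind the abstract describes, under the same assumptions.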
Keywords
Big Data; TF-IDF; Non-negative Matrix Factorization; Text Mining; Emotional Word; Sentimental Analysis
Citations & Related Records
Times Cited By KSCI: 10
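1 강주연, 이이든, 김지수, "Semantic network analysis of news data related to 'Generation Z' using text mining," 미래청소년학회지, Vol. 17, 2020, pp. 25-48.
2 고흥석, 신중현, "An exploratory study on the media usage behavior of the digital native generation," 한국콘텐츠학회논문지, Vol. 18, No. 3, 2018, pp. 1-10.
3 권종원, 송태승, "Trends in platform-based digital transformation for manufacturing innovation," 전자공학회지, Vol. 46, No. 12, 2019, pp. 34-46.
4 김철원, 박선, "Query-based document summarization using pseudo relevance feedback based on semantic features and WordNet," 한국정보통신학회논문지, Vol. 15, No. 7, 2011, pp. 1517-1524.
5 박선, 김경준, 김경호, 이성로, "Improving document clustering performance using term re-weighting based on semantic features," 한국정보통신학회논문지, Vol. 17, No. 2, 2013, pp. 347-354.
6 이종화, "Building an SNS crawling system using Python," 한국산업정보학회논문지, Vol. 23, No. 5, 2018, pp. 61-76.
7 이종화, 이문봉, 김종원, "A study on Hangul natural language processing using TF-IDF," 정보시스템연구, Vol. 28, No. 3, 2019, pp. 105-121.
8 이종화, 이윤재, 이현규, "Development of an emotion word collection system using SNS hashtags," 정보시스템연구, Vol. 27, No. 2, 2018, pp. 77-94.
9 이종화, "A study on the generalization of emotion words using SNS hashtags," 인터넷전자상거래연구, Vol. 18, No. 4, 2018, pp. 53-63.
10 이진수, "Data science-based digital transformation," 방송과 미디어, Vol. 22, No. 4, 2017, pp. 18-25.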
11 Balahur, A., Hermida, J. M., and Montoyo, A., "Detecting implicit expressions of emotion in text: A comparative analysis," Decision Support Systems, Vol. 53, No. 4, 2012, pp. 742-753.
12 Cheng, F., Shen, J., Yu, Y., Li, W., Liu, G., Lee, P. W., and Tang, Y., "In silico prediction of Tetrahymena pyriformis toxicity for diverse industrial chemicals with substructure pattern recognition and machine learning methods," Chemosphere, Vol. 82, No. 11, 2011, pp. 1636-1643.
13 Christian, H., Agus, M. P., and Suhartono, D., "Single Document Automatic Text Summarization using Term Frequency-Inverse Document Frequency (TF-IDF)," ComTech: Computer, Mathematics and Engineering Applications, Vol. 7, No. 4, 2016, pp. 285-294.
14 Danielsson, P. E., "Euclidean distance mapping," Computer Graphics and Image Processing, Vol. 14, No. 3, 1980, pp. 227-248.
15 Gunasekaran, S., "Computer vision technology for food quality assurance," Trends in Food Science & Technology, Vol. 7, No. 8, 1996, pp. 245-256.
16 Kaelbling, L. P., Littman, M. L., and Moore, A. W., "Reinforcement learning: A survey," Journal of Artificial Intelligence Research, Vol. 4, 1996, pp. 237-285.
17 Plutchik, R., "A general psychoevolutionary theory of emotion," In Theories of Emotion, 1980.
18 Qaiser, S., and Ali, R., "Text mining: use of TF-IDF to examine the relevance of words to documents," International Journal of Computer Applications, Vol. 181, No. 1, 2018, pp. 25-29.
19 Salton, G., and Buckley, C., "Term-weighting approaches in automatic text retrieval," Information Processing & Management, Vol. 24, No. 5, 1988, pp. 513-523.
20 Seung, D., and Lee, L., "Algorithms for non-negative matrix factorization," Advances in Neural Information Processing Systems, Vol. 13, 2001, pp. 556-562.
21 Ye, J., "Improved cosine similarity measures of simplified neutrosophic sets for medical diagnoses," Artificial Intelligence in Medicine, Vol. 63, No. 3, 2015, pp. 171-179.
22 Tata, S., and Patel, J. M., "Estimating the selectivity of tf-idf based cosine similarity predicates," ACM SIGMOD Record, Vol. 36, No. 2, 2007, pp. 7-12.
23 Walker, M. A., Anand, P., Abbott, R., Tree, J. E. F., Martell, C., and King, J., "That is your evidence?: Classifying stance in online political debate," Decision Support Systems, Vol. 53, No. 4, 2012, pp. 719-729.
24 Witten, I. H., Frank, E., Hall, M. A., and Pal, C. J., Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
25 Ye, C., Yung, N. H., and Wang, D., "A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance," IEEE Transactions on Systems, Vol. 33, No. 1, 2003, pp. 17-27.