Prediction of Citizens' Emotions on Home Mortgage Rates Using Machine Learning Algorithms

  • Kim, Yun-Ki (Department of Land Management, Cheongju University)
  • Received: 2019.04.16
  • Reviewed: 2019.06.18
  • Published: 2019.06.30

Abstract

This study used machine learning algorithms to predict citizens' emotions about home mortgage rates. After reviewing the related literature, I set up two research questions. To answer them, I classified emotions according to Ekman's classification and then predicted citizens' emotions about mortgage rates with six machine learning algorithms. AdaBoost was the best classifier in every evaluation category, whereas Naive Bayes performed worse than the other classifiers. I also conducted an ROC analysis to identify which classifier best predicts each emotion category. AdaBoost was the best predictor of citizens' emotions about home mortgage rates in all emotion categories. In the sadness class, however, all six algorithms performed much worse than in the other emotion categories.

Figure 1. ROC curves for anger emotion class

Figure 2. ROC curves for disgust emotion class

Figure 3. ROC curves for fear emotion class

Figure 4. ROC curves for joy emotion class

Figure 5. ROC curves for sadness emotion class

Figure 6. ROC curves for surprise emotion class
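
The per-class ROC analysis summarized in Figures 1-6 can be illustrated with a small one-vs-rest sketch. This is not the author's code: it assumes a classifier that outputs one score per emotion class, treats each class in turn as "positive" against all others, and computes the area under the ROC curve through the equivalent rank formulation (the probability that a positive sample outscores a negative one). The names `binary_auc`, `one_vs_rest_auc`, `y_true`, and `class_scores` are hypothetical, and the toy scores are invented for illustration only; they are not results from the paper.

```python
from typing import Dict, List, Sequence

# The six emotion classes named in the figure captions.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score. Ties count as half a win."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str], class_scores: Sequence[Sequence[float]]
) -> Dict[str, float]:
    """One AUC per emotion: that emotion is positive, all others negative.
    class_scores[i][k] is the score of sample i for EMOTIONS[k]."""
    aucs: Dict[str, float] = {}
    for k, emotion in enumerate(EMOTIONS):
        scores = [row[k] for row in class_scores]
        is_pos = [label == emotion for label in y_true]
        aucs[emotion] = binary_auc(scores, is_pos)
    return aucs


# Invented toy data (NOT from the paper): eight samples, six scores each.
y_true: List[str] = [
    "anger", "disgust", "fear", "joy", "sadness", "surprise", "sadness", "joy",
]
class_scores: List[List[float]] = [
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.05, 0.10, 0.05],
    [0.05, 0.10, 0.60, 0.05, 0.15, 0.05],
    [0.02, 0.03, 0.05, 0.80, 0.05, 0.05],
    [0.10, 0.10, 0.20, 0.10, 0.40, 0.10],
    [0.05, 0.05, 0.10, 0.10, 0.10, 0.60],
    [0.20, 0.20, 0.30, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.05, 0.70, 0.05, 0.10],
]

aucs = one_vs_rest_auc(y_true, class_scores)
for emotion in EMOTIONS:
    print(f"{emotion:>8}: AUC = {aucs[emotion]:.3f}")
```

Running the sketch once per classifier and comparing the per-class values is one way to ask which classifier predicts each emotion category best. The pairwise loop is quadratic in the number of samples, so a sort-based routine from a library would be preferable for real data.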

Table 1. Performance Evaluation Results

Table 2. Confusion Matrix for AdaBoost

Table 3. Confusion Matrix for Random Forest

Table 4. Confusion Matrix for Decision Tree

Table 5. Confusion Matrix for KNN

Table 6. Confusion Matrix for Logistic Regression

Table 7. Confusion Matrix for Naive Bayes
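
As a rough illustration of how a confusion matrix such as those in Tables 2-7 relates to summary performance scores, the sketch below derives accuracy and per-class precision, recall, and F1 from a matrix of counts. It assumes rows are true classes and columns are predicted classes; the page does not say which metrics Table 1 reports or how the tables are oriented, so both are assumptions. The function names and the 3x3 matrix are hypothetical, and the counts are invented, not taken from the paper.

```python
from typing import Dict, List, Sequence


def accuracy(matrix: Sequence[Sequence[int]]) -> float:
    """Share of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(
    matrix: Sequence[Sequence[int]], labels: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1 per class.
    Assumes matrix[i][j] counts samples of true class i predicted as class j."""
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                 # rest of the true-class row
        fp = sum(row[k] for row in matrix) - tp  # rest of the predicted column
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Invented counts for three of the six emotion classes (NOT from the paper).
labels: List[str] = ["anger", "joy", "sadness"]
matrix: List[List[int]] = [
    [8, 1, 1],  # true anger
    [0, 9, 1],  # true joy
    [2, 2, 6],  # true sadness
]

overall = accuracy(matrix)
scores = per_class_metrics(matrix, labels)
print(f"accuracy = {overall:.3f}")
for label in labels:
    m = scores[label]
    print(f"{label:>8}: P = {m['precision']:.2f}  R = {m['recall']:.2f}  F1 = {m['f1']:.2f}")
```

The same functions work unchanged on a 6x6 matrix covering all six emotion classes.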

References

  1. Alkan A, Koklukaya E, Subasi A. 2005. Automatic seizure detection in EEG using logistic regression and artificial neural network. Journal of Neuroscience Methods. 148(2):167-176. https://doi.org/10.1016/j.jneumeth.2005.04.009
  2. Almeida AM, Cerri R, Paraiso EC, Mantovani RG, Junior SB. 2018. Applying multi-label techniques in emotion identification of short texts. Neurocomputing, 320:35-46. https://doi.org/10.1016/j.neucom.2018.08.053
  3. Arnold MB. 1960. Emotion and personality. Vol. I. Psychological aspects.
  4. Avetisyan H, Bruna O, Holub J. 2016. Overview of existing algorithms for emotion classification. Uncertainties in evaluations of accuracies. Journal of Physics: Conference Series. 772(1):012039. https://doi.org/10.1088/1742-6596/772/1/012039
  5. Bravo-Marquez F, Mendoza M, Poblete B. 2013. Combining strengths, emotions and polarities for boosting Twitter sentiment analysis. Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining.
  6. Breiman L. 1999. Random forests. UC Berkeley TR567.
  7. Burnap P, Colombo W, Scourfield J. 2015. Machine classification and analysis of suicide-related communication on twitter. Proceedings of the 26th ACM conference on hypertext & social media p. 75-84.
  8. Burnap P, Williams ML. 2015. Cyber hate speech on twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet. 7(2):223-242. https://doi.org/10.1002/poi3.85
  9. Cannuscio CC, Alley DE, Pagan JA, Soldo B, Krasny S, Shardell M, Lipman TH. 2012. Housing strain, mortgage foreclosure, and health. Nursing Outlook. 60(3):134-142. https://doi.org/10.1016/j.outlook.2011.08.004
  10. Casale S, Russo A, Scebba G, Serrano S. 2008. Speech emotion classification using machine learning algorithms. The IEEE International Conference on Semantic Computing, p. 158-165. IEEE.
  11. Chen YL, Chang CL, Yeh CS. 2017. Emotion classification of YouTube videos. Decision Support Systems. 101:40-50. https://doi.org/10.1016/j.dss.2017.05.014
  12. Colnerič N, Demšar J. 2018. Emotion Recognition on Twitter: Comparative Study and Training a Unison Model. IEEE Transactions on Affective Computing.
  13. Damrongsakmethee T, Neagoe VE. 2017. Data Mining and Machine Learning for Financial Analysis. Indian Journal of Science and Technology. 10(39).
  14. Dong L, Li X, Xie G. 2014. Nonlinear methodologies for identifying seismic event and nuclear explosion using random forest, support vector machine, and naive Bayes classification. Abstract and Applied Analysis. Vol. 2014. Hindawi.
  15. Du W, Zhan Z. 2002. Building decision tree classifier on private data. Proceedings of the IEEE international conference on Privacy, security and data mining. 14:1-8.
  16. Ekman P. 1992. An argument for basic emotions. Cognition & emotion. 6(3-4):169-200. https://doi.org/10.1080/02699939208411068
  17. Ekman P, Friesen WV. 1971. Constants across cultures in the face and emotion. Journal of personality and social psychology. 17(2):124. https://doi.org/10.1037/h0030377
  18. Frijda NH. 1986. The emotions. Cambridge University Press.
  19. Gievska S, Koroveshovski K. 2014. The impact of affective verbal content on predicting personality impressions in youtube videos. Proceedings of the 2014 ACM Multi Media on Workshop on Computational Personality Recognition. p. 19-22.
  20. Hastie T, Rosset S, Zhu J, Zou H. 2009. Multi-class adaboost. Statistics and its Interface. 2(3):349-360. https://doi.org/10.4310/SII.2009.v2.n3.a8
  21. Hu W, Hu W, Maybank S. 2008. Adaboost-based algorithm for network intrusion detection. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 38(2):577-583. https://doi.org/10.1109/TSMCB.2007.914695
  22. Gerardi K, Shapiro AH, Willen P. 2007. Subprime outcomes: Risky mortgages, homeownership experiences, and foreclosures.
  23. Gil GB, de Jesus AB, Lopez JMM. 2013. Combining machine learning techniques and natural language processing to infer emotions using Spanish Twitter corpus. International Conference on Practical Applications of Agents and Multi-Agent Systems. p. 149-157.
  24. Gislason PO, Benediktsson JA, Sveinsson JR. 2006. Random forests for land cover classification. Pattern Recognition Letters. 27(4):294-300. https://doi.org/10.1016/j.patrec.2005.08.011
  25. Go A, Bhayani R, Huang L. 2009. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford. 1(12).
  26. Jang E, Rak B, Kim S, Sohn J. 2012. Emotion classification by machine learning algorithm using physiological signals. Proc. of Computer Science and Information Technology. Singapore, 25:1-5.
  27. Lim JS, Kim JM. 2014. An empirical comparison of machine learning models for classifying emotions in Korean Twitter. Journal of Korea Multimedia Society. 17(2):232-239. https://doi.org/10.9717/kmms.2014.17.2.232
  28. Luo C, Wu D, Wu D. 2017. A deep learning approach for credit scoring using credit default swaps. Engineering Applications of Artificial Intelligence. 65:465-470. https://doi.org/10.1016/j.engappai.2016.12.002
  29. Mohammad SM. 2012. Emotional tweets. Proceedings of the First Joint Conference on Lexical and Computational Semantics-Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation. p. 246-255.
  30. Moore-Kochlacs CE. 2016. Extracellular electrophysiology with close-packed recording sites: spike sorting and characterization [dissertation].
  31. Neethu MS, Rajasree R. 2013. Sentiment analysis in twitter using machine learning techniques. 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT). p. 1-5.
  32. Ng AY, Jordan MI. 2002. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Advances in neural information processing systems. p. 841-848.
  33. Ortony A, Turner TJ. 1990. What's basic about basic emotions?. Psychological review. 97(3):315. https://doi.org/10.1037/0033-295X.97.3.315
  34. Pal M. 2005. Random forest classifier for remote sensing classification. International Journal of Remote Sensing. 26(1):217-222. https://doi.org/10.1080/01431160412331269698
  35. Pal M. 2006. Support vector machine-based feature selection for land cover classification: a case study with DAIS hyperspectral data. International Journal of Remote Sensing. 27(14):2877-2894. https://doi.org/10.1080/01431160500242515
  36. Peterson LE. 2009. K-nearest neighbor. Scholarpedia. 4(2):1883. https://doi.org/10.4249/scholarpedia.1883
  37. Plutchik R. 1990. Emotions and psychotherapy: A psychoevolutionary perspective. In Emotion, psychopathology, and psychotherapy. p. 3-41.
  38. Plutchik R. 1991. The emotions. University Press of America.
  39. Roberts K, Roach MA, Johnson J, Guthrie J, Harabagiu SM. 2012. EmpaTweet: Annotating and Detecting Emotions on Twitter. LREC. 12:3806-3813.
  40. Rodriguez-Galiano VF, Ghimire B, Rogan J, Chica-Olmo M, Rigol-Sanchez JP. 2012. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS Journal of Photogrammetry and Remote Sensing. 67:93-104. https://doi.org/10.1016/j.isprsjprs.2011.11.002
  41. Safavian SR, Landgrebe D. 1991. A survey of decision tree classifier methodology. IEEE transactions on systems, man, and cybernetics. 21(3):660-674. https://doi.org/10.1109/21.97458
  42. Sitthi A, Nagai M, Dailey M, Ninsawat S. 2016. Exploring land use and land cover of geotagged social-sensing images using naive bayes classifier. Sustainability. 8(9):921. https://doi.org/10.3390/su8090921
  43. Strapparava C, Valitutti A. 2004. Wordnet affect: an affective extension of wordnet. LREC. 4:1083-1086.
  44. Subasi A, Ercelebi E. 2005. Classification of EEG signals using neural network and logistic regression. Computer methods and programs in biomedicine. 78(2):87-99. https://doi.org/10.1016/j.cmpb.2004.10.009
  45. Tan S. 2006. An effective refinement strategy for KNN text classifier. Expert Systems with Applications. 30(2):290-298. https://doi.org/10.1016/j.eswa.2005.07.019
  46. Tang D, Qin B, Liu T, Li Z. 2013. Learning sentence representation for emotion classification on microblogs. Natural Language Processing and Chinese Computing. p. 212-223. Springer, Berlin, Heidelberg.
  47. Tang D, Wei F, Qin B, Liu T, Zhou M. 2014. Coooolll: A deep learning system for twitter sentiment classification. Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014). p. 208-212.
  48. Vatsavai RR, Bright E, Varun C, Budhendra B, Cheriyadat A, Grasser J. 2011. Machine learning approaches for high-resolution urban land cover classification: a comparative study. Proceedings of the 2nd International Conference on Computing for Geospatial Research & Applications. p. 11.
  49. Velasquez F, Gordon J. 2012. Empirical study of machine learning based approach for opinion mining in tweets. Mexican international conference on Artificial intelligence. p. 1-14. Springer, Berlin, Heidelberg.
  50. Wang W, Chen L, Thirunarayan K, Sheth AP. 2012. Harnessing twitter "big data" for automatic emotion identification. Privacy, Security, Risk and Trust (PASSAT), 2012 International Conference on and 2012 International Conference on Social Computing (SocialCom). p. 587-592. IEEE.
  51. Wei Y, Bing X, Chareonsak C. 2004. FPGA implementation of AdaBoost algorithm for detection of face biometrics. Biomedical Circuits and Systems, 2004 IEEE International Workshop on. p. S1-6.
  52. Wen S, Wan X. 2014. Emotion classification in microblog texts using class sequential rules. In Twenty-Eighth AAAI conference on artificial intelligence.
  53. Wu Y, Zhang T, Hou X, Xu C. 2016. New Blind Steganalysis Framework Combining Image Retrieval and Outlier Detection. KSII Transactions on Internet & Information Systems. 10(12):6206-6212.
  54. Yadav P, Aggarwal G. 2015. Speech Emotion Classification using Machine Learning. International Journal of Computer Applications. 118(13):44-47. https://doi.org/10.5120/20809-3564
  55. Zhang Y, Liu S. 2018. Analysis of structural brain MRI and multi-parameter classification for Alzheimer’s disease. Biomedical Engineering/Biomedizinische Technik. 63(4):427-437. https://doi.org/10.1515/bmt-2016-0239