A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining

빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시

  • Cho, Tae In (Department of Civil Affairs, Jung-gu Office, Incheon Metropolitan City) ;
  • Choi, Byoung Gil (Dept. of Civil and Environmental Engineering, Incheon National University) ;
  • Na, Young Woo (Hub-Industrial-Academic Cooperation, Incheon National University) ;
  • Moon, Young Seob (Earth Order & Construction Corp.) ;
  • Kim, Se Hun (Dept. of Civil and Environmental Engineering, Incheon National University)
  • 조태인 (인천중구청 민원지적과) ;
  • 최병길 (인천대학교 건설환경공학부) ;
  • 나영우 (인천대학교 산학협력단 산학협력중점) ;
  • 문영섭 (이오산업개발주식회사) ;
  • 김세훈 (인천대학교 건설환경공학부)
  • Received : 2018.10.04
  • Accepted : 2018.11.22
  • Published : 2018.12.10

Abstract

This study proposes a model for analysing the spatio-temporal characteristics of civil complaints about the officially assessed land price, based on big data mining. Specifically, the underlying causes of the complaints were sought from a spatio-temporal perspective rather than from institutional factors, and a model was proposed for monitoring trends in their occurrence. Official documents for 6,481 civil complaints about the officially assessed land price filed in Jung-gu, Incheon Metropolitan City, from 2006 to 2015 were collected, together with their temporal and spatial properties, and used for the analysis. The frequencies of major keywords were examined using text mining, and the relationships among major keywords were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which serve as keyword weights, we identified the major keywords associated with the occurrence of these complaints. The spatio-temporal characteristics of the complaints were then examined through hot spot analysis based on the Getis-Ord $G_i^*$ statistic. The results show that the characteristics of civil complaints about the officially assessed land price change over time while forming spatio-temporally linked clusters. Text mining and social network analysis made it possible to identify quantitatively, from natural-language text, the reasons why such complaints occur. Because TF and TF-IDF differ over time and across regions, they can serve as main explanatory variables for analysing the spatio-temporal characteristics of these complaints.
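The TF and TF-IDF keyword weights named in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting variant the study used, so this example assumes relative-frequency TF and a natural-log IDF; the function name `tf_idf` and the tokenized complaints are hypothetical and are not data from the study.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return per-document TF and TF-IDF weights for tokenized documents.

    Assumed variant: TF = term count / document length,
    IDF = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    tf_weights, tfidf_weights = [], []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        tf = {term: count / total for term, count in counts.items()}
        tfidf = {term: tf[term] * math.log(n_docs / df[term]) for term in tf}
        tf_weights.append(tf)
        tfidf_weights.append(tfidf)
    return tf_weights, tfidf_weights

# Hypothetical tokenized complaints, for illustration only.
complaints = [
    ["land", "price", "increase", "tax"],
    ["land", "price", "decrease", "compensation"],
    ["land", "price", "increase", "neighbor"],
]
tf, tfidf = tf_idf(complaints)
for term, weight in sorted(tfidf[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: tf={tf[0][term]:.3f} tf-idf={weight:.3f}")
```

Under this variant, a term that appears in every document receives a TF-IDF weight of zero, so the weight highlights keywords that distinguish one complaint, period, or district from the others.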

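The purpose of this study is to present a model for analysing the spatio-temporal characteristics of complaints about the officially assessed land price based on big data mining. In particular, the study sought the causes of administrative complaints from a spatio-temporal perspective rather than from academic factors, and presented a model for monitoring trends in the occurrence of such complaints over space and time. Information on 6,481 complaints about the officially assessed land price in Jung-gu, Incheon Metropolitan City, from 2006 to 2015 was collected with its temporal and spatial characteristics taken into account and used for the analysis. The frequencies of major keywords were derived using text mining, and the relationships among major keywords were analysed through social network analysis. By calculating TF (term frequency) and TF-IDF (term frequency-inverse document frequency), which relate to keyword weights, the major keywords for the occurrence of complaints about the officially assessed land price were identified. Finally, the spatio-temporal characteristics of the complaints were analysed through hot spot analysis based on the Getis-Ord $G_i^*$ statistic. The results showed that the characteristics of complaints about the officially assessed land price change while forming spatio-temporally linked clusters. Text mining and social network analysis made it possible to identify quantitatively the causes of natural-language-based complaints about the officially assessed land price, and because the keyword weights, term frequency (TF) and the combined term frequency and inverse document frequency value (TF-IDF), show relative differences, they can be used as main explanatory variables for analysing the spatio-temporal characteristics of complaints.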
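The hot spot analysis described in the abstract rests on the Getis-Ord $G_i^*$ statistic. The sketch below computes the standard $G_i^*$ z-score in plain Python. It assumes a binary contiguity weight matrix that includes each zone as its own neighbour; the abstract does not state the study's actual weighting scheme. The function name `getis_ord_gi_star`, the five-zone layout, and the complaint counts are hypothetical.

```python
import math

def getis_ord_gi_star(values, weights):
    """Return the Getis-Ord Gi* z-score for every location.

    values  : list of n observed counts (e.g. complaints per zone)
    weights : n x n spatial weight matrix; weights[i][i] is non-zero
              because Gi* includes the focal location itself.
    Assumes the values are not all equal and that no zone is a
    neighbour of every zone (either case makes the denominator zero).
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the observed values.
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical complaint counts for five zones arranged in a row.
counts = [10, 12, 2, 1, 1]
# Binary contiguity weights: each zone neighbours itself and adjacent zones.
w = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
scores = getis_ord_gi_star(counts, w)
for zone, z in enumerate(scores):
    print(f"zone {zone}: Gi* z-score = {z:.3f}")
```

Large positive z-scores indicate clusters of high complaint counts (hot spots) and large negative z-scores indicate clusters of low counts (cold spots). Repeating the calculation for each year gives one way to follow how such clusters shift over time.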

References

  1. Kim YH. 2007. Social Network Analysis. Pakyoungsa.
  2. Park HJ. 2015. Pattern Analysis of Environment Complaint Using the Spatial Big Data Mining [Thesis]. Incheon National University.
  3. Seo JS. 2015. Data R Love. The Areum.
  4. Lee YW. 2007. Spatiotemporal Hotspot Detection Using G Statistics. Seoul Studies. 8(3):71-83.
  5. Lee HY, Sim JH. 2011. Geographic Information Science. Bobmunsa.
  6. Won TH, Yoo HH. 2016. Pattern Analysis for Civil Complaints of Local Governments Using a Text Mining. Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography. 34(3):319-327. https://doi.org/10.7848/KSGPC.2016.34.3.319
  7. Jung CW. 2013. A New Method of Key Word Extraction from Foresight Based on Textmining, Complexity Network Analysis, and Internet Big Data [dissertation]. Hanyang University.
  8. Cho TI. 2016. Spatiotemporal Characteristic Analysis by Big Data Mining [dissertation]. Incheon National University.
  9. Choi BG, Na YW, Hyeon CS, Cho TI. 2018. A Study on the Satisfaction Analysis on Officially Assessed Land Price Using Time Seriate Geostatistical Analysis. Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography. 36(2):95-104. https://doi.org/10.7848/KSGPC.2018.36.2.95
  10. National Information Society Agency. 2013. Big Data Era Opening New Future.
  11. Hong JY. 2013. Development of Traffic Accident Frequency Prediction Models with Spatial Econometrics Analysis in Urban Areas [dissertation]. University of Seoul.
  12. Borgatti SP, Everett MG, Freeman LC. 2002. Ucinet for Windows: software for social network analysis. Analytic Technologies.
  13. Delen D. 2015. Real-world data mining: applied business analytics and decision making. 1st edition. Pearson Education.
  14. Ord JK, Getis A. 1995. Local spatial autocorrelation statistics: distributional issues and an application. Geographical Analysis. 27(4):286-306. https://doi.org/10.1111/j.1538-4632.1995.tb00912.x
  15. Moran P. 1950. Notes on continuous stochastic phenomena. Biometrika. 37:17-23. https://doi.org/10.1093/biomet/37.1-2.17
  16. Scott J. 2000. Social network analysis. SAGE Publications.
  17. Tobler W. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography. 46(2):234-240. https://doi.org/10.2307/143141
  18. Upton GJG. 1990. Information from regional data. Spatial Statistics: Past, Present and Future, Monograph. 12:315-359.

Cited by

  1. 공시지가의 형평성에 관한 연구 - 서울특별시를 중심으로 - vol.50, pp.2, 2020, https://doi.org/10.22640/lxsiri.2020.50.2.133