http://dx.doi.org/10.13088/jiis.2017.23.3.095

A Study on the Clustering Method of Row and Multiplex Housing in Seoul Using K-Means Clustering Algorithm and Hedonic Model  

Kwon, Soonjae (Business Administration, Daegu University)
Kim, Seonghyeon (National Information Society Agency(NIA))
Tak, Onsik (KN Company)
Jeong, Hyeonhee (Business Administration, Daegu University)
Publication Information
Journal of Intelligence and Information Systems / v.23, no.3, 2017, pp. 95-118
Abstract
Recently, transactions of row housing and multiplex housing have become more active, mainly in downtown areas, and platform services such as Zigbang and Dabang are growing. Even so, row housing and multiplex housing remain a blind spot for real estate information. This has become a social problem, because changes in demand have altered the market size and created information asymmetry. In addition, the 5 or 25 districts used by the Seoul Metropolitan Government or the Korean Appraisal Board (hereafter, KAB) were drawn along administrative boundaries, and existing real estate studies have used them as they are. Because these districts were zoned for urban planning, they are not a classification suited to real estate research. Building on existing studies, this study finds that the spatial structure of Seoul needs to be redefined in order to estimate future housing prices. This study therefore attempts to classify areas free of spatial heterogeneity by reflecting the property price characteristics of row housing and multiplex housing. In other words, simply dividing the city by existing administrative districts has proved inefficient, so this study aims to cluster Seoul into new areas for more efficient real estate analysis. The hedonic model was applied to real transaction price data for row housing and multiplex housing, and the K-Means Clustering algorithm was used to cluster the spatial structure of Seoul. The data consist of real transaction prices of row housing and multiplex housing in Seoul from January 2014 to December 2016 and the official land values of 2016, provided by the Ministry of Land, Infrastructure and Transport (hereafter, MOLIT). Data preprocessing followed three steps: removal of underground transactions, standardization of price per area, and removal of real transaction cases above 5 or below -5.
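The three preprocessing steps can be illustrated with a minimal Python sketch (the study itself used R). Several points here are assumptions, not details taken from the abstract: the record keys `price`, `area`, and `floor` are hypothetical, "underground" is read as a negative floor number, "standardization" is read as a z-score of the price per unit area, and the limits 5 and -5 are read as bounds on that z-score.

```python
from statistics import mean, pstdev


def preprocess(transactions, z_limit=5.0):
    """Filter and standardize transaction records (illustrative sketch only).

    Each record is a dict with the hypothetical keys 'price', 'area', 'floor'.
    """
    # Step 1 (assumed reading): drop underground transactions, i.e. floor < 0,
    # and compute the price per unit area for the remaining records.
    rows = [
        dict(t, unit_price=t["price"] / t["area"])
        for t in transactions
        if t["floor"] >= 0
    ]
    if not rows:
        return []

    # Step 2 (assumed reading): standardize the unit price as a z-score.
    unit_prices = [r["unit_price"] for r in rows]
    mu = mean(unit_prices)
    sigma = pstdev(unit_prices)

    # Step 3 (assumed reading): keep only cases with -z_limit <= z <= z_limit.
    kept = []
    for r in rows:
        z = 0.0 if sigma == 0 else (r["unit_price"] - mu) / sigma
        if -z_limit <= z <= z_limit:
            kept.append(dict(r, z_price=z))
    return kept
```

The function returns new dictionaries and leaves the input untouched, so the raw and cleaned case counts can be compared afterwards.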
Preprocessing reduced the data from 132,707 cases to 126,759 cases, which were then analyzed. R was used as the data analysis tool. After preprocessing, the data model was constructed. First, K-Means Clustering was performed. Next, a regression analysis using the hedonic model and a cosine similarity analysis were conducted. Based on the constructed data model, clustering was performed on longitude and latitude within Seoul, and the result was compared with the existing areas. The results indicate that the goodness of fit of the model was above 75% and that the variables used in the hedonic model were significant. As a result, the 5 or 25 districts of the existing administrative division were reclassified into 16 districts. This study thus derives a clustering method for row housing and multiplex housing in Seoul that uses the K-Means Clustering algorithm and the hedonic model and reflects property price characteristics. It also presents academic and practical implications, the limitations of the study, and directions for future research. One academic implication is that areas were clustered by reflecting property price characteristics, which addresses the problems of the districts used by the Seoul Metropolitan Government, KAB, and existing real estate research. Another is that, whereas existing real estate research has focused mainly on apartments, this study proposes a method of classifying areas in Seoul using public information from Government 3.0 (i.e., MOLIT real transaction data). One practical implication is that the results can serve as basic data for real estate research on row housing and multiplex housing. Another is that the study is expected to stimulate research on row housing and multiplex housing and to increase the accuracy of models of actual transaction prices.
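As an illustration of clustering on longitude and latitude, the following is a minimal pure-Python sketch of Lloyd's K-means iteration. It is not the authors' R code. The function name `kmeans`, the seeded random initialization, and the use of plain Euclidean distance on raw degrees are assumptions made for the example; the study reports 16 clusters, while the example uses a small `k` for readability.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster (longitude, latitude) pairs with Lloyd's algorithm (sketch)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct input points as starting centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance on raw degrees, an approximation).
        labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # Keep the previous centroid if a cluster becomes empty.
                new_centroids.append(centroids[j])

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return labels, centroids
```

Calling `kmeans(coords, k=16)` on a list of `(longitude, latitude)` tuples would return one cluster label per transaction plus the 16 centroids. Because K-means depends on its starting centroids, results can differ between seeds.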
Future research should conduct a variety of analyses to overcome the limitations of the threshold and should examine the topic in greater depth.
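The hedonic regression mentioned in the abstract can be sketched as an ordinary least squares fit of price on property attributes. The abstract does not state the functional form or the variables, so the semi-log form ln(price) = b0 + b1*x1 + ... + bk*xk, the helper names `solve_linear_system` and `fit_hedonic`, and the reading of "goodness of fit" as R-squared are all assumptions of this example.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_hedonic(features, prices):
    """Fit ln(price) on the features by OLS; return (coefficients, R^2).

    The first coefficient is the intercept. The semi-log form is an assumption.
    """
    X = [[1.0] + list(row) for row in features]
    y = [math.log(p) for p in prices]
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    # R^2 = 1 - SS_res / SS_tot.
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot
```

In practice the attribute columns (for example floor area or building age, both hypothetical here) would come from the transaction data, and a statistics package would also report the significance of each coefficient, which this sketch omits.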
Keywords
Classification of Clusters; Row Housing; Multiplex Housing; Hedonic Model; K-Means Clustering Algorithm
29 Kwon, J. W., and H.C. Kim. "Estimation of Housing Price Index using a Varying Parameter Model." Journal of the Korean Urban Management Association, Vol. 19, No. 1 (2006), 175-200.
30 Lee, S. W., and J. Y. Kim, "Transactions Clustering based on Item Similarity", Journal of Intelligence and Information Systems, Vol. 9, No. 1 (2003), 179-193.
31 Lee, S. W. and W. H. Lee, "Refining Initial Seeds using Max Average Distance for K-Means Clustering." Journal of Korean Society for Internet Information, Vo.12, No. 2 (2011), 103-111.
32 Lee, S. W., "Comparison of Initial Seeds Methods for K-Means Clustering." Journal of Korean Society for Internet Information, Vol. 13, No. 6 (2012), 1-8.
33 Lee, G., and K. Kim, "A Study on the Spatial Mismatch between the Assessed Land Value and Housing Market Price: Exploring the Scale Effect of the MAUP." Journal of the Korean Geographical Society, Vol. 48, No. 6 (2013), 879-896.
34 Lee, Y. M, "A Review of the Hedonic Price Model." Journal of the Korea Real Estate Analysis Association, Vol. 14, No. 1 (2008), 81-87.
35 Leonard, T., T. M. Powell-Wiley., C. Ayers., J. C. Murdoch, W. Yin, and S. L. Pruitt, "Property Values as a Measure of Neighborhoods: An Application of Hedonic Price Theory." Epidemiology, Vol. 27, No. 4 (2016), 518-524.   DOI
36 Meghani, S. H., and G. J. Knafl., "Salient Concerns in Using Analgesia for Cancer Pain among Outpatients: A Cluster Analysis Study." World Journal of Clinical Oncology, Vol. 8, No. 1 (2017), 75.   DOI
37 Lloyd, S., "Least Squares Quantization in PCM," IEEE Transactions on Information Theory, Vol. 28, No. 2 (1982), 129-137.   DOI
38 MacQueen, J., "Some Methods for Classification and Analysis of Multivariate Observations," Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1. No. 14 (1967), 281-297.
39 Malpezzi, S., "Hedonic Pricing Models: a Selective and Applied Review." In: O'Sullivan, T., Gibb, K. (Eds.), Housing Economics and Public Policy., Blackwell, Oxford, UK, 2002, 67-89.
40 Na, M. Y., A Technique to Extract Useful Knowledge from a Large Knowledge Database., Data Base World, 1997.
41 Nam, J. and J. H.Kim. "An Analysis of Factor Influencing on the Choice of Housing Types and Tenure by Income Bracket in Seoul" Journal of the Korean Urban Management Association, Vol. 28, No. 2 (2015), 199-222.
42 Brachman, R. J., and T. Anand., The Process of Knowledge Discovery in Databases., Advances in Knowledge Discovery and Data Mining, 1996
43 Arthur, D. and S. Vassilvitskii. "How Slow is the K-means Method?." Proceedings of the Twenty-Second Annual Symposium on Computational Geometry. ACM (2006), 144-153.
44 Anderberg, M. R., Cluster Analysis for Applications. Monographs and Textbooks on Probability and 15 Mathematical Statistics., in Academic Press, Inc., New York, 1973.
45 Berry, M. J. A and G. S. Linoff, Data Mining Techniques for Marketing, Sales and Customer Relationship Management, Third Edition, John Wiley & Sons Inc, 2011.
46 Pejman, A., G. N. Bidhendi, M. Ardestani, M. Saeedi, and A. Baghvand, " Fractionation of Heavy Metals in Sediments and Assessment of their Availability Risk: A Case Study in the Northwestern of Persian Gulf." Marine Pollution Bulletin, Vol. 114 No. 2 (2017), 881-887.   DOI
47 Park, D. H., H. K. Kim, I. Y. Choi, and J. K. Kim, "A Literature Review and Classification of Recommender Systems on Academic Journals", Journal of Intelligence and Information Systems, Vol. 17, No. 1 (2011), 139-152.
48 Benfratello, L., M. Piacenza, and S. Sacchetto. "Taste or Reputation: What Drives Market Prices in the Wine Industry? Estimation of a Hedonic Model for Italian Premium Wines." Applied Economics, Vol. 41, No. 17 (2009), 2197-2209.   DOI