http://dx.doi.org/10.7848/ksgpc.2017.35.5.415

A Combinatorial Optimization for Influential Factor Analysis: a Case Study of Political Preference in Korea  

Yun, Sung Bum (Dep. of Civil Engineering, Yonsei University)
Yoon, Sanghyun (Dep. of Civil Engineering, Yonsei University)
Heo, Joon (Dep. of Civil Engineering, Yonsei University)
Publication Information
Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35, no.5, 2017, pp. 415-422
Abstract
Identifying the influential factors behind a given clustering result is a typical data science problem. This study proposes a Genetic Algorithm (GA)-based method for deriving influential factors and compares its performance with two conventional methods, Classification and Regression Tree (CART) and Chi-Squared Automatic Interaction Detection (CHAID), using Dunn's index as the measure. To extract the factors that influence preference for political parties in South Korea, the voting results of the 18th presidential election were used together with 'Demographic', 'Health and Welfare', 'Economic', and 'Business' related data. Based on the analysis, a reverse engineering approach was implemented. A reverse engineering-based approach to influential factor analysis can provide a new set of influential variables, which can offer new insight to the data mining field.
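The idea summarized in the abstract, searching over subsets of variables with a genetic algorithm and scoring each subset by how well it separates the given clusters under Dunn's index, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`dunn_index`, `project`, `fitness`, `ga_select`), the toy data, and the GA operators (truncation selection with elitism, one-point crossover, bit-flip mutation) are all assumptions chosen for brevity. Dunn's index is taken here in its common form, the smallest distance between points of different clusters divided by the largest within-cluster diameter.

```python
import math
import random
from itertools import combinations


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    # Largest pairwise distance inside any single cluster (its diameter).
    max_diam = max(
        (math.dist(a, b) for c in clusters.values() for a, b in combinations(c, 2)),
        default=0.0,
    )
    # Smallest distance between two points that belong to different clusters.
    min_sep = min(
        math.dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    return min_sep / max_diam if max_diam > 0 else math.inf


def project(rows, mask):
    """Keep only the columns whose mask bit is 1."""
    return [tuple(v for v, keep in zip(row, mask) if keep) for row in rows]


def fitness(rows, labels, mask):
    """Dunn's index of the fixed clustering, measured on the selected columns."""
    if not any(mask):
        return 0.0  # an empty variable set carries no information
    return dunn_index(project(rows, mask), labels)


def ga_select(rows, labels, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Hypothetical GA over 0/1 masks; assumes at least two candidate columns."""
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] * n  # seed the baseline "use every variable" mask
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(rows, labels, m), reverse=True)
        elite = pop[: pop_size // 2]  # elitism: the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(rows, labels, m))


# Toy data (made up): column 0 separates the two clusters, columns 1-2 are noise.
rows = [
    (0.0, 5.0, 1.0), (0.2, 1.0, 9.0), (0.1, 8.0, 4.0),
    (5.0, 2.0, 8.0), (5.2, 7.0, 2.0), (5.1, 4.0, 5.0),
]
labels = [0, 0, 0, 1, 1, 1]
best_mask = ga_select(rows, labels)
print(best_mask, fitness(rows, labels, best_mask))
```

Because the better half of the population is carried over unchanged in every generation, the returned mask never scores worse than the all-variables baseline seeded at the start. In a real study the rows would be regional indicators and the labels the cluster assignments derived from the vote results, and the distance measure and GA settings would need to be chosen to match the data.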
Keywords
Reverse Engineering; Influential Factor Analysis; Classification and Regression Tree (CART); Chi-Squared Automatic Interaction Detection (CHAID); Genetic Algorithm (GA)