http://dx.doi.org/10.6109/jkiice.2020.24.7.827

Prediction Model for Unpaid Customers Using Big Data  

Jeong, Jaean (Department of Computer Engineering, Paichai University)
Lee, Kyouhwan (Department of Computer Engineering, Paichai University)
Jung, Hoekyung (Department of Computer Engineering, Paichai University)
Abstract
To help local governments reduce their unpaid rate, this paper identifies the internal Water-INFOS data elements that affect arrears through interviews with meter readers in selected local governments. Candidate variables affecting arrears were also derived from national statistical data. The influence of each independent variable on the dependent variable was measured with information gain, which quantifies how much a variable reduces the disorder (entropy) of the dependent variable in the data set. We then compared the prediction rates of a decision tree and logistic regression using n-fold cross-validation. The results confirmed that the decision tree finds customer payment patterns more accurately than logistic regression. While developing the machine-learning analysis model, we derived the optimal values of two parameters, the minimum number of data and the maximum purity, which directly affect the complexity and accuracy of the decision tree, in order to improve the accuracy of the algorithm.
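As an illustration of the information gain measure mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the toy customer records, and the feature names (`housing`, `usage_band`) are all hypothetical and chosen only to show how a variable that reduces the disorder of the payment label scores higher than one that does not.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting
    the records by the value of one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not data from the paper).
paid = ["unpaid", "unpaid", "paid", "paid", "paid", "paid"]
housing = ["rent", "rent", "own", "own", "own", "own"]        # separates the classes
usage_band = ["low", "high", "low", "high", "low", "high"]    # carries no signal

gain_housing = information_gain(housing, paid)
gain_usage = information_gain(usage_band, paid)

print(f"gain(housing)    = {gain_housing:.4f}")
print(f"gain(usage_band) = {gain_usage:.4f}")
```

In this toy data, `housing` splits the customers into two pure groups, so its gain equals the full entropy of the payment label, while `usage_band` leaves each group as mixed as the whole set and gains nothing. A ranking of this kind is one way to decide which candidate variables to feed into a decision tree.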
Keywords
Big data analysis; Unpaid; Decision tree; Local waterworks