http://dx.doi.org/10.5392/IJoC.2021.17.4.079

A Study on the Prediction of Community Smart Pension Intention Based on Decision Tree Algorithm  

Liu, Lijuan (Division of Information Technology Engineering, Mokwon University)
Min, Byung-Won (Division of Information and Communication Convergence Engineering, Mokwon University)
Abstract
As population aging deepens, pension has become an urgent problem in most countries. Community smart pension can effectively address the shortcomings of traditional pension and meet the personalized, multi-level needs of the elderly. To predict the pension intention of the elderly in the community more accurately, this paper applies decision tree classification to pension data. After missing-value processing, normalization, discretization, and data specification, a discretized sample data set is obtained. The features are then ranked by comparing their information gain and information gain ratio, and a C4.5 decision tree model is built. Under 10-fold cross-validation, the model performs well on accuracy, precision, recall, AUC, and other indicators, with a precision of 89.5%, and can provide a basis for government decision-making.
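The feature-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the features are already discretized and computes information gain and the gain ratio (information gain divided by split information, the criterion C4.5 uses to choose splits). The feature names `lives_alone` and `age_band` and the eight toy records are hypothetical and are not taken from the paper's data set.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature_values, labels):
    """Information gain normalized by the split information of the feature."""
    split_info = entropy(feature_values)
    if split_info == 0:  # feature takes a single value, so it cannot split
        return 0.0
    return information_gain(feature_values, labels) / split_info


# Hypothetical toy records: does the respondent intend to use the service?
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
features = {
    "lives_alone": ["y", "y", "n", "n", "y", "n", "y", "n"],
    "age_band": ["60s", "70s", "60s", "70s", "80s", "80s", "60s", "70s"],
}

# Rank features from most to least informative by gain ratio.
ranking = sorted(features, key=lambda f: gain_ratio(features[f], labels), reverse=True)
for name in ranking:
    print(name, information_gain(features[name], labels), gain_ratio(features[name], labels))
```

Dividing by split information penalizes features with many distinct values, which is why comparing both measures can produce a different ranking than information gain alone.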
Keywords
Decision Tree; Prediction Model; C4.5 Algorithm; Community Smart Pension; 10-fold Cross-Validation