Proceedings of the Korean Statistical Society Conference
- Proceedings of the Korean Statistical Society 2003 Spring Conference
- Pages 237-242
- 2003
Comparing Accuracy of Imputation Methods for Incomplete Categorical Data
- Shin, Hyung-Won (Dept. of Computer Science & Industrial Systems Engineering, Yonsei University)
- Sohn, So-Young (Dept. of Computer Science & Industrial Systems Engineering, Yonsei University)
- Published: 2003.05.23
Abstract
Various estimation methods have been developed for imputing missing categorical data. These include the modal category method, logistic regression, and association rules. In this study, we propose two imputation methods (neural network fusion and voting fusion) that combine the results of the individual imputation methods. A Monte Carlo simulation is used to compare the performance of these methods. The five factors used to simulate the missing data are (1) the true model for the data, (2) data size, (3) noise size, (4) percentage of missing data, and (5) missing pattern. Overall, neural network fusion performed best; voting fusion outperformed the individual imputation methods but was inferior to neural network fusion. The results of an additional real-data analysis confirm the simulation results.
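
The abstract does not say how the votes are combined, so the following is only a minimal sketch of what a voting fusion step could look like, assuming a simple majority vote over the categories proposed by the individual methods. The function names (`modal_category`, `voting_fusion`), the tie-breaking rule, and the placeholder guesses standing in for logistic regression and association rules are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def modal_category(observed):
    """Return the most frequent category among the observed (non-missing) values."""
    return Counter(observed).most_common(1)[0][0]


def voting_fusion(candidates):
    """Majority vote over the categories proposed by the individual methods.

    Assumption: ties are broken in favour of the method listed first.
    """
    counts = Counter(candidates)
    best = max(counts.values())
    for value in candidates:
        if counts[value] == best:
            return value


# Observed values of one categorical variable (toy data, not from the paper).
observed = ["A", "B", "A", "C", "A"]

modal_guess = modal_category(observed)
logistic_guess = "B"  # hypothetical output of a logistic regression imputer
rule_guess = "B"      # hypothetical output of an association rule imputer

fused = voting_fusion([modal_guess, logistic_guess, rule_guess])
print(modal_guess, fused)
```

In this toy case the modal category method proposes `A` while the two placeholder methods propose `B`, so the vote selects `B`. A neural network fusion would presumably replace the vote with a learned combination of the same candidate imputations, but the abstract gives no details on its architecture.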