http://dx.doi.org/10.29220/CSAM.2019.26.5.507

Statistical micro matching using a multinomial logistic regression model for categorical data  

Kim, Kangmin (Department of Statistics, Korea University)
Park, Mingue (Department of Statistics, Korea University)
Publication Information
Communications for Statistical Applications and Methods / v.26, no.5, 2019, pp. 507-517
Abstract
Statistical matching is a method of combining multiple data sources that are extracted or surveyed from the same population. It can be used when the variables of interest are not jointly observed. Because it creates synthetic data from existing sources, it is a low-cost approach that can yield substantial benefits. In this paper, we propose several statistical micro matching methods using a multinomial logistic regression model for the case in which all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), we propose a mixed statistical matching method, which is useful when auxiliary information is not available. We also propose a statistical matching method with auxiliary information that reduces the bias of the conventional matching methods suggested under the CIA. The proposed micro matching methods are compared with conventional ones through a simulation study. The simulation study shows that the proposed matching methods outperform the existing ones, especially when the CIA does not hold.
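As a rough illustration of the general idea only, and not the methods proposed in the paper, the sketch below shows one simple way micro matching under the CIA might look for categorical data: a multinomial logistic regression of Z on the common variable X is fitted in a donor file (where X and Z are observed), and Z is then imputed in a recipient file (where X and Y are observed) by drawing from the fitted probabilities. All function names, the synthetic data, the sample sizes, and the plain gradient-descent fit are assumptions made for this example.

```python
import numpy as np

def one_hot(codes, n_levels):
    """Encode integer category codes as a 0/1 indicator matrix."""
    out = np.zeros((len(codes), n_levels))
    out[np.arange(len(codes)), codes] = 1.0
    return out

def predict_proba(W, X):
    """Softmax probabilities P(Z = k | X) for coefficient matrix W."""
    Xd = np.column_stack([np.ones(len(X)), X])    # add intercept column
    scores = Xd @ W
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

def fit_multinomial_logit(X, z, n_classes, lr=0.5, n_iter=2000):
    """Fit a multinomial logit by plain gradient descent (illustrative only)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    Z = one_hot(z, n_classes)
    W = np.zeros((Xd.shape[1], n_classes))
    for _ in range(n_iter):
        P = predict_proba(W, X)
        W -= lr * Xd.T @ (P - Z) / len(X)         # gradient of mean neg. log-lik.
    return W

def impute_categories(W, X, rng):
    """Draw one category per row from the fitted conditional distribution."""
    P = predict_proba(W, X)
    cum = P.cumsum(axis=1)
    u = rng.random((len(X), 1))
    return np.minimum((u > cum).sum(axis=1), P.shape[1] - 1)

# Synthetic example data (assumed, not from the paper).
rng = np.random.default_rng(0)
n_levels_x, n_levels_z = 3, 3
true_p = np.array([[0.7, 0.2, 0.1],
                   [0.2, 0.6, 0.2],
                   [0.1, 0.2, 0.7]])              # P(Z | X) used to simulate

# Donor file B: X and Z are observed.
x_b = rng.integers(0, n_levels_x, size=500)
z_b = np.array([rng.choice(n_levels_z, p=true_p[x]) for x in x_b])

# Recipient file A: X and Y are observed, Z is missing.
x_a = rng.integers(0, n_levels_x, size=300)
y_a = rng.integers(0, 2, size=300)

# Use the first level of X as the reference category.
W = fit_multinomial_logit(one_hot(x_b, n_levels_x)[:, 1:], z_b, n_levels_z)
z_a = impute_categories(W, one_hot(x_a, n_levels_x)[:, 1:], rng)

# Synthetic micro data set with columns (X, Y, imputed Z).
synthetic = np.column_stack([x_a, y_a, z_a])
```

Because the imputed Z depends on the recipient record only through X, the resulting file reflects the CIA by construction; when Y and Z are in fact dependent given X, this kind of imputation would be expected to misrepresent their joint distribution, which is the situation the abstract's auxiliary-information method is aimed at.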
Keywords
statistical matching; multinomial logistic regression model; conditional independence assumption; auxiliary information