• Title/Abstract/Keywords: Statistical matching method

Statistical micro matching using a multinomial logistic regression model for categorical data

  • Kim, Kangmin;Park, Mingue
    • Communications for Statistical Applications and Methods / Vol. 26, No. 5 / pp. 507-517 / 2019
  • Statistical matching is a method of combining multiple sources of data that are extracted or surveyed from the same population. It can be used in situations where the variables of interest are not jointly observed. Because it creates synthetic data from existing sources, it is a low-cost way to obtain substantial benefit. In this paper, we propose several statistical micro matching methods using a multinomial logistic regression model for the case where all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), a mixed statistical matching method, which is useful when auxiliary information is not available, is proposed. We also propose a statistical matching method with auxiliary information that reduces the bias of the conventional matching methods suggested under the CIA. Through a simulation study, the proposed micro matching methods are compared with conventional ones. The simulation study shows that the proposed matching methods outperform the existing ones, especially when the CIA does not hold.
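
The basic idea of matching categorical data under the CIA can be sketched as follows. This is a minimal illustration, not the authors' procedure: it fits a multinomial logistic (softmax) regression of Z on a common variable X in a donor file by plain gradient descent, then draws Z for each recipient record from the fitted P(Z | X). The toy data, the function names, and the learning-rate settings are all assumptions made for the example; the mixed method and the auxiliary-information variant described in the abstract are not covered.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(value, levels):
    # Encode a categorical value as a 0/1 indicator vector.
    return [1.0 if value == level else 0.0 for level in levels]

def predict_proba(W, x):
    # Fitted P(Z = c | x) for every class c.
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def fit_multinomial_logit(X, z, n_classes, lr=0.5, epochs=500):
    # Batch gradient descent on the multinomial log-likelihood.
    n, n_features = len(X), len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, target in zip(X, z):
            p = predict_proba(W, x)
            for c in range(n_classes):
                err = p[c] - (1.0 if c == target else 0.0)
                for j in range(n_features):
                    grad[c][j] += err * x[j]
        for c in range(n_classes):
            for j in range(n_features):
                W[c][j] -= lr * grad[c][j] / n
    return W

# Hypothetical toy files: the donor observes (X, Z), the recipient only X.
x_levels = ["low", "high"]
donor = [("low", 0), ("low", 0), ("low", 1), ("low", 2),
         ("high", 0), ("high", 1), ("high", 2), ("high", 2)]
recipient_x = ["low", "high", "high", "low"]

W = fit_multinomial_logit([one_hot(x, x_levels) for x, _ in donor],
                          [z for _, z in donor], n_classes=3)

# Impute Z in the recipient file by drawing from the fitted P(Z | X).
rng = random.Random(0)
imputed_z = []
for x in recipient_x:
    p = predict_proba(W, one_hot(x, x_levels))
    imputed_z.append(rng.choices(range(3), weights=p, k=1)[0])
```

With a single categorical predictor and indicator coding, the fitted probabilities approach the observed conditional frequencies in the donor file; the regression form matters once several common variables are combined.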

Statistical Matching Techniques Using the Robust Regression Model (로버스트 회귀모형을 이용한 자료결합방법)

  • Jhun, Myoung-Shic;Jung, Ji-Song;Park, Hye-Jin
    • The Korean Journal of Applied Statistics / Vol. 21, No. 6 / pp. 981-996 / 2008
  • Statistical matching techniques aim to achieve a complete data file from different sources. Since the statistical matching method proposed by Rubin (1986) assumes multivariate normality of the data, applying it to data that violate this assumption can be problematic. This research proposes a statistical matching method that uses robust regression as an alternative to linear regression. Furthermore, we carry out a simulation study to compare the performance of the robust regression model and the linear regression model for statistical matching.

A Robust Approach of Regression-Based Statistical Matching for Continuous Data

  • Sohn, Soon-Cheol;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / Vol. 25, No. 2 / pp. 331-339 / 2012
  • Statistical matching is a methodology used to merge microdata from two (or more) files into a single matched file, the variants of which have been extensively studied. Among existing studies, we focused on Moriarity and Scheuren's (2001) method, which is a representative method of statistical matching for continuous data. We examined this method and proposed a revision to it by using a robust approach in the regression step of the procedure. We evaluated the efficiency of our revised method through simulation studies using both simulated and real data, which showed that the proposed method has distinct advantages over existing alternatives.

A New Statistical Linearization Technique of Nonlinear System (비선형시스템의 새로운 통계적 선형화방법)

  • Lee, Jang-Gyu;Lee, Yeon-Seok
    • Proceedings of the KIEE Conference / Proceedings of the KIEE 1990 Summer Conference / pp. 72-76 / 1990
  • A new statistical linearization technique for nonlinear systems, called the covariance matching method, is proposed in this paper. The covariance matching method makes the mean and variance of the approximated output identical to those of the real functional output, and makes the distribution of the approximated output have the same shape as that of a given random input. The covariance matching method can also be easily implemented for statistical analysis of nonlinear systems in combination with linear system covariance analysis.

A Statistical Matching Method with k-NN and Regression

  • Chung, Sung-S.;Kim, Soon-Y.;Lee, Seung-S.;Lee, Ki-H.
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 4 / pp. 879-890 / 2007
  • Statistical matching is a method of data integration for data sources that do not share the same units. It can rapidly produce a large amount of new information at low cost and reduce the response burden that affects data quality. This paper proposes a statistical matching technique combining k-NN (k-nearest neighbor) and regression methods. For a specific observation in a recipient file, we select the k records in a donor file whose values of the common variable are similar to that observation's, and estimate an imputation value for the recipient file using regression modeling in the donor file. An empirical comparison study is conducted to show the properties of the proposed method.
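
The selection-then-regression step described above might look like the following sketch. It assumes a single numeric common variable x, absolute difference as the similarity measure, and simple least-squares regression of the fusion variable z on x within the k selected donors; the function name and the toy donor file are hypothetical, and the paper's actual distance and model choices may differ.

```python
def knn_regression_impute(donors, x_recipient, k):
    """Impute z for one recipient record.

    donors: list of (x, z) pairs from the donor file.
    x_recipient: value of the common variable in the recipient file.
    """
    # Step 1: keep the k donor records closest to the recipient on x.
    nearest = sorted(donors, key=lambda d: abs(d[0] - x_recipient))[:k]
    xs = [d[0] for d in nearest]
    zs = [d[1] for d in nearest]
    m = len(nearest)
    mean_x = sum(xs) / m
    mean_z = sum(zs) / m
    # Step 2: fit z = a + b * x by least squares on those donors only.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All selected donors share the same x: fall back to their mean z.
        return mean_z
    slope = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / sxx
    # Step 3: predict z at the recipient's x.
    return mean_z + slope * (x_recipient - mean_x)

# Hypothetical donor file in which z = 2x + 1 holds exactly.
donor_file = [(1, 3), (2, 5), (3, 7), (4, 9), (10, 21)]
imputed = knn_regression_impute(donor_file, x_recipient=2.5, k=3)
```

Fitting the regression only on the selected neighbors keeps the imputation local, so a distant donor such as (10, 21) has no influence on the value imputed at x = 2.5.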

Adaptive Matching Method of Rigid and Deformable Object Image using Statistical Analysis of Matching-pairs (정합 쌍의 통계적 분석을 이용한 정형/비정형 객체 영상의 적응적 정합 방법)

  • Won, In-Su;Yang, Hun-Jun;Jang, Hyeok;Jeong, Dong-Seok
    • Journal of the Institute of Electronics and Information Engineers / Vol. 52, No. 1 / pp. 102-110 / 2015
  • In this paper, an adaptive matching method that uses the same features for rigid and deformable object images is proposed. First, we determine whether the two images match using geometric verification and generate the matching information. A decision boundary that separates deformable matching pairs from non-matching pairs is then obtained through statistical analysis of the matching information. The experimental results show that the proposed method lowers the computational complexity and increases the matching accuracy compared to the existing method.

A Study on a Statistical Matching Method Using Clustering for Data Enrichment

  • Kim, Soon Y.;Lee, Ki H.;Chung, Sung S.
    • Communications for Statistical Applications and Methods / Vol. 12, No. 2 / pp. 509-520 / 2005
  • Data fusion is defined as the process of combining data and information from different sources in order to use the information they contain more effectively. In this paper, we propose a data fusion algorithm using the k-means clustering method for data enrichment, to improve data quality in the knowledge discovery in databases (KDD) process. An empirical study comparing the proposed data fusion technique with existing techniques shows that the proposed clustering-based technique has a low MSE for continuous fusion variables.

Statistical Fingerprint Recognition Matching Method with an Optimal Threshold and Confidence Interval

  • Hong, C.S.;Kim, C.H.
    • The Korean Journal of Applied Statistics / Vol. 25, No. 6 / pp. 1027-1036 / 2012
  • Among various biometric recognition systems, we consider statistical fingerprint recognition matching methods that use minutiae on fingerprints. We define similarity distance measures based on the coordinates and angles of the minutiae, and suggest a fingerprint recognition model following statistical distributions. We obtain confidence intervals of the similarity distance for the same and different persons, and optimal thresholds that minimize two kinds of error rates for the distance distributions. We find that the two confidence intervals for the same and different persons do not overlap and that the optimal threshold lies between the two confidence intervals. Hence an alternative statistical matching method can be suggested by using the non-overlapping confidence intervals and optimal thresholds obtained from the distributions of similarity distances.
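
A threshold that balances the two error rates can be illustrated with a small empirical search. This sketch makes several assumptions not stated in the abstract: a smaller distance means a closer match, the two error rates are the false rejection rate (same-person pairs above the threshold) and the false acceptance rate (different-person pairs at or below it), and the criterion is their unweighted sum. The paper works with fitted distance distributions, whereas this example uses made-up sample distances directly.

```python
def best_threshold(same, different):
    """Return (threshold, total_error) minimizing FRR + FAR.

    same: distances between prints of the same person.
    different: distances between prints of different persons.
    """
    values = sorted(set(same) | set(different))
    # Candidate thresholds: midpoints between consecutive observed distances.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t, best_err = None, float("inf")
    for t in candidates:
        frr = sum(d > t for d in same) / len(same)             # false rejections
        far = sum(d <= t for d in different) / len(different)  # false acceptances
        if frr + far < best_err:
            best_t, best_err = t, frr + far
    return best_t, best_err

# Hypothetical distances with no overlap between the two groups.
same_person = [0.10, 0.20, 0.25, 0.30]
different_person = [0.60, 0.70, 0.80, 0.90]
threshold, error = best_threshold(same_person, different_person)
```

When the two groups of distances do not overlap, as in this toy data, the selected threshold falls in the gap between them and both error rates are zero, which mirrors the situation the abstract describes for non-overlapping confidence intervals.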

Local-Based Iterative Histogram Matching for Relative Radiometric Normalization

  • Seo, Dae Kyo;Eo, Yang Dam
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / Vol. 37, No. 5 / pp. 323-330 / 2019
  • Radiometric normalization with multi-temporal satellite images is essential for time series analysis and change detection. Generally, relative radiometric normalization, which is an image-based method, is performed, and histogram matching is a representative method for normalizing the non-linear properties. However, since it utilizes global statistical information only, local information is not considered at all. Thus, this paper proposes a histogram matching method considering local information. The proposed method divides histograms based on density, mean, and standard deviation of image intensities, and performs histogram matching locally on the sub-histogram. The matched histogram is then further partitioned and this process is performed again, iteratively, controlled by the Wasserstein distance. Finally, the proposed method is compared to global histogram matching. The experimental results show that the proposed method is visually and quantitatively superior to the conventional method, which indicates the applicability of the proposed method to the radiometric normalization of multi-temporal images with non-linear properties.
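
For orientation, the global histogram matching that serves as the baseline here can be sketched as a mapping between empirical CDFs, together with a one-dimensional Wasserstein distance of the kind that could act as a stopping measure. This is only an illustration of the conventional global method on flat lists of pixel intensities; the local partitioning by density, mean, and standard deviation is not reproduced, and the function names and pixel values are assumptions.

```python
import bisect
import math

def histogram_match(source, reference):
    """Map each source intensity to the reference intensity
    at the same empirical CDF position."""
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    matched = []
    for v in source:
        # Empirical CDF of v within the source image, in (0, 1].
        cdf = bisect.bisect_right(src_sorted, v) / n_src
        # Smallest reference value whose empirical CDF reaches that level.
        idx = min(n_ref - 1, max(0, math.ceil(cdf * n_ref) - 1))
        matched.append(ref_sorted[idx])
    return matched

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size samples:
    the mean absolute difference of their sorted values."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

# Hypothetical pixel intensities from a subject and a reference image.
source = [30, 10, 40, 20]
reference = [110, 130, 100, 120]
matched = histogram_match(source, reference)
```

Before matching, the distance between `source` and `reference` is large; after matching it drops to zero on this toy data, which is the kind of signal an iterative scheme could monitor to decide when to stop.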

Improving Bagging Predictors

  • Kim, Hyun-Joong;Chung, Dong-Jun
    • Proceedings of the Korean Statistical Society Conference / Proceedings of the Korean Statistical Society 2005 Autumn Conference / pp. 141-146 / 2005
  • Ensemble methods are known as one of the most powerful classification tools for improving prediction accuracy. They can also be understood as a ‘perturb and combine’ strategy. Many studies have tried to develop ensemble methods by improving perturbation. In this paper, we propose two new ensemble methods that improve combining, based on the idea of pattern matching. In experiments with simulated data and with real datasets, the proposed ensemble methods performed better than bagging. The proposed ensemble methods give the most accurate prediction when a pruned tree is used as the base learner.
