• Title/Abstract/Keywords: Statistical matching method

131 search results (processing time: 0.019 s)

Statistical micro matching using a multinomial logistic regression model for categorical data

  • Kim, Kangmin;Park, Mingue
    • Communications for Statistical Applications and Methods / Vol. 26, No. 5 / pp. 507-517 / 2019
  • Statistical matching is a method of combining multiple sources of data that are extracted or surveyed from the same population. It can be used in situations where the variables of interest are not jointly observed. Because it creates synthetic data from existing sources, it is a low-cost approach with potentially high returns. In this paper, we propose several statistical micro matching methods using a multinomial logistic regression model for the case in which all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), a mixed statistical matching method, which is useful when auxiliary information is not available, is proposed. We also propose a statistical matching method with auxiliary information that reduces the bias of the conventional matching methods suggested under the CIA. Through a simulation study, the proposed micro matching methods and conventional ones are compared. The simulation study shows that the suggested matching methods outperform the existing ones, especially when the CIA does not hold.
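
For reference, the conditional independence assumption named in this abstract is commonly written as below. The symbols are notation assumed here, not taken from the abstract: X stands for the variables common to both files, and Y and Z for the variables observed in only one file each.

```latex
f(x, y, z) = f(y \mid x)\, f(z \mid x)\, f(x)
```

In words, once X is known, Y carries no further information about Z, which is why the joint distribution can be assembled from two files that never observe Y and Z together.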

Statistical Matching Techniques Using the Robust Regression Model

  • 전명식;정시송;박혜진
    • The Korean Journal of Applied Statistics / Vol. 21, No. 6 / pp. 981-996 / 2008
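  • Statistical matching, which combines data files obtained from different sources into a single data file, makes it possible to examine the relationships among variables, including the common variables and the variables unique to each file. The statistical matching method proposed by Rubin (1986), which uses predicted values from an ordinary regression model, assumes multivariate normality of the data, so applying it to data that violate this assumption causes many problems. For the case in which the unique variables of the donor file contain outliers that do not reflect the population distribution, this study proposes a matching method based on robust regression estimation as an alternative to the matching method based on the ordinary regression model. In addition, a simulation study was carried out to compare the robust-regression-based and ordinary-regression-based matching methods in terms of how well they preserve correlations and coefficients of determination.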

A Robust Approach of Regression-Based Statistical Matching for Continuous Data

  • Sohn, Soon-Cheol;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / Vol. 25, No. 2 / pp. 331-339 / 2012
  • Statistical matching is a methodology used to merge microdata from two (or more) files into a single matched file, the variants of which have been extensively studied. Among existing studies, we focused on Moriarity and Scheuren's (2001) method, which is a representative method of statistical matching for continuous data. We examined this method and proposed a revision to it by using a robust approach in the regression step of the procedure. We evaluated the efficiency of our revised method through simulation studies using both simulated and real data, which showed that the proposed method has distinct advantages over existing alternatives.

A New Statistical Linearization Technique of Nonlinear System

  • 이장규;이연석
    • The Korean Institute of Electrical Engineers: Conference Proceedings / Proceedings of the KIEE 1990 Summer Conference / pp. 72-76 / 1990
  • A new statistical linearization technique for nonlinear systems, called the covariance matching method, is proposed in this paper. The covariance matching method makes the mean and variance of the approximated output identical to those of the real functional output, and gives the distribution of the approximated output the same shape as that of a given random input. The covariance matching method can also be easily implemented for statistical analysis of nonlinear systems in combination with linear system covariance analysis.

A Statistical Matching Method with k-NN and Regression

  • Chung, Sung-S.;Kim, Soon-Y.;Lee, Seung-S.;Lee, Ki-H.
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 4 / pp. 879-890 / 2007
  • Statistical matching is a method of data integration for data sources that do not share the same units. It can rapidly produce a large amount of new information at low cost and reduce the response burden, which affects data quality. This paper proposes a statistical matching technique combining k-NN (k-nearest neighbor) and regression methods. We select k records in a donor file whose values of the common variable are similar to those of a specific observation in a recipient file, and estimate an imputation value for the recipient file using regression modeling in the donor file. An empirical comparison study is conducted to show the properties of the proposed method.
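
The procedure outlined in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a single numeric common variable, absolute difference as the similarity measure, and a simple least-squares line fitted on the k selected donors. The function name `knn_regression_match` and all parameter names are hypothetical.

```python
def knn_regression_match(x_donor, y_donor, x_recipient, k=5):
    """Impute y for each recipient record from its k nearest donor records.

    Assumptions (not from the abstract): one numeric common variable x,
    absolute difference as the distance, and a straight-line fit.
    """
    imputed = []
    for x0 in x_recipient:
        # Indices of the k donors whose common variable is closest to x0.
        nearest = sorted(range(len(x_donor)),
                         key=lambda i: abs(x_donor[i] - x0))[:k]
        xs = [x_donor[i] for i in nearest]
        ys = [y_donor[i] for i in nearest]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0:
            # All neighbours share one x value: fall back to their mean y.
            imputed.append(mean_y)
        else:
            # Least-squares slope on the k neighbours, evaluated at x0.
            sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            imputed.append(mean_y + (sxy / sxx) * (x0 - mean_x))
    return imputed


# Toy donor file in which y is an exact linear function of x.
x_donor = [float(i) for i in range(10)]
y_donor = [2.0 * x + 1.0 for x in x_donor]
y_hat = knn_regression_match(x_donor, y_donor, [2.5, 7.0], k=4)
```

Fitting the regression only on the k neighbours keeps the imputation local, so a relationship that is nonlinear overall can still be approximated piecewise.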

Adaptive Matching Method of Rigid and Deformable Object Image using Statistical Analysis of Matching-pairs

  • 원인수;양훈준;장혁;정동석
    • Journal of the Institute of Electronics and Information Engineers / Vol. 52, No. 1 / pp. 102-110 / 2015
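  • This paper proposes an adaptive matching method that can match images of both rigid and deformable objects using the same features. To this end, geometric verification is first used to decide whether two images match and to generate matching information. A decision boundary that separates deformable matching pairs from non-matching pairs is then obtained through statistical analysis of the matching information. In the performance evaluation, the proposed method showed lower complexity and a higher matching success rate and accuracy than existing methods.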

A Study on a Statistical Matching Method Using Clustering for Data Enrichment

  • Kim Soon Y.;Lee Ki H.;Chung Sung S.
    • Communications for Statistical Applications and Methods / Vol. 12, No. 2 / pp. 509-520 / 2005
  • Data fusion is defined as the process of combining data and information from different sources to make more effective use of their information content. In this paper, we propose a data fusion algorithm using the k-means clustering method for data enrichment, to improve data quality in the knowledge discovery in databases (KDD) process. An empirical study was conducted to compare the proposed data fusion technique with existing techniques; it shows that the proposed clustering-based data fusion technique has a low MSE for continuous fusion variables.

Statistical Fingerprint Recognition Matching Method with an Optimal Threshold and Confidence Interval

  • Hong, C.S.;Kim, C.H.
    • The Korean Journal of Applied Statistics / Vol. 25, No. 6 / pp. 1027-1036 / 2012
  • Among various biometric recognition systems, we consider statistical fingerprint matching methods that use fingerprint minutiae. We define similarity distance measures based on the coordinates and angles of the minutiae, and suggest a fingerprint recognition model following statistical distributions. We obtain confidence intervals of the similarity distance for the same person and for different persons, and optimal thresholds that minimize two kinds of error rates for the distance distributions. The two confidence intervals for the same and different persons are found not to overlap, and the optimal threshold lies between the two confidence intervals. Hence, an alternative statistical matching method can be suggested by using the non-overlapping confidence intervals and optimal thresholds obtained from the distributions of similarity distances.
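
The idea of choosing a threshold that balances the two error rates can be illustrated with a small empirical sketch. It is not the model from the paper: it assumes that a smaller distance means "same person", that the criterion is the plain sum of the false rejection and false acceptance rates, and that candidate thresholds are taken from the observed distances. The function name `optimal_threshold` is hypothetical.

```python
def optimal_threshold(same_distances, diff_distances):
    """Return (threshold, error) minimizing false reject + false accept rate.

    Assumption: a pair is accepted as "same person" when distance <= threshold.
    """
    candidates = sorted(set(same_distances) | set(diff_distances))
    best_t, best_err = None, float("inf")
    for t in candidates:
        # Same-person pairs wrongly rejected at this threshold.
        false_reject = sum(d > t for d in same_distances) / len(same_distances)
        # Different-person pairs wrongly accepted at this threshold.
        false_accept = sum(d <= t for d in diff_distances) / len(diff_distances)
        total = false_reject + false_accept
        if total < best_err:
            best_t, best_err = t, total
    return best_t, best_err


# Toy distances: the two groups are fully separated.
threshold, error = optimal_threshold([1.0, 2.0, 3.0], [5.0, 6.0, 7.0])
```

When the two groups of distances do not overlap, as in the toy data, any threshold between them gives zero total error; this sketch returns the smallest such observed distance.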

Local-Based Iterative Histogram Matching for Relative Radiometric Normalization

  • Seo, Dae Kyo;Eo, Yang Dam
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / Vol. 37, No. 5 / pp. 323-330 / 2019
  • Radiometric normalization of multi-temporal satellite images is essential for time series analysis and change detection. Generally, relative radiometric normalization, which is an image-based method, is performed, and histogram matching is a representative method for normalizing non-linear properties. However, since it uses only global statistical information, local information is not considered at all. Thus, this paper proposes a histogram matching method that considers local information. The proposed method divides histograms based on the density, mean, and standard deviation of image intensities, and performs histogram matching locally on each sub-histogram. The matched histogram is then further partitioned and this process is repeated iteratively, controlled by the Wasserstein distance. Finally, the proposed method is compared to global histogram matching. The experimental results show that the proposed method is visually and quantitatively superior to the conventional method, which indicates the applicability of the proposed method to the radiometric normalization of multi-temporal images with non-linear properties.
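
As background for this abstract, conventional global histogram matching (the baseline the paper compares against, not the proposed local iterative method) can be sketched as below. The sketch assumes images are given as flat lists of intensities and uses empirical CDFs; the function name `histogram_match` is hypothetical.

```python
from bisect import bisect_right


def histogram_match(source, reference):
    """Map source intensities so their distribution follows the reference.

    Each source value is replaced by the reference value at the same
    empirical CDF level (global matching, no local information).
    """
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    matched = []
    for value in source:
        # Number of source values <= value, i.e. n_src times the empirical CDF.
        rank = bisect_right(src_sorted, value)
        # Integer form of ceil(rank * n_ref / n_src) - 1: index of the smallest
        # reference value whose empirical CDF reaches the same level.
        idx = (rank * n_ref - 1) // n_src
        matched.append(ref_sorted[idx])
    return matched


# Toy "images": the source is remapped onto the reference intensity range.
matched = histogram_match([30, 10, 20, 10], [1, 2, 3, 4])
```

Because the mapping depends only on the overall rank of each intensity, two pixels with the same value always receive the same output, which is the limitation the abstract attributes to the global approach.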

Improving Bagging Predictors

  • Kim, Hyun-Joong;Chung, Dong-Jun
    • The Korean Statistical Society: Conference Proceedings / Proceedings of the Korean Statistical Society 2005 Autumn Conference / pp. 141-146 / 2005
  • Ensemble methods are known as one of the most powerful classification tools for improving prediction accuracy. They are also understood as a ‘perturb and combine’ strategy. Many studies have tried to develop ensemble methods by improving the perturbation step. In this paper, we propose two new ensemble methods that improve the combining step, based on the idea of pattern matching. In experiments with simulated data and real datasets, the proposed ensemble methods performed better than bagging. The proposed ensemble methods gave the most accurate predictions when a pruned tree was used as the base learner.
