• Title/Abstract/Keywords: statistical matching

Search results: 267 items, processing time: 0.02 seconds

Statistical micro matching using a multinomial logistic regression model for categorical data

  • Kim, Kangmin;Park, Mingue
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 26, No. 5
    • /
    • pp.507-517
    • /
    • 2019
  • Statistical matching is a method of combining multiple sources of data that are extracted or surveyed from the same population. It can be used in situations where the variables of interest are not jointly observed. It is a low-cost, high-impact approach because it creates synthetic data from existing sources. In this paper, we propose several statistical micro matching methods using a multinomial logistic regression model for the case in which all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), we propose a mixed statistical matching method that is useful when auxiliary information is not available. We also propose a statistical matching method with auxiliary information that reduces the bias of the conventional matching methods suggested under the CIA. A simulation study compares the proposed micro matching methods with conventional ones and shows that the proposed methods outperform the existing ones, especially when the CIA does not hold.
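
For reference, the conditional independence assumption named in this abstract is commonly written as below. The symbols are notation assumed here, not taken from the abstract: X stands for the common variables observed in both files, while Y and Z stand for the variables observed in only one file each.

```latex
f(x, y, z) = f(y \mid x)\, f(z \mid x)\, f(x)
```

Under this factorization, Y and Z are independent given X, so their joint behavior can be estimated even though no file observes Y and Z together.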

On the Development of Probability Matching Priors for Non-regular Pareto Distribution

  • Lee, Woo Dong;Kang, Sang Gil;Cho, Jang Sik
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 2
    • /
    • pp.333-339
    • /
    • 2003
  • In this paper, we develop probability matching priors for the parameters of the non-regular Pareto distribution. We prove the propriety of the joint posterior distribution induced by the probability matching priors. Through a simulation study, we show that the proposed probability matching prior matches the coverage probabilities in a frequentist sense. A real data example is given.

로버스트 회귀모형을 이용한 자료결합방법 (Statistical Matching Techniques Using the Robust Regression Model)

  • 전명식;정시송;박혜진
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 21, No. 6
    • /
    • pp.981-996
    • /
    • 2008
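  • Statistical matching, which merges data files obtained from different sources into a single data file, makes it possible to examine the relationships among variables, including the common variables and the variables unique to each file. The statistical matching method proposed by Rubin (1986), which uses predicted values from an ordinary regression model, assumes multivariate normality of the data, so applying it to data that violate this assumption causes many problems. For the case in which the unique variable of the donor file contains outliers that do not reflect the population distribution, this study proposes a matching method based on robust regression estimation as an alternative to the statistical matching method based on the ordinary regression model. A simulation study was then conducted to compare the robust-regression-based and ordinary-regression-based matching methods in terms of how well they preserve correlations and the coefficient of determination.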
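
The idea of replacing ordinary regression with a robust fit when the donor file contains outliers can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say which robust estimator is used, so Huber M-estimation fitted by iteratively reweighted least squares is assumed here, with a single common variable. The names `huber_fit` and `impute_from_donor` are hypothetical.

```python
import numpy as np

def huber_fit(x, y, delta=1.345, n_iter=50, tol=1e-8):
    """Fit y = a + b*x by Huber M-estimation (assumed estimator) using IRLS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(x)), x])
    # Start from the ordinary least-squares fit.
    beta = np.linalg.lstsq(design, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - design @ beta
        # Robust residual scale: normalized median absolute deviation.
        mad = np.median(np.abs(resid - np.median(resid)))
        scale = max(mad / 0.6745, 1e-12)
        # Huber weights: 1 for small residuals, delta/|u| for large ones.
        u = np.abs(resid) / scale
        weights = delta / np.maximum(u, delta)
        root_w = np.sqrt(weights)
        new_beta = np.linalg.lstsq(
            design * root_w[:, None], y * root_w, rcond=None
        )[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta  # array([intercept, slope])

def impute_from_donor(donor_x, donor_y, recipient_x):
    """Predict the donor-only variable for recipient records from the robust fit."""
    intercept, slope = huber_fit(donor_x, donor_y)
    return intercept + slope * np.asarray(recipient_x, dtype=float)
```

Because large residuals are down-weighted, a few gross outliers in the donor file pull the fitted line far less than they would under ordinary least squares.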

Image Description and Matching Scheme Using Synthetic Features for Recommendation Service

  • Yang, Won-Keun;Cho, A-Young;Oh, Weon-Geun;Jeong, Dong-Seok
    • ETRI Journal
    • /
    • Vol. 33, No. 4
    • /
    • pp.589-599
    • /
    • 2011
  • This paper presents an image description and matching scheme using synthetic features for a recommendation service. The recommendation service is an example of smart search because it offers something before the user requests it. In the proposed extraction scheme, an image is described by synthesized spatial and statistical features. The spatial feature is designed to increase discriminability by reflecting delicate variations. The statistical feature is designed to increase robustness by absorbing small variations. To extract spatial features, we partition the image into concentric circles and extract four characteristics using a spatial relation. To extract statistical features, we apply three transforms to the image and compose a 3D histogram as the final statistical feature. The matching schemes are designed hierarchically using the proposed spatial and statistical features. The results show that each feature outperforms the compared algorithms that use spatial or statistical features. Additionally, when the complete proposed extraction and matching scheme is applied, the overall performance reaches 98.44% in terms of the correct search ratio.

Association Rule Mining by Environmental Data Fusion

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 18, No. 2
    • /
    • pp.279-287
    • /
    • 2007
  • Data fusion is the process of combining multiple data in order to produce information of tactical value to the user. Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather that information in order to achieve inferences. Data fusion is also called data combination or data matching. Data fusion is divided into five branch types: exact matching, judgemental matching, probability matching, statistical matching, and data linking. In this paper, we develop a macro program for statistical matching, which is one of the five branch types of data fusion. We then apply data fusion and association rule techniques to environmental data.

A Robust Approach of Regression-Based Statistical Matching for Continuous Data

  • Sohn, Soon-Cheol;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 25, No. 2
    • /
    • pp.331-339
    • /
    • 2012
  • Statistical matching is a methodology used to merge microdata from two (or more) files into a single matched file, the variants of which have been extensively studied. Among existing studies, we focused on Moriarity and Scheuren's (2001) method, which is a representative method of statistical matching for continuous data. We examined this method and proposed a revision to it by using a robust approach in the regression step of the procedure. We evaluated the efficiency of our revised method through simulation studies using both simulated and real data, which showed that the proposed method has distinct advantages over existing alternatives.

비선형시스템의 새로운 통계적 선형화방법 (A New Statistical Linearization Technique of Nonlinear System)

  • 이장규;이연석
    • The Korean Institute of Electrical Engineers (대한전기학회): Conference Proceedings
    • /
    • Proceedings of the 1990 Summer Conference of the Korean Institute of Electrical Engineers
    • /
    • pp.72-76
    • /
    • 1990
  • This paper proposes a new statistical linearization technique for nonlinear systems, called the covariance matching method. The covariance matching method makes the mean and variance of the approximated output identical to those of the actual function output, and gives the distribution of the approximated output the same shape as that of the given random input. The covariance matching method can also be easily implemented for the statistical analysis of nonlinear systems in combination with linear system covariance analysis.

NONINFORMATIVE PRIORS FOR LINEAR COMBINATION OF THE INDEPENDENT NORMAL MEANS

  • Kang, Sang-Gil;Kim, Dal-Ho;Lee, Woo-Dong
    • Journal of the Korean Statistical Society
    • /
    • Vol. 33, No. 2
    • /
    • pp.203-218
    • /
    • 2004
  • In this paper, we develop matching priors and reference priors for a linear combination of the means of normal populations with equal variances. We prove that the matching priors are in fact second order matching priors, and show that the second order matching priors match the alternative coverage probabilities up to the second order (Mukerjee and Reid, 1999) and are also HPD matching priors. It turns out that, among all of the reference priors, the one-at-a-time reference prior satisfies a second order matching criterion. Our simulation study indicates that the one-at-a-time reference prior performs better than the other reference priors in terms of matching the target coverage probabilities in a frequentist sense. We compute Bayesian credible intervals for the linear combination of the means based on the reference priors.

A Statistical Matching Method with k-NN and Regression

  • Chung, Sung-S.;Kim, Soon-Y.;Lee, Seung-S.;Lee, Ki-H.
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 18, No. 4
    • /
    • pp.879-890
    • /
    • 2007
  • Statistical matching is a method of data integration for data sources that do not share the same units. It can rapidly produce a large amount of new information at low cost and reduce the response burden, which affects data quality. This paper proposes a statistical matching technique that combines k-NN (k-nearest neighbor) and regression methods. For a given observation in the recipient file, we select the k records in the donor file whose values of the common variable are most similar, and then estimate an imputation value for the recipient file using a regression model fitted in the donor file. An empirical comparison study is conducted to show the properties of the proposed method.
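
The two steps described in this abstract (pick the k nearest donor records on the common variable, then impute from a regression fitted on the donor side) can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes a single numeric common variable, absolute difference as the similarity measure, and a simple linear regression fitted on the k selected donors only. The function name `knn_regression_match` and its parameters are hypothetical.

```python
import numpy as np

def knn_regression_match(donor_x, donor_y, recipient_x, k=5):
    """Impute y for each recipient from a regression on its k nearest donors."""
    donor_x = np.asarray(donor_x, dtype=float)
    donor_y = np.asarray(donor_y, dtype=float)
    recipient_x = np.asarray(recipient_x, dtype=float)
    imputed = np.empty(len(recipient_x))
    for i, x0 in enumerate(recipient_x):
        # Step 1: the k donor records closest to x0 on the common variable.
        nearest = np.argsort(np.abs(donor_x - x0))[:k]
        # Step 2: least-squares fit of y = a + b*x on those donors (assumed model).
        design = np.column_stack([np.ones(len(nearest)), donor_x[nearest]])
        coef = np.linalg.lstsq(design, donor_y[nearest], rcond=None)[0]
        # Imputed value is the fitted regression evaluated at the recipient's x.
        imputed[i] = coef[0] + coef[1] * x0
    return imputed
```

Fitting the regression locally keeps the imputation close to nearby donors while smoothing over individual donor noise. Other readings of the abstract, such as fitting the regression on the whole donor file, are equally plausible.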

Development of Noninformative Priors in the Burr Model

  • Cho, Jang-Sik;Kang, Sang-Gil;Baek, Sung-Uk
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 14, No. 1
    • /
    • pp.83-92
    • /
    • 2003
  • In this paper, we derive noninformative priors for the ratio of parameters in the Burr model. We obtain Jeffreys' prior, the reference prior, and the second order probability matching prior. We also prove that the noninformative prior matches the alternative coverage probabilities and is an HPD matching prior up to the second order. Finally, we provide simulated frequentist coverage probabilities under the derived noninformative priors for small and moderate sample sizes.
