• Title/Summary/Keyword: multivariate normal distribution (다변량 정규분포)

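Robustness Comparison of Several Tests for the Interaction in Mixed Designs and Their Modifications Using Trimmed Means (혼합설계의 교호작용에 대한 여러 검정법들과 절사평균을 이용하여 변형한 검정법들의 강인성 비교)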

  • 김현철
    • Communications for Statistical Applications and Methods / v.5 no.3 / pp.633-644 / 1998
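  • For the F test of the interaction in a mixed design to be valid, the multisample sphericity assumption and the multivariate normality assumption must both hold. This study compared the Type I error rates of tests for the interaction in a mixed design under conditions in which the assumptions of the F test were violated. The tests compared were (1) the F test ($F$), (2) the F test using trimmed means ($F_T$), (3) the $\varepsilon$-adjusted F test ($\varepsilon$), (4) the $\varepsilon$-adjusted F test using trimmed means ($\varepsilon_T$), (5) the CIGA test ($CIGA$), and (6) the CIGA test using trimmed means ($CIGA_T$). The results showed that $CIGA$ and $CIGA_T$ generally controlled the Type I error rate well, whereas the F tests and the $\varepsilon$ tests had very small or very large Type I error rates under some conditions.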
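
The trimmed mean that the modified tests rely on can be illustrated with a minimal sketch. The function name `trimmed_mean`, the 20% trimming proportion, and the sample values are assumptions made here for illustration; the abstract does not state which proportion the study used.

```python
def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the sorted values."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(proportion * len(ordered))  # number of observations trimmed from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


sample = [1, 2, 3, 4, 100]          # hypothetical data with one outlier
plain = sum(sample) / len(sample)   # ordinary mean, pulled upward by the outlier
robust = trimmed_mean(sample, 0.2)  # drops 1 and 100 before averaging
```

Because the extreme observations are discarded before averaging, a single outlier has much less influence on `robust` than on `plain`.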

Estimation of Spatial Distribution Using the Gaussian Mixture Model with Multivariate Geoscience Data (다변량 지구과학 데이터와 가우시안 혼합 모델을 이용한 공간 분포 추정)

  • Kim, Ho-Rim;Yu, Soonyoung;Yun, Seong-Taek;Kim, Kyoung-Ho;Lee, Goon-Taek;Lee, Jeong-Ho;Heo, Chul-Ho;Ryu, Dong-Woo
    • Economic and Environmental Geology / v.55 no.4 / pp.353-366 / 2022
  • Spatial estimation of geoscience data (geo-data) is challenging due to spatial heterogeneity, data scarcity, and high dimensionality. A novel spatial estimation method is needed that accounts for these characteristics of geo-data. In this study, we propose applying the Gaussian Mixture Model (GMM), a machine learning algorithm, to multivariate data for robust spatial prediction. The performance of the proposed approach was tested on soil chemical concentration data from a former smelting area. The concentrations of As and Pb determined by ex-situ ICP-AES were the primary variables to be interpolated, while the other metal concentrations by ICP-AES and all data determined by in-situ portable X-ray fluorescence (PXRF) were used as auxiliary variables in GMM and ordinary cokriging (OCK). Among the multidimensional auxiliary variables, important variables were selected using a variable selection method based on the random forest. GMM with the important multivariate auxiliary data decreased the root mean-squared error (RMSE) down to 0.11 for As and 0.33 for Pb and increased the correlations (r) up to 0.31 for As and 0.46 for Pb compared to those from ordinary kriging and OCK using univariate or bivariate data. The use of GMM improved the spatial interpretation of anthropogenic metals in soil. The multivariate spatial approach can be applied to understand complex and heterogeneous geological and geochemical features.
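
One common way to predict a primary variable from an auxiliary variable with a Gaussian mixture is the conditional mean of the mixture (sometimes called Gaussian mixture regression). The sketch below shows that idea for a single auxiliary variable. It is not confirmed to be the procedure used in the paper, and the function names and all parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmm_conditional_mean(x, components):
    """E[y | x] under a bivariate Gaussian mixture.

    Each component is a dict with keys: weight, mean_x, mean_y, var_x, cov_xy.
    """
    # Unnormalized responsibility of each component for the observed auxiliary value x.
    resp = [c["weight"] * normal_pdf(x, c["mean_x"], c["var_x"]) for c in components]
    total = sum(resp)
    # Conditional mean inside each component (linear regression of y on x).
    cond = [c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"]) for c in components]
    return sum(r * m for r, m in zip(resp, cond)) / total


# Hypothetical hand-picked parameters; in practice they would be fitted, e.g. by EM.
components = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": -1.0, "var_x": 1.0, "cov_xy": 0.5},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 1.0, "var_x": 1.0, "cov_xy": 0.5},
]
prediction_at_zero = gmm_conditional_mean(0.0, components)
```

With these symmetric components, the prediction at `x = 0` averages the two component regressions equally. Away from zero, the nearer component dominates the prediction.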

Geostatistical Integration of Ground Survey Data and Secondary Data for Geological Thematic Mapping (지질 주제도 작성을 위한 지표 조사 자료와 부가 자료의 지구통계학적 통합)

  • Park, No-Wook;Jang, Dong-Ho;Chi, Kwang-Hoon
    • Korean Journal of Remote Sensing / v.22 no.6 / pp.581-593 / 2006
  • Various geological thematic maps have been generated by interpolating sparsely sampled ground survey data, and geostatistical kriging, which can account for spatial correlation between neighboring data, has been widely used for this purpose. This paper applies multivariate geostatistical algorithms to integrate secondary information with sparsely sampled ground survey data for geological thematic mapping. Of the several multivariate geostatistical algorithms available, simple kriging with local means and kriging with an external drift are applied. Two case studies, spatial mapping of groundwater level and of grain size, were carried out to illustrate the effectiveness of the multivariate geostatistical algorithms. A digital elevation model and IKONOS remote sensing imagery were used as secondary information in the two case studies. The two multivariate geostatistical algorithms, which can account for both the spatial correlation of neighboring data and the secondary data, showed smaller prediction errors and more local variation than ordinary kriging and linear regression. The benefit of applying the multivariate geostatistical algorithms, however, depends on sampling density, the magnitude of the correlation between primary and secondary data, and the spatial correlation of the primary data. As a result, in the grain-size mapping experiment, where the effects of those factors were dominant, the benefit of using the secondary data was smaller than in the groundwater-level mapping experiment.

Saddlepoint Approximations to the Distribution Function of Non-homogeneous Quadratic Forms (비동차 이차형식의 분포함수에 대한 안장점근사)

  • Na Jong-Hwa;Kim Jeong-Soak
    • The Korean Journal of Applied Statistics / v.18 no.1 / pp.183-196 / 2005
  • In this paper, we study saddlepoint approximations to the distribution of non-homogeneous quadratic forms in normal variables. The results extend those of Kuonen, which provide the same approximations for homogeneous quadratic forms. The cumulant generating function (CGF) of the statistic of interest and its related properties are derived in order to apply saddlepoint techniques. Simulation results are also provided to show the accuracy of the saddlepoint approximations.

Estimated Soft Information based Most Probable Classification Scheme for Sorting Metal Scraps with Laser-induced Breakdown Spectroscopy (레이저유도 플라즈마 분광법을 이용한 폐금속 분류를 위한 추정 연성정보 기반의 최빈 분류 기술)

  • Kim, Eden;Jang, Hyemin;Shin, Sungho;Jeong, Sungho;Hwang, Euiseok
    • Resources Recycling / v.27 no.1 / pp.84-91 / 2018
  • In this study, a novel soft-information-based most probable classification scheme is proposed for sorting recyclable metal alloys with laser-induced breakdown spectroscopy (LIBS). Regression analysis of LIBS spectra to estimate the concentrations of common elements can be efficient for classifying unknown, arbitrary metal alloys, even when a particular alloy is not included in training. Therefore, partial least squares regression (PLSR) is employed in the proposed scheme, with spectra of certified reference materials (CRMs) used for training. With the PLSR model, the element concentrations of a test spectrum are estimated independently and compared to those of the CRMs to find the most probable class. Joint soft information can then be obtained by assuming a multivariate normal (MVN) distribution, which makes it possible to account for the probability measure or a priori information and improves classification performance. To evaluate the proposed scheme, MVN soft information is computed from PLSR of LIBS spectra of 9 metal CRMs and tested for classifying unknown metal alloys. Furthermore, the likelihood is displayed on a radar chart to effectively visualize and search for the most probable class among the candidates. In leave-one-out cross-validation tests, the proposed scheme not only shows improved classification accuracy but also helps adaptive post-processing correct misclassifications.
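
The idea of combining an MVN likelihood with prior information to pick the most probable class can be sketched as follows. This is only an illustration under stated assumptions: it uses two estimated concentrations per sample, and the class labels (`alloy_A`, `alloy_B`), means, covariances, and priors are all hypothetical rather than taken from the paper.

```python
import math


def mvn2_logpdf(x, mean, cov):
    """Log-density of a 2-D normal distribution with covariance [[a, b], [b, d]]."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) via the closed-form 2x2 inverse.
    quad = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return -0.5 * quad - math.log(2 * math.pi) - 0.5 * math.log(det)


def most_probable_class(x, classes):
    """Return the label with the highest posterior, plus all posteriors."""
    scores = {
        label: math.log(c["prior"]) + mvn2_logpdf(x, c["mean"], c["cov"])
        for label, c in classes.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    posteriors = {label: w / total for label, w in weights.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Hypothetical classes described by two estimated element concentrations (in %).
classes = {
    "alloy_A": {"prior": 0.5, "mean": (70.0, 30.0), "cov": [[4.0, -1.0], [-1.0, 4.0]]},
    "alloy_B": {"prior": 0.5, "mean": (90.0, 10.0), "cov": [[4.0, -1.0], [-1.0, 4.0]]},
}
label, posteriors = most_probable_class((72.0, 29.0), classes)
```

The posteriors act as soft information: rather than a hard decision alone, each candidate class receives a probability that later processing can reuse.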

Statistical Estimation for Hazard Function and Process Capability Index under Bivariate Exponential Process (이변량 지수 공정 하에서 위험함수와 공정능력지수에 대한 통계적 추정)

  • Cho, Joong-Jae;Kang, Su-Mook;Park, Byoung-Sun
    • Communications for Statistical Applications and Methods / v.16 no.3 / pp.449-461 / 2009
  • A higher sigma quality level is generally perceived by customers as improved performance, and they assign it a correspondingly higher satisfaction score. The process capability indices and the sigma level $Z_{st}$ have been widely used in six sigma industries to assess process performance. Most evaluations of process capability indices focus on statistical estimation under a normal process, which may result in unreliable assessments of process performance. In this paper, we consider statistical estimation for the bivariate VPCI (Vector-valued Process Capability Index) $C_{pkl}=(C_{pklx},\;C_{pkly})$ under the bivariate exponential process of Marshall and Olkin (1967). First, we derive a limiting distribution for statistical inference on the bivariate VPCI $C_{pkl}$. We then propose two asymptotic normal confidence regions for the bivariate VPCI $C_{pkl}$. The proposed method may be very useful under a bivariate exponential process. A numerical result suggests that the proposed method is more reliable.

Analysis of the Levy Mutation Operations in the Evolutionary Programming using Mean Square Displacement and Distinctness (평균변화율 및 유일성을 통한 진화 프로그래밍에서 레비 돌연변이 연산 분석)

  • Lee, Chang-Yong
    • Journal of KIISE:Software and Applications / v.28 no.11 / pp.833-841 / 2001
  • In this work, we analyze the Levy mutation operations, which are based on the Levy probability distribution, in evolutionary programming via the mean square displacement and the distinctness. The Levy probability distribution is characterized by an infinite second moment and has been widely studied in conjunction with fractals. The Levy mutation operators not only generate offspring with small variations but are also more likely than the conventional mutation operators to generate offspring with large variations. Based on this fact, we prove mathematically, via the mean square displacement and the distinctness, that the Levy mutation operations can explore and exploit a search space more effectively. As a result, one can get better performance with the Levy mutation than with the conventional Gaussian mutation for multi-valued functional optimization problems.
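
The heavier tail of a Levy-type mutation can be illustrated with a small simulation. This sketch substitutes the Cauchy distribution, the special case of the symmetric Levy stable family with index 1, for the general Levy distribution discussed in the abstract. The function names, the seed, and the threshold of 3 scale units are assumptions chosen for illustration.

```python
import math
import random


def gaussian_mutation(x, rng, scale=1.0):
    """Conventional mutation: add a standard normal step."""
    return x + scale * rng.gauss(0.0, 1.0)


def cauchy_mutation(x, rng, scale=1.0):
    """Heavy-tailed mutation: add a standard Cauchy step (inverse-CDF sampling)."""
    return x + scale * math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(0)
n = 10_000
gauss_steps = [gaussian_mutation(0.0, rng) for _ in range(n)]
cauchy_steps = [cauchy_mutation(0.0, rng) for _ in range(n)]

# Fraction of offspring that land more than 3 scale units away from the parent.
gauss_far = sum(abs(s) > 3 for s in gauss_steps) / n
cauchy_far = sum(abs(s) > 3 for s in cauchy_steps) / n
```

In theory the Gaussian fraction is about 0.27% and the Cauchy fraction about 20%, so the heavy-tailed operator makes long jumps far more often while still producing many small steps.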

A mixed model for repeated split-plot data (반복측정의 분할구 자료에 대한 혼합모형)

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.21 no.1 / pp.1-9 / 2010
  • This paper suggests a mixed-effects model for analyzing split-plot data when a repeated measures factor affects the response variable. Because a repeated measures factor is assumed to be one of the explanatory variables, covariance structures among the observations are discussed. A compound symmetric covariance structure is assumed as a plausible covariance structure for analyzing the data. The restricted maximum likelihood (REML) method is used for estimating the fixed effects in the model.
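
A compound symmetric covariance structure gives every repeated measurement the same variance and every pair of measurements the same correlation. The sketch below builds such a matrix. The function name and the numeric values are hypothetical and are not taken from the paper.

```python
def compound_symmetric(n, variance, rho):
    """n x n covariance matrix with a common variance and a common correlation rho."""
    return [
        [variance if i == j else rho * variance for j in range(n)]
        for i in range(n)
    ]


# Three repeated measurements with variance 2.0 and pairwise correlation 0.5.
cov = compound_symmetric(3, variance=2.0, rho=0.5)
```

For the matrix to be a valid (positive definite) covariance, `rho` must lie strictly between `-1 / (n - 1)` and `1`.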

A Study on the Training Optimization Using Genetic Algorithm -In case of Statistical Classification considering Normal Distribution- (유전자 알고리즘을 이용한 트레이닝 최적화 기법 연구 - 정규분포를 고려한 통계적 영상분류의 경우 -)

  • 어양담;조봉환;이용웅;김용일
    • Korean Journal of Remote Sensing / v.15 no.3 / pp.195-208 / 1999
  • In the classification of satellite images, how well the training data represent each class is a very important factor affecting classification accuracy. Hence, to improve classification accuracy, it is necessary to optimize the pre-classification stage, which determines the classification parameters, rather than to develop classifiers alone. In this study, the normality of the training data is evaluated at the pre-classification stage using SPOT XS and LANDSAT TM imagery. The correlation coefficient of the multivariate Q-Q plot at the 5% significance level and the variance of the initial training data are used as the objective function of a genetic algorithm in the training normalization process. As a result of normalizing the training data with the genetic algorithm, it was shown that, for the study area, the mean and variance of each class shifted toward those of the population, and the result showed the possibility of predicting the distribution of each class.

A numerical study on portfolio VaR forecasting based on conditional copula (조건부 코퓰라를 이용한 포트폴리오 위험 예측에 대한 실증 분석)

  • Kim, Eun-Young;Lee, Tae-Wook
    • Journal of the Korean Data and Information Science Society / v.22 no.6 / pp.1065-1074 / 2011
  • Over the past several decades, many researchers in the field of finance have studied Value at Risk (VaR) as a measure of market risk. VaR indicates the worst loss over a target horizon such that there is a low, pre-specified probability that the actual loss will be larger (Jorion, 2006, p.106). In this paper, we compare the conditional copula method with two conventional VaR forecasting methods, based on the simple moving average and the exponentially weighted moving average, for measuring the risk of a portfolio consisting of two domestic stock indices. Through real data analysis, we conclude that the conditional copula method can improve the accuracy of portfolio VaR forecasting in the presence of high kurtosis and strong correlation in the data.
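
The two conventional baselines named in the abstract can be sketched as volatility estimates plugged into a normal quantile. This is a minimal illustration, not the paper's procedure: the zero-mean normal assumption, the smoothing constant 0.94 (a commonly used value for daily data), the EWMA initialization, and the return series are all assumptions. The conditional copula method itself is not shown.

```python
import math
from statistics import NormalDist


def sma_volatility(returns, window):
    """Volatility from the last `window` returns, assuming a zero mean."""
    recent = returns[-window:]
    return math.sqrt(sum(r * r for r in recent) / len(recent))


def ewma_volatility(returns, lam=0.94):
    """EWMA volatility: var_t = lam * var_{t-1} + (1 - lam) * r_{t-1}^2."""
    var = returns[0] ** 2  # simple initialization (an assumption)
    for r in returns[1:]:
        var = lam * var + (1 - lam) * r * r
    return math.sqrt(var)


def normal_var(volatility, alpha=0.05):
    """One-step-ahead VaR as a positive loss, assuming zero-mean normal returns."""
    return -NormalDist().inv_cdf(alpha) * volatility


returns = [0.01, -0.01, 0.01, -0.01, 0.03]  # hypothetical daily portfolio returns
var_sma = normal_var(sma_volatility(returns, window=5))
var_ewma = normal_var(ewma_volatility(returns))
```

Both baselines rely on a normal quantile, which tends to understate tail risk when returns show high kurtosis. That is the setting in which the abstract reports an advantage for the copula-based forecast.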