• Title/Summary/Keyword: Empirical distribution plot

Multivariate empirical distribution plot and goodness-of-fit test (다변량 경험분포그림과 적합도 검정)

  • Hong, Chong Sun;Park, Yongho;Park, Jun
    • The Korean Journal of Applied Statistics / v.30 no.4 / pp.579-590 / 2017
  • The multivariate empirical distribution function can be defined whenever the underlying distribution function can be estimated. Bivariate empirical distribution functions are known to be visualizable with the step plot and the quantile plot. This paper proposes the multivariate empirical distribution plot, which represents the multivariate empirical distribution function on the unit square. Empirical distribution plots drawn for various multivariate normal distributions and other specific distributions show that the plot is sensitive to both the distribution function and the correlation coefficients. On this basis, five goodness-of-fit test statistics are suggested, with critical values obtained by Monte Carlo simulation. These critical values are found to differ little from those given in textbooks, so the proposed test statistics can readily be used with the known critical values.
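
As a rough illustration of the object this abstract is built on, the bivariate empirical distribution function at (x, y) is the fraction of sample points whose coordinates are both at or below (x, y). The sketch below is not the authors' plot or any of their five test statistics; the function name `bivariate_ecdf` and the sample points are hypothetical.

```python
from typing import Sequence, Tuple


def bivariate_ecdf(sample: Sequence[Tuple[float, float]], x: float, y: float) -> float:
    """Return the fraction of points (xi, yi) with xi <= x and yi <= y."""
    if not sample:
        raise ValueError("sample must not be empty")
    count = sum(1 for xi, yi in sample if xi <= x and yi <= y)
    return count / len(sample)


# Hypothetical sample on the unit square (illustration only).
points = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.3), (0.8, 0.7)]
value = bivariate_ecdf(points, 0.5, 0.5)  # two of the four points qualify
```

Evaluating this function over a grid of (x, y) values gives the step surface that a step plot visualizes.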

Goodness-of-fit Test for the Weibull Distribution Based on Multiply Type-II Censored Samples

  • Kang, Suk-Bok;Han, Jun-Tae
    • Communications for Statistical Applications and Methods / v.16 no.2 / pp.349-361 / 2009
  • In this paper, we derive approximate maximum likelihood estimators of the shape and scale parameters of the Weibull distribution under multiply Type-II censoring. We develop three modified empirical distribution function (EDF) type tests for the Weibull distribution based on multiply Type-II censored samples. We also propose a modified normalized sample Lorenz curve plot and a new test statistic.

Goodness-of-fit tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples

  • Kang, Suk-Bok;Han, Jun-Tae;Seo, Yeon-Ju;Jeong, Jina
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.903-914 / 2014
  • The inverse Weibull distribution has been proposed as a model for the analysis of life-testing data. It has also recently been derived as a suitable model for the degradation of mechanical components, such as the dynamic components (pistons, crankshaft, etc.) of diesel engines. In this paper, we derive approximate maximum likelihood estimators of the scale and shape parameters of the inverse Weibull distribution under multiply type-II censoring. We also develop four modified empirical distribution function (EDF) type tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples. Finally, we propose a modified normalized sample Lorenz curve plot and a new test statistic.

Goodness-of-fit test for the logistic distribution based on multiply type-II censored samples

  • Kang, Suk-Bok;Han, Jun-Tae;Cho, Young-Seuk
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.195-209 / 2014
  • In this paper, we derive estimators of the location and scale parameters of the logistic distribution from multiply type-II censored samples using the approximate maximum likelihood estimation method. Using the proposed approximate maximum likelihood estimators, we apply four modified empirical distribution function (EDF) type tests for the logistic distribution based on multiply type-II censored samples. We also propose a modified normalized sample Lorenz curve plot for the logistic distribution based on multiply type-II censored samples. For each test, Monte Carlo techniques are used to generate the critical values. The powers of these tests are also investigated under several alternative distributions.
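
The idea of generating critical values by Monte Carlo can be sketched as follows. This is a deliberately simplified setting and not the paper's procedure: it assumes a complete (uncensored) sample, known location and scale, and the ordinary Kolmogorov-Smirnov statistic rather than the modified EDF statistics. All function names and the choices `n=20`, `reps=2000`, and `seed=0` are illustrative assumptions.

```python
import math
import random


def logistic_cdf(x, loc=0.0, scale=1.0):
    """CDF of the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the sample EDF and a fully specified CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the CDF with the EDF just after and just before the jump at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def draw_logistic(rng, loc=0.0, scale=1.0):
    """Draw one logistic variate by inverting the CDF."""
    u = rng.random()
    while u <= 0.0:  # avoid log(0)
        u = rng.random()
    return loc + scale * math.log(u / (1.0 - u))


def mc_critical_value(n, alpha=0.05, reps=2000, seed=0):
    """Estimate the upper-alpha critical value of the KS statistic by simulation."""
    rng = random.Random(seed)
    stats = sorted(
        ks_statistic([draw_logistic(rng) for _ in range(n)], logistic_cdf)
        for _ in range(reps)
    )
    index = min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)
    return stats[index]


critical = mc_critical_value(n=20)
```

With censoring and estimated parameters, as in the paper, the null distribution of the statistic changes, so each simulated sample would need to be censored and refitted in the same way before the statistic is computed.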

An Efficiency Assessment for Reflectance Normalization of RapidEye Employing BRD Components of Wide-Swath satellite

  • Kim, Sang-Il;Han, Kyung-Soo;Yeom, Jong-Min
    • Korean Journal of Remote Sensing / v.27 no.3 / pp.303-314 / 2011
  • Surface albedo is an important parameter of the surface energy budget, and its accurate quantification is of major interest to the global climate modeling community. In this paper, we therefore consider the direct solution of kernel-based bidirectional reflectance distribution function (BRDF) models to retrieve the normalized reflectance of a high-resolution satellite. BRD effects can be observed in data from wide-swath satellites such as SPOT/VGT (VEGETATION), which have sufficient angular sampling. High-resolution satellites, however, cannot obtain sufficient angular sampling over a pixel within a short period because of their narrow swath, which makes it difficult to run a semi-empirical BRDF model for their reflectance normalization. The principal purpose of this study is to estimate the normalized reflectance of a high-resolution satellite (RapidEye) using BRDF components from SPOT/VGT. We use a semi-empirical BRDF model to estimate the BRDF components from SPOT/VGT and to normalize the RapidEye reflectance. The study uses SPOT/VGT S1 (daily) data together with data from the RapidEye multispectral sensor. The isotropic value, that is, the normalized reflectance, was closely related to the BRDF parameters and the kernels. We also show a scatter plot of the relationship between the SPOT/VGT and RapidEye isotropic values. A linear regression analysis is performed using the SPOT/VGT parameters (isotropic, geometric, and volumetric scattering values) and the RapidEye kernel values (geometric and volumetric scattering kernels). Because BRDF parameters are difficult to calculate directly from high-resolution satellites, we use the BRDF parameters of SPOT/VGT. The weights for the geometric value, the volumetric scattering value, and the error are determined through regression models. As a result, the weighting obtained through linear regression analysis produced good agreement. For all sites, the SPOT/VGT and RapidEye isotropic values were highly correlated (RMSE, bias) and generally very consistent.

Characteristics of Growth and Development of Empirical Stand Yield Model on Pinus densiflora in Central Korea (중부지방소나무의 생장특성 및 경험적 임분수확모델 개발)

  • Jeon, Ju Hyeon;Son, Yeong Mo;Kang, Jin Taek
    • Journal of Korean Society of Forest Science / v.106 no.2 / pp.267-273 / 2017
  • This study was conducted to construct an empirical yield table for Pinus densiflora in real forests. Because existing normal yield tables were derived from communities growing in ideal environments, they give higher (over-estimated) values than those observed in real forests, which makes them difficult to apply to empirical forests as opposed to normal forests. In this study, therefore, we estimated stand growth in real forests for P. densiflora as the representative conifer species. We used data from 1,957 sample plots of P. densiflora in central Korea from the National Forest Inventory (NFI) system and analyzed them through estimation, recovery, and prediction, in that order, using the Weibull function as the diameter distribution model. Weibull and Schumacher models were applied to estimate mean DBH and mean basal area, and the site index for P. densiflora in central Korea was found to range from 8 to 14 at a reference age of 30. For site index 12 in the stand yield table, the Mean Annual Increment (MAI) of P. densiflora was $4.42m^3/ha$ at 30 years of age. This MAI is lower than that in the existing volume table. A paired t-test on the differences in volume between normal and real forests by site index and age gave a P-value of less than 0.001, indicating a statistically significant difference. The results of this study are expected to be helpful for practical management and management policy for P. densiflora in central Korea.
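
To make the role of a Weibull diameter distribution concrete, the sketch below computes the share of trees falling in each diameter class from a two-parameter Weibull CDF, and the MAI as stand volume divided by age. The two-parameter form, the parameter values, and the class edges are assumptions for illustration; the abstract does not state which Weibull parameterization or estimates the authors used. The stand volume of 132.6 is not reported in the abstract; it is simply 4.42 multiplied by 30.

```python
import math


def weibull_cdf(d, scale, shape):
    """Two-parameter Weibull CDF: F(d) = 1 - exp(-(d / scale) ** shape)."""
    if d <= 0:
        return 0.0
    return 1.0 - math.exp(-((d / scale) ** shape))


def class_proportions(edges, scale, shape):
    """Share of trees in each diameter class [edges[i], edges[i + 1])."""
    return [
        weibull_cdf(hi, scale, shape) - weibull_cdf(lo, scale, shape)
        for lo, hi in zip(edges, edges[1:])
    ]


def mean_annual_increment(volume_per_ha, age_years):
    """MAI = standing volume per hectare divided by stand age."""
    return volume_per_ha / age_years


# Hypothetical parameters and 4 cm diameter classes (not from the paper).
edges_cm = [6, 10, 14, 18, 22, 26, 30]
shares = class_proportions(edges_cm, scale=18.0, shape=3.0)

# Volume implied by the reported MAI of 4.42 at age 30 (4.42 * 30).
mai = mean_annual_increment(132.6, 30)
```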

Factor Analysis for Exploratory Research in the Distribution Science Field (유통과학분야에서 탐색적 연구를 위한 요인분석)

  • Yim, Myung-Seong
    • Journal of Distribution Science / v.13 no.9 / pp.103-112 / 2015
  • Purpose - This paper aims to provide a step-by-step approach to factor analytic procedures, such as principal component analysis (PCA) and exploratory factor analysis (EFA), and to offer a guideline for factor analysis. Some authors have argued that the results of PCA and EFA are substantially similar. They also assert that PCA is the more appropriate technique for factor analysis because it produces easily interpreted results that are likely to be the basis of better decisions. For these reasons, many researchers have used PCA instead of EFA. However, the two techniques are clearly different. PCA should be used for data reduction, whereas EFA is tailored to identify the underlying factor structure, that is, the set of latent variables that cause the manifest variables to covary. To date, however, the two techniques have been used indiscriminately, so a guideline and procedures for factor analysis are needed. Research design, data, and methodology - This research conducted a literature review. We summarized the meaningful and consistent arguments, drew up guidelines, and suggested procedures for rigorous EFA. Results - PCA can be used instead of common factor analysis when all measured variables have high communality. However, common factor analysis is recommended for EFA. First, researchers should evaluate the sample size and check sampling adequacy before conducting factor analysis. If these conditions are not satisfied, the subsequent steps cannot be followed. The sample size must be at least 100, with communality above 0.5, a subject-to-item ratio of at least 5:1, and a minimum of five items in EFA. Next, Bartlett's sphericity test and the Kaiser-Meyer-Olkin (KMO) measure should be assessed for sampling adequacy. The chi-square value for Bartlett's test should be significant, and a KMO of more than 0.8 is recommended. The next step is to conduct the factor analysis, which is composed of three stages. The first stage selects an extraction technique. Generally, ML or PAF will give researchers the best results. The choice between the two hinges heavily on data normality: ML requires normally distributed data, whereas PAF does not. The second stage determines the number of factors to retain in the EFA. The best way to do so is to apply three methods together: eigenvalues greater than 1.0, the scree plot test, and the variance extracted. The last stage selects one of two rotation methods, orthogonal or oblique. If the research suggests that some variables are correlated with each other, the oblique method should be selected because it assumes that the factors are correlated. If not, the orthogonal method can be used. Conclusions - Recommendations are offered for the best factor analytic practice in empirical research.
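
The numeric thresholds in this abstract can be gathered into a small pre-analysis checklist. The sketch below only encodes the cut-offs stated above; it does not compute communalities, Bartlett's test, or KMO, which are assumed to come from a statistics package. The names `EfaInputs`, `efa_precondition_failures`, and `kaiser_retained`, and the 0.05 significance level for Bartlett's test, are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EfaInputs:
    n_subjects: int
    n_items: int
    min_communality: float   # smallest communality among the items
    bartlett_p_value: float  # p-value of Bartlett's sphericity test
    kmo: float               # Kaiser-Meyer-Olkin measure


def efa_precondition_failures(inputs: EfaInputs, alpha: float = 0.05) -> List[str]:
    """Return the list of unmet preconditions (empty if all are met)."""
    failures = []
    if inputs.n_subjects < 100:
        failures.append("sample size below 100")
    if inputs.min_communality <= 0.5:
        failures.append("communality not above 0.5")
    if inputs.n_items < 5:
        failures.append("fewer than five items")
    if inputs.n_subjects < 5 * inputs.n_items:
        failures.append("subject-to-item ratio below 5:1")
    if inputs.bartlett_p_value >= alpha:  # alpha is an assumed level
        failures.append("Bartlett's test not significant")
    if inputs.kmo <= 0.8:
        failures.append("KMO not above 0.8")
    return failures


def kaiser_retained(eigenvalues: Sequence[float]) -> int:
    """Number of factors with eigenvalue greater than 1.0."""
    return sum(1 for value in eigenvalues if value > 1.0)


# Hypothetical study values (illustration only).
study = EfaInputs(n_subjects=240, n_items=20, min_communality=0.55,
                  bartlett_p_value=0.001, kmo=0.86)
problems = efa_precondition_failures(study)
n_factors = kaiser_retained([4.1, 2.3, 1.2, 0.9, 0.6])
```

The eigenvalue rule is only one of the three retention methods the abstract recommends combining; the scree plot and the variance extracted still need to be inspected separately.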

Who Gets Government SME R&D Subsidy? Application of Gradient Boosting Model (Gradient Boosting 모형을 이용한 중소기업 R&D 지원금 결정요인 분석)

  • Kang, Sung Won;Kang, HeeChan
    • The Journal of Society for e-Business Studies / v.25 no.4 / pp.77-109 / 2020
  • In this paper, we build a Gradient Boosting model to predict government SME R&D subsidies, select features of high importance, and measure the impact of each feature on the predicted subsidy using PDP and SHAP values. Unlike previous empirical research, we focus on the effect of the R&D subsidy distribution pattern on the incentives of firms participating in subsidy competition. We used firm data constructed by KISTEP, which links government R&D subsidy records with financial statements provided by NICE, and applied a Gradient Boosting model to predict the R&D subsidy. We found that firms with higher R&D performance and larger R&D investment tend to receive higher R&D subsidies, whereas firms with higher operating profit or total asset turnover tend to receive lower R&D subsidies. Our results suggest that the current government R&D subsidy distribution pattern provides an incentive to improve R&D project performance, but not business performance.
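
A partial dependence plot (PDP) of the kind mentioned here averages a model's predictions over the data while one feature is held at a fixed value. The sketch below shows that computation for any prediction function. `toy_model` is a hypothetical stand-in, not the paper's Gradient Boosting model, and the rows are invented.

```python
from typing import Callable, List, Sequence


def partial_dependence(
    model: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    feature_index: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction over all rows with one feature forced to each grid value."""
    if not rows:
        raise ValueError("rows must not be empty")
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite the feature of interest
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row: Sequence[float]) -> float:
    """Hypothetical predictor: 2 * feature 0 + feature 1."""
    return 2.0 * row[0] + row[1]


data = [[0.0, 1.0], [0.0, 3.0]]
curve = partial_dependence(toy_model, data, feature_index=0, grid=[0.0, 1.0])
```

For an additive model such as `toy_model`, the curve's slope recovers the feature's coefficient. For a boosted tree ensemble the curve can be non-linear, which is what makes the plot informative.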