• Title/Summary/Keyword: Variable Statistics

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible candidates for outliers using properties of an intercept estimator in a difference-based regression model, and the information on these outliers is incorporated into the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. In addition, the method needs to be developed further to produce robust estimates. We will also extend our method to the nonparametric regression model for simultaneous outlier detection and variable selection.
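
As a rough illustration of the mean shift idea mentioned in the abstract, one common way to write a regression model with mean shift parameters is shown here. The notation ($y_i$, $x_i$, $\beta$, $\gamma_i$, $\varepsilon_i$) is an assumption made for this sketch and is not taken from the paper:

$$
y_i = x_i^{\top}\beta + \gamma_i + \varepsilon_i, \qquad i = 1, \dots, n,
$$

where $\gamma_i \neq 0$ marks observation $i$ as an outlier and $\gamma_i = 0$ otherwise. Under this reading, choosing which $\gamma_i$ to keep in the model becomes a variable selection problem alongside the choice of predictors in $x_i$.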

Economic Design of $\bar{X}$ Control Chart Using a Surrogate Variable

  • Lee, Tae-Hoon;Lee, Jae-Hoon;Lee, Min-Koo;Lee, Joo-Ho
    • Journal of Korean Society for Quality Management / v.37 no.2 / pp.46-57 / 2009
  • The traditional approach to the economic design of control charts is based on the assumption that a process is monitored using a performance variable. However, various types of automatic test equipment recently introduced as part of factory automation usually measure surrogate variables instead of performance variables that are costly to measure. In this article, we propose a model for the economic design of a control chart that uses a surrogate variable highly correlated with the performance variable. The optimum values of the design parameters are determined by maximizing the total average income per cycle time. Numerical studies are performed to compare the proposed $\bar{X}$ control charts with the traditional model using the examples in Panagos et al. (1985).

Correlated variable importance for random forests

  • Shin, Seung Beom;Cho, Hyung Jun
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.177-190 / 2021
  • Random forests is a popular method that reduces the instability and improves the accuracy of decision trees through ensembling. The gain in accuracy comes at the cost of interpretability; to compensate for this, variable importance is provided. The variable importance indicates which variables play a more important role in constructing the random forest. However, when a predictor is correlated with other predictors, the variable importance computed by existing algorithms may be distorted. The downward bias of correlated predictors may reduce the importance of truly important predictors. We propose a new algorithm that remedies the downward bias of correlated predictors. The performance of the proposed algorithm is demonstrated on simulated data and illustrated on real data.
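
To make the notion of variable importance concrete, the following is a minimal sketch of permutation-based importance, one common way of scoring variables. The abstract does not say which existing algorithm it refers to, so this choice is an assumption, and the helper names (`mse`, `permutation_importance`) and the toy linear predictor standing in for a fitted forest are hypothetical.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is shuffled.

    A larger score means the model relies more on that feature.
    """
    rng = random.Random(seed)
    base = mse(predict, rows, targets)
    scores = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += mse(predict, shuffled, targets) - base
        scores.append(total / n_repeats)
    return scores


# Toy data (assumption): the target depends only on the first feature.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
targets = [2.0 * r[0] for r in rows]


def predict(row):
    """Stand-in for a fitted model; a real use would call the forest here."""
    return 2.0 * row[0]


scores = permutation_importance(predict, rows, targets)
print(scores)  # first score is positive, second is zero
```

With a real random forest, a predictor that is strongly correlated with another can share its score with that neighbor, which is the kind of distortion the abstract describes; the toy predictor above is too simple to show that effect.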

On Estimating of Kullback-Leibler Information Function using Three Step Stress Accelerated Life Test

  • Park, Byung-Gu;Yoon, Sang-Chul;Cho, Ji-Young
    • International Journal of Reliability and Applications / v.1 no.2 / pp.155-165 / 2000
  • In this paper, we propose some estimators of Kullback-Leibler information functions using the data from three step stress accelerated life tests. The acceleration model is assumed to be a tampered random variable model. Some asymptotic properties of the proposed estimators are proved. Simulations are performed to compare the small sample properties of the proposed estimators under the use condition of the accelerated life test.
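
For reference, the Kullback-Leibler information between two densities is usually defined as follows. The symbols $f$ and $g$ are assumptions made for this sketch, since the abstract does not give the paper's own notation:

$$
K(f, g) = \int f(x) \, \log \frac{f(x)}{g(x)} \, dx .
$$

It is nonnegative and equals zero only when $f$ and $g$ agree almost everywhere.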

Biplots of Multivariate Data Guided by Linear and/or Logistic Regression

  • Huh, Myung-Hoe;Lee, Yonggoo
    • Communications for Statistical Applications and Methods / v.20 no.2 / pp.129-136 / 2013
  • Linear regression is the most basic statistical model for exploring the relationship between a numerical response variable and several explanatory variables. Logistic regression plays the role of linear regression for a dichotomous response variable. In this paper, we propose a biplot-type display of multivariate data guided by linear regression and/or logistic regression. The figures show the directional flow of the response variable as well as the interrelationship of the explanatory variables.

Estimation of Median in the Presence of Three Known Quartiles of an Auxiliary Variable

  • Singh, Housila P.;Shanmugam, Ramalingam;Singh, Sarjinder;Kim, Jong-Min
    • Communications for Statistical Applications and Methods / v.21 no.5 / pp.363-386 / 2014
  • This paper improves several ratio-type estimators of the population median, including their generalization, in the presence of three known quartiles of an auxiliary variable. The properties of the improved estimators are discussed and applied. Both the empirical and simulation studies confirm that the new estimators perform efficiently.
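
As background for the ratio-type estimators mentioned in the abstract, here is a minimal sketch of the classical ratio estimator of a population median, which scales the sample median of the study variable by the ratio of the known to the sampled median of the auxiliary variable. It uses only the second quartile, not all three, and is not the improved estimator proposed in the paper. The function name and the sample values are hypothetical.

```python
import statistics


def ratio_median_estimate(y_sample, x_sample, x_population_median):
    """Classical ratio estimator: median(y) * M_x / median(x).

    `x_population_median` is the known population median of the
    auxiliary variable; the two samples come from the same units.
    """
    return (
        statistics.median(y_sample)
        * x_population_median
        / statistics.median(x_sample)
    )


# Made-up illustration data (assumption, not from the paper).
y_sample = [12, 15, 11, 20, 18]  # study variable, sample median 15
x_sample = [6, 7, 5, 10, 9]      # auxiliary variable, sample median 7
estimate = ratio_median_estimate(y_sample, x_sample, x_population_median=8)
print(estimate)  # 15 * 8 / 7
```

The adjustment moves the sample median of the study variable upward when the auxiliary sample median falls below its known population value, and downward in the opposite case.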

SUBNORMALITY OF $S_2(a, b, c, d)$ AND ITS BERGER MEASURE

  • Duan, Yongjiang;Ni, Jiaqi
    • Bulletin of the Korean Mathematical Society / v.53 no.3 / pp.943-957 / 2016
  • We introduce a 2-variable weighted shift, denoted by $S_2(a, b, c, d)$, which arises naturally from analytic function space theory. We investigate when it is subnormal and, in that case, compute its Berger measure. We also apply the results to investigate the relationship among 2-variable subnormal, hyponormal and 2-hyponormal weighted shifts.

Properties of variable sampling interval control charts

  • Chang, Duk-Joon;Heo, Sun-Yeong
    • Journal of the Korean Data and Information Science Society / v.21 no.4 / pp.819-829 / 2010
  • Properties of multivariate variable sampling interval (VSI) Shewhart and CUSUM charts for monitoring the mean vector of related quality variables are investigated. To evaluate the average time to signal (ATS) and the average number of switches (ANSW) of the proposed charts, Markov chain approaches and simulations are applied. The performance of the proposed charts is also investigated both when the process is in control and when it is out of control.

Variable Selection Theorems in General Linear Model

  • Park, Jeong-Soo;Yoon, Sang-Hoo
    • Korean Data and Information Science Society: Conference Proceedings / 2006.04a / pp.171-179 / 2006
  • For the problem of variable selection in linear models, we consider the case in which the errors are correlated with covariance matrix $V$. Hocking's theorems on the effects of overfitting and underfitting in the linear model are extended to the less than full rank and correlated error model, and to the ANCOVA model.
