• Title/Summary/Keyword: Regression Statistical Analysis

Imputation Procedures in Weibull Regression Analysis in the presence of missing values

  • Kim Soon-kwi;Jeong Bong-Bin
    • Proceedings of the Korean Statistical Society Conference / 2001.11a / pp.143-148 / 2001
  • A dataset with missing observations is often completed using imputed values. This paper evaluates the performance and accuracy of the complete-case method and four imputation procedures when missing values occur only in the response variable of a Weibull regression model. Our simulation results show that, compared with the other imputation procedures, hot-deck and Weibull regression imputation in particular compensate well for missing data. An illustrative real-data example is also given.

Outliers and Statistical Analysis of Population Projection Data (인구추계 데이터의 이상점과 통계적 분석)

  • Kim, Jong-Tae;Seo, Hyo-Min
    • Proceedings of the Korea Society for Industrial Systems Conference / 2009.05a / pp.153-159 / 2009
  • This paper points out problems in the basic population data (1960-2005) and the population projection data (2006-2050) reported by the Korean National Statistical Office in November 2006. Errors in the basic population data can be readily detected using graphical analysis and linear regression analysis. The population projections reported by the Korean National Statistical Office need to be revised.

Bayesian Analysis in Generalized Log-Gamma Censored Regression Model

  • Younshik Chung;Yoomi Kang
    • Communications for Statistical Applications and Methods / v.5 no.3 / pp.733-742 / 1998
  • The generalized log-gamma regression model is considered for industrial and medical lifetime data. Bayesian analysis of the generalized log-gamma regression with censored data is explained and, following the data augmentation approach (Tanner and Wong, 1987), the censored data are replaced by simulated data. To overcome the complicated Bayesian computation, the Markov chain Monte Carlo (MCMC) method is employed. Some modified algorithms are then proposed to implement MCMC. Finally, one example is presented.

Regression and Correlation Analysis via Dynamic Graphs

  • Kang, Hee Mo;Sim, Songyong
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.695-705 / 2003
  • In this article, we propose regression and correlation analysis via dynamic graphs and implement it in Java Web Start. For polynomial relations between dependent and independent variables, dynamic graphics are implemented for both polynomial regression and spline estimates, allowing instant model selection. The results include basic statistics. They are available both as a web-based service and as an application.

Analysis of the Effect of Wind Direction on Ozone Level

  • Na, Jong-Hwa;Sung, Su-Jin;Yu, Hye-Kyung
    • Communications for Statistical Applications and Methods / v.19 no.4 / pp.527-536 / 2012
  • In this paper we analyze the effect of circular variables such as wind direction, time, and month on the ozone level. In particular, we examine the effect of wind direction using exploratory data analysis methods and provide correlation and regression analyses for the cases that include all circular explanatory variables. In the analysis, we convert the time and month variables to circular variables and analyze their effect in the regression; in addition, we consider circular-circular regression. We use weather condition and air pollution data collected from the Dongdaemoon district of Seoul in 2007.
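
The conversion of time and month to circular variables mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `to_angle` and `circular_features` are hypothetical, and the cosine/sine encoding is one common way to make a periodic value usable as a regression predictor.

```python
import math

def to_angle(value, period):
    """Map a periodic value (hour of day, wind direction in degrees, ...)
    to an angle in radians in [0, 2*pi). Hypothetical helper."""
    return 2.0 * math.pi * (value % period) / period

def circular_features(value, period):
    """Return (cos, sin) of the angle, so that values at both ends of the
    cycle (e.g. hour 23 and hour 0) end up close together."""
    theta = to_angle(value, period)
    return math.cos(theta), math.sin(theta)

def distance(a, b):
    """Euclidean distance between two (cos, sin) pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

hour_0 = circular_features(0, 24)
hour_12 = circular_features(12, 24)
hour_23 = circular_features(23, 24)

# Months are numbered 1..12, so shift to 0..11 before encoding (an assumption
# about how the month variable is coded).
december = circular_features(12 - 1, 12)
january = circular_features(1 - 1, 12)
```

With this encoding, hour 23 lies closer to hour 0 than hour 12 does, which a plain numeric hour variable would not reflect.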

A comparison of Multilayer Perceptron with Logistic Regression for the Risk Factor Analysis of Type 2 Diabetes Mellitus (제2형 당뇨병의 위험인자 분석을 위한 다층 퍼셉트론과 로지스틱 회귀 모델의 비교)

  • 서혜숙;최진욱;이홍규
    • Journal of Biomedical Engineering Research / v.22 no.4 / pp.369-375 / 2001
  • The statistical regression model is one of the most frequently used methods of clinical analysis. It rests on basic assumptions of linearity, additivity, and normally distributed data. However, most biological data in the medical field are nonlinear and unevenly distributed. To overcome the discrepancy between the basic assumptions of the statistical model and actual biological data, we propose a new analytical method based on an artificial neural network. The newly developed multilayer perceptron (MLP) is trained with 120 data records (60 normal, 60 patient). On test data, it shows a discrimination power of 0.76. Diabetic risk factors were also identified from both the MLP neural network model and the logistic regression model. The significant risk factors identified by the MLP model were postprandial glucose level (PP2), sex (male), fasting blood sugar (FBS) level, age, SBP, AC, and WHR. Those from the regression model were sex (male), PP2, age, and FBS. Combined risk factors can also be identified using the MLP model: total cholesterol and body weight, which is consistent with the results of other clinical studies. From this experiment we learned that the MLP can be applied to combined risk factor analysis of biological data, which the conventional statistical method cannot provide.

RS-based method for estimating statistical moments and its application to reliability analysis (반응표면을 활용한 통계적 모멘트 추정 방법과 신뢰도해석에 적용)

  • Huh, Jae-Sung;Kwak, Byung-Man
    • Proceedings of the KSME Conference / 2004.11a / pp.852-857 / 2004
  • A new and efficient method for estimating the statistical moments of a system performance function has been developed. The method consists of two steps: (1) an approximate response surface is generated by a quadratic regression model, and (2) the statistical moments of the regression model are then calculated by the experimental design techniques proposed by Seo and Kwak (4). In this approach, the size of the experimental region affects the accuracy of the statistical moments, so the region size should be selected suitably. The D-optimal design and the central composite design are adopted over the selected experimental region for the regression model. Finally, the Pearson system is adopted to decide the distribution type of the system performance function and to analyze structural reliability.

Development of Discriminant Analysis System by Graphical User Interface of Visual Basic

  • Lee, Yong-Kyun;Shin, Young-Jae;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.447-456 / 2007
  • Multivariate statistical analysis has recently been used to extract meaningful information from various kinds of data. In this paper, we develop a multivariate statistical analysis system that combines Fisher discriminant analysis, logistic regression, neural networks, and decision trees, using Visual Basic 6.0.

Wage Determinants Analysis by Quantile Regression Tree

  • Chang, Young-Jae
    • Communications for Statistical Applications and Methods / v.19 no.2 / pp.293-301 / 2012
  • Quantile regression, proposed by Koenker and Bassett (1978), is a statistical technique that estimates conditional quantiles. Its advantage over ordinary least squares (OLS) regression is its robustness to large outliers. Regression tree approaches have been applied to OLS problems to fit flexible models. Loh (2002) proposed the GUIDE algorithm, which has negligible selection bias and relatively low computational cost. Because quantile regression can be regarded as an analogue of OLS, it can also be applied within the GUIDE regression tree method. Chaudhuri and Loh (2002) proposed a nonparametric quantile regression method that blends key features of piecewise polynomial quantile regression and tree-structured regression based on adaptive recursive partitioning. Lee and Lee (2006) investigated wage determinants in the Korean labor market using the Korean Labor and Income Panel Study (KLIPS). Following Lee and Lee, we fit three kinds of quantile regression tree models to the KLIPS data at the quantiles 0.05, 0.2, 0.5, 0.8, and 0.95. Among the three models, the multiple linear piecewise quantile regression model forms the shortest tree structure, while the piecewise constant quantile regression model generally has a deeper tree structure with more terminal nodes. Age, gender, marital status, and education appear to be determinants of the wage level across the quantiles; in addition, education experience appears as an important determinant of the wage level in the highly paid group.
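
The robustness claim in this abstract can be illustrated with a minimal sketch of the check (pinball) loss that underlies quantile regression. It covers only the simplest case, a constant fit within a single node, and is not the GUIDE algorithm or the authors' code; the names `check_loss` and `constant_quantile_fit` are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def constant_quantile_fit(ys, tau):
    """Return the observed value q that minimizes the total check loss.

    For a constant model a minimizer always exists among the data points,
    so searching over the observations is enough (brute force, O(n^2)).
    """
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))

wages = [1, 2, 3, 4, 100]          # made-up numbers with one large outlier
median_fit = constant_quantile_fit(wages, 0.5)
lower_fit = constant_quantile_fit(wages, 0.25)
mean_fit = sum(wages) / len(wages)  # the OLS constant fit, for comparison
```

In this made-up sample the median fit stays among the bulk of the observations, while the mean is pulled toward the outlier.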

Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung;Han, Hyoseon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.267-279 / 2021
  • Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, dimension reduction of the predictors is essential to relieve the curse of dimensionality caused by the high dimension of the responses. Sufficient dimension reduction provides effective tools for this reduction, but few sufficient dimension reduction methodologies exist for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve estimation over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.