• Title/Summary/Keyword: regression statistics

Logistic Regression Classification by Principal Component Selection

  • Kim, Kiho;Lee, Seokho
    • Communications for Statistical Applications and Methods
    • /
    • v.21 no.1
    • /
    • pp.61-68
    • /
    • 2014
  • We propose binary classification methods that modify logistic regression classification. Variable selection procedures are applied to the principal components rather than to the original variables. We describe the resulting classifiers and discuss their properties. The performance of our proposals is illustrated numerically and compared with that of existing classification methods using synthetic and real datasets.

Numerical Investigations in Choosing the Number of Principal Components in Principal Component Regression - CASE I

  • Shin, Jae-Kyoung;Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.8 no.2
    • /
    • pp.127-134
    • /
    • 1997
  • A method is proposed for the choice of the number of principal components in principal component regression based on the predicted error sum of squares. To do this, we approximately evaluate that statistic using a linear approximation based on the perturbation expansion. In this paper, we apply the proposed method to various data sets and discuss some properties in choosing the number of principal components in principal component regression.
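
The predicted error sum of squares (PRESS) mentioned above can be illustrated in a much simpler setting. The sketch below is not the perturbation-based approximation from the paper: it assumes a one-predictor least-squares line and uses the standard leave-one-out identity, in which each residual is divided by one minus its leverage. The function names and the data are hypothetical. In principal component regression, the same statistic would be computed once per candidate number of components and the smallest value preferred.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def press_from_leverage(x, y):
    """PRESS from a single fit: sum of (residual / (1 - leverage))^2."""
    n = len(x)
    a, b = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    total = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (a + b * xi)
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_by_refitting(x, y):
    """PRESS by brute force: refit n times, leaving one point out each time."""
    total = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
print(press_from_leverage(x, y), press_by_refitting(x, y))
```

The two functions should agree up to rounding, which is what makes PRESS cheap to evaluate for linear fits.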

Detection of Change-Points by Local Linear Regression Fit

  • Kim, Jong Tae;Choi, Hyemi;Huh, Jib
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.1
    • /
    • pp.31-38
    • /
    • 2003
  • A simple method is proposed to detect the number of change points and to test the location and size of multiple change points with jump discontinuities in an otherwise smooth regression model. The proposed estimators are based on local linear regression fits and compare left and right one-sided kernel smoothers. The methodology is explained and applied to real and simulated data.
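
The idea of comparing left and right one-sided fits can be sketched as follows. This is a minimal illustration, not the authors' estimator or test: it assumes an Epanechnikov-shaped weight, a fixed bandwidth `h`, a single jump, and noise-free synthetic data. All names are hypothetical.

```python
def one_sided_local_linear(x, y, x0, h, side):
    """Local linear estimate at x0 using only points strictly left or right of x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    count = 0
    for xi, yi in zip(x, y):
        u = (xi - x0) / h
        in_window = (-1.0 < u < 0.0) if side == "left" else (0.0 < u < 1.0)
        if not in_window:
            continue
        w = 1.0 - u * u          # Epanechnikov-shaped kernel weight (an assumption)
        d = xi - x0              # centring makes the intercept the fit at x0
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
        count += 1
    det = s0 * s2 - s1 * s1
    if count < 2 or det <= 1e-12:
        return None              # too few points for a weighted line
    return (s2 * t0 - s1 * t1) / det   # weighted least-squares intercept


def largest_jump(x, y, h, candidates):
    """Return (location, size) where the right-minus-left gap is largest in magnitude."""
    best = None
    for x0 in candidates:
        left = one_sided_local_linear(x, y, x0, h, "left")
        right = one_sided_local_linear(x, y, x0, h, "right")
        if left is None or right is None:
            continue
        gap = right - left
        if best is None or abs(gap) > abs(best[1]):
            best = (x0, gap)
    return best


# Synthetic example: a smooth trend 2x with a jump of size 3 at x = 0.5.
x = [i / 100 for i in range(101)]
y = [2 * xi + (3.0 if xi >= 0.5 else 0.0) for xi in x]
location, size = largest_jump(x, y, h=0.1, candidates=x[10:91])
print(location, size)
```

With noisy data, the gap would have to be judged against its sampling variability, which is the role of the testing step described in the abstract.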

Asymptotically Efficient L-Estimation for Regression Slope When Trimming is Given (절사가 주어질때 회귀기울기의 점근적 최량 L-추정법)

  • Sang Moon Han
    • The Korean Journal of Applied Statistics
    • /
    • v.7 no.2
    • /
    • pp.173-182
    • /
    • 1994
  • By applying the slope estimator under arbitrary error distributions proposed by Han (1993), and by defining regression quantiles that give the upper and lower trimming parts and the blocks of data, we show that the proposed slope estimator is asymptotically efficient when the number of regression quantiles used to form the blocks of data becomes sufficiently large.

Parameter Estimation and Prediction for NHPP Software Reliability Model and Time Series Regression in Software Failure Data

  • Song, Kwang-Yoon;Chang, In-Hong
    • Journal of Integrative Natural Science
    • /
    • v.7 no.1
    • /
    • pp.67-73
    • /
    • 2014
  • We consider the mean value function for an NHPP software reliability model and a time series regression model for software failure data. We estimate the parameters of the proposed models from two data sets and present the resulting SSE and MSE values. We compare the predicted number of faults with the actual values in the two data sets using the mean value function and the regression curve.
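
A small sketch of how SSE and MSE can be computed for a fitted mean value function. The abstract does not say which NHPP model is used, so the Goel-Okumoto form m(t) = a(1 - exp(-bt)) below is an assumption, as are the parameter values, the failure counts, and the convention MSE = SSE / (n - k) with k fitted parameters.

```python
import math


def goel_okumoto_mean(t, a, b):
    """Expected cumulative number of failures by time t (assumed model form)."""
    return a * (1.0 - math.exp(-b * t))


def sse(actual, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((obs - pred) ** 2 for obs, pred in zip(actual, predicted))


def mse(actual, predicted, n_params):
    """SSE divided by (n - k); this denominator is one common convention."""
    return sse(actual, predicted) / (len(actual) - n_params)


# Hypothetical cumulative failure counts at times 1..6, for illustration only.
times = [1, 2, 3, 4, 5, 6]
failures = [9, 17, 23, 28, 32, 35]
predicted = [goel_okumoto_mean(t, a=50.0, b=0.2) for t in times]
print(sse(failures, predicted), mse(failures, predicted, n_params=2))
```

Computing both statistics for each candidate model on the same data gives a simple basis for comparison, with smaller values indicating a closer fit.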

An Outlier Detection Method in Penalized Spline Regression Models (벌점 스플라인 회귀모형에서의 이상치 탐지방법)

  • Seo, Han Son;Song, Ji Eun;Yoon, Min
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.4
    • /
    • pp.687-696
    • /
    • 2013
  • The detection and examination of outliers are important parts of data analysis because some outliers may have a detrimental effect on statistical analysis. Outlier detection methods have been discussed by many authors. In this article, we propose applying Hadi and Simonoff's (1993) method to a penalized spline regression model to detect multiple outliers. Simulated and real data sets are used to illustrate the proposed procedure and to compare it with penalized spline regression and robust penalized spline regression.

Comparison of Bias Correction Methods for the Rare Event Logistic Regression (희귀 사건 로지스틱 회귀분석을 위한 편의 수정 방법 비교 연구)

  • Kim, Hyungwoo;Ko, Taeseok;Park, No-Wook;Lee, Woojoo
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.2
    • /
    • pp.277-290
    • /
    • 2014
  • We analyzed binary landslide data from the Boeun area with logistic regression. Since landslides occur in only 9 of the 5000 observations, the data can be regarded as rare event data. The main issue in logistic regression with rare event data is serious bias in the regression coefficient estimates. We quantitatively compared two previously proposed bias correction methods via simulation. Firth's (1993) approach performed better and provided the most stable results for analyzing rare-event binary data.
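
To show what Firth's correction does in the simplest possible case, the sketch below assumes an intercept-only logistic model, which is far simpler than the landslide model in the paper. In that case the penalized log-likelihood, the log-likelihood plus half the log of the Fisher information, is maximized at the probability (events + 0.5) / (n + 1). The counts 9 and 5000 come from the abstract; the function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def mle_intercept(events, n):
    """Ordinary maximum likelihood intercept; undefined when events is 0 or n."""
    return logit(events / n)


def firth_intercept(events, n):
    """Firth-penalized intercept for the intercept-only model (closed form)."""
    return logit((events + 0.5) / (n + 1.0))


def penalized_loglik(beta, events, n):
    """Log-likelihood plus 0.5 * log(Fisher information) for one intercept."""
    p = 1.0 / (1.0 + math.exp(-beta))
    loglik = events * math.log(p) + (n - events) * math.log(1.0 - p)
    information = n * p * (1.0 - p)
    return loglik + 0.5 * math.log(information)


events, n = 9, 5000
print(mle_intercept(events, n), firth_intercept(events, n))
```

Unlike the ordinary estimate, the penalized estimate stays finite even when no events are observed, which is one reason it is attractive for rare event data. With covariates there is no closed form and the estimate is found iteratively.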

A Comparison Study of Survival Regression Models Based on Data Depths (뎁스를 이용한 생존회귀모형들의 비교연구)

  • Kim, Jee-Yun;Hwang, Jin-Soo
    • The Korean Journal of Applied Statistics
    • /
    • v.20 no.2
    • /
    • pp.313-322
    • /
    • 2007
  • Several robust censored depth regression methods are compared under contamination. Park and Hwang (2003) suggested a way to circumvent the censoring issue by incorporating a Kaplan-Meier type weight in halfspace regression depth, and Park (2003) applied a similar technique to simplicial regression depth. Hubert et al. (2001) suggested a high breakdown point regression depth based on projection, called rcent. A new method for handling censoring in rcent is suggested and compared with the two earlier methods under various contamination and censoring schemes.

A Study on Detection of Influential Observations on A Subset of Regression Parameters in Multiple Regression

  • Park, Sung Hyun;Oh, Jin Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.9 no.2
    • /
    • pp.521-531
    • /
    • 2002
  • Most diagnostic techniques for identifying influential observations are based on the deletion of a single observation. While such techniques can satisfactorily identify influential observations in many cases, they will not always succeed because of the masking effect. It is therefore necessary to develop techniques that examine the potentially influential effects of a subset of observations. Partial regression plots can be used to examine an influential observation for a single parameter in multiple linear regression. However, it is often desirable to detect influential observations for a subset of regression parameters when interest centers on a selected subset of independent variables. In this paper, we propose a diagnostic measure M that can be used effectively to detect influential observations on a subset of regression parameters in multiple linear regression. An illustrative example shows how the new measure M identifies such observations.
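
The single-deletion idea that this abstract starts from can be sketched for one parameter of interest. The code below is not the measure M, which the abstract does not define. It assumes a one-predictor regression and reports how much the slope changes when each observation is removed in turn. The data and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def slope_change_on_deletion(x, y):
    """For each i, the full-data slope minus the slope fitted without point i."""
    _, slope_full = fit_line(x, y)
    changes = []
    for i in range(len(x)):
        _, slope_without_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        changes.append(slope_full - slope_without_i)
    return changes


# Hypothetical data: the last point has high leverage and is off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.0, 2.1, 2.9, 4.2, 5.0, 2.0]
changes = slope_change_on_deletion(x, y)
most_influential = max(range(len(x)), key=lambda i: abs(changes[i]))
print(most_influential, changes[most_influential])
```

Deleting one point at a time can miss a group of observations that hide each other, which is the masking problem the abstract raises.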

Restricted support vector quantile regression without crossing

  • Shim, Joo-Yong;Lee, Jang-Taek
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.6
    • /
    • pp.1319-1325
    • /
    • 2010
  • Quantile regression provides a more complete statistical analysis of the stochastic relationships among random variables. Sometimes quantile functions estimated at different orders cross each other. We propose a new non-crossing quantile regression method, restricted support vector quantile regression, which applies support vector median regression to restricted regression quantiles. The proposed method provides a satisfactory solution for estimating non-crossing quantile functions when multiple quantiles are needed for high-dimensional data. We also present a model selection method that uses cross validation to choose the parameters that affect the performance of the proposed method. A real example and a simulated example are provided to show the usefulness of the proposed method.
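
The crossing problem can be made concrete with a small sketch. It does not implement support vector quantile regression: it assumes each fitted quantile function is a straight line given as a hypothetical (intercept, slope) pair. It shows the check (pinball) loss that defines a quantile fit, together with a test for whether a lower-order line rises above a higher-order one on an interval.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile order tau; residual is observed minus fitted."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_lines_cross(line_low, line_high, x_min, x_max):
    """True if the lower-order line exceeds the higher-order line on [x_min, x_max].

    Each line is an (intercept, slope) pair. The gap between two lines is
    linear in x, so its minimum over an interval is attained at an endpoint.
    """
    def gap(x):
        high = line_high[0] + line_high[1] * x
        low = line_low[0] + line_low[1] * x
        return high - low

    return min(gap(x_min), gap(x_max)) < 0.0


# Hypothetical fitted lines for quantile orders 0.1 and 0.9.
q10 = (0.0, 1.0)
q90 = (1.0, 0.5)
print(quantile_lines_cross(q10, q90, 0.0, 1.0))
print(quantile_lines_cross(q10, q90, 0.0, 4.0))
```

Two lines with different slopes always intersect somewhere, so fitting each quantile order separately gives no guarantee over the whole predictor range. Non-crossing methods such as the one in the abstract build that restriction into the fit.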