• Title/Summary/Keyword: Projection Pursuit Regression


Outlier Identification in Regression Analysis using Projection Pursuit

  • Kim, Hyojung;Park, Chongsun
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.633-641 / 2000
  • In this paper, we propose a method to identify multiple outliers in regression analysis under only the assumption of smoothness on the regression function. Our method uses the single-linkage clustering algorithm and Projection Pursuit Regression (PPR). It was compared with existing methods on several simulated and real examples and turned out to be very useful in regression problems where the regression function is far from linear.
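In one dimension, single-linkage clustering of residuals reduces to examining the gaps between sorted values, so the flagging step can be sketched roughly as follows (the `gap_factor` threshold and the gap-based shortcut are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def flag_outliers(residuals, gap_factor=5.0):
    """Flag the smaller side of the largest gap in the sorted residuals.

    For 1-D data, single-linkage merges are governed by the gaps between
    sorted values, so a clear split shows up as one gap that dwarfs the
    typical spacing.  gap_factor is an illustrative tuning constant.
    """
    r = np.asarray(residuals, dtype=float)
    order = np.argsort(r)
    s = r[order]
    gaps = np.diff(s)
    k = int(np.argmax(gaps))
    mask = np.zeros(len(r), dtype=bool)
    if gaps[k] <= gap_factor * np.median(gaps):
        return mask                      # no convincing split
    left, right = order[:k + 1], order[k + 1:]
    small = left if len(left) < len(right) else right
    mask[small] = True                   # smaller cluster flagged as outliers
    return mask
```

The residuals here would come from a smooth fit such as PPR; points split off by an unusually large gap become candidate outliers.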


Prediction and Classification Using Projection Pursuit Regression with Automatic Order Selection

  • Park, Heon Jin;Choi, Daewoo;Koo, Ja-Yong
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.585-596 / 2000
  • We developed a macro for prediction and classification using projection pursuit regression based on Friedman (1984b) and Hwang et al. (1994). In the macro, the order of the Hermite functions can be selected automatically. Within projection pursuit regression, we compare several smoothing methods, such as super smoothing and smoothing with Hermite functions. Classification methods applied to German credit data are also compared.
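Automatic order selection can be imitated by fitting Hermite expansions of increasing degree along a projection and keeping the degree with the best penalized fit. In this sketch AIC stands in for whatever criterion the macro actually uses, so treat it as illustrative:

```python
import numpy as np
from numpy.polynomial import hermite_e as He

def fit_hermite_ridge(t, y, max_order=8):
    """Fit g(t) = sum_k c_k He_k(t) by least squares for each order and
    keep the order minimizing AIC (an assumed criterion, not necessarily
    the macro's).  He_k are the probabilists' Hermite functions."""
    n = len(y)
    best = None
    for d in range(1, max_order + 1):
        c = He.hermefit(t, y, d)
        rss = float(np.sum((He.hermeval(t, c) - y) ** 2))
        # small floor keeps log() finite for (near-)exact fits
        aic = n * np.log(rss / n + 1e-12) + 2 * (d + 1)
        if best is None or aic < best[0]:
            best = (aic, d, c)
    return best[1], best[2]
```

For y = t², which equals He₂(t) + 1 exactly, the selected order is 2 and the fit is essentially exact.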


Projection Pursuit Regression for Binary Responses using Simulated Annealing (모의 담금질을 이용한 이진반응변수 사용추적회귀)

  • 박종선
    • The Korean Journal of Applied Statistics / v.14 no.2 / pp.321-332 / 2001
  • In this paper, we consider projection pursuit regression applicable to regression analysis in which the response variable takes two values. Under the assumptions that the model requires a single linear combination of the explanatory variables and that the form of the link function is not known in advance, we present an algorithm that uses simulated annealing to find the linear combination needed for the model. For binary responses, the response surface of the residual deviance function may not be unimodal, depending on the value of the smoothing parameter, so applying simulated annealing with an inhomogeneous Markov chain allows the linear combination to be searched for efficiently.
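Under the abstract's assumptions (a single linear combination, unknown link), the annealing search might look like the following sketch, where a Nadaraya-Watson smoother stands in for the paper's ridge-function estimate and the cooling schedule is a simple illustrative choice:

```python
import numpy as np

def deviance(w, X, y, h=0.5):
    """Residual deviance of a smoothed fit of the binary y on X @ w.
    A Gaussian-kernel Nadaraya-Watson smoother is an assumed stand-in."""
    t = X @ w
    d = t[:, None] - t[None, :]
    K = np.exp(-0.5 * (d / h) ** 2)
    p = np.clip(K @ y / K.sum(axis=1), 1e-6, 1 - 1e-6)
    return -2.0 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def anneal_direction(X, y, n_iter=2000, temp0=1.0, seed=0):
    """Simulated annealing over unit directions; the temperature changes
    every step, i.e. an inhomogeneous Markov chain."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    f = deviance(w, X, y)
    best_w, best_f = w, f
    for i in range(n_iter):
        temp = temp0 / (1 + i)                    # cooling schedule
        cand = w + 0.3 * rng.standard_normal(len(w))
        cand /= np.linalg.norm(cand)
        fc = deviance(cand, X, y)
        if fc < f or rng.random() < np.exp((f - fc) / max(temp, 1e-9)):
            w, f = cand, fc                       # Metropolis acceptance
            if f < best_f:
                best_w, best_f = w, f
    return best_w, best_f
```

Because worse candidates are occasionally accepted at high temperature, the search can escape local minima of a non-unimodal deviance surface.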


Comparison of Variable Importance Measures in Tree-based Classification (나무구조의 분류분석에서 변수 중요도에 대한 고찰)

  • Kim, Na-Young;Lee, Eun-Kyung
    • The Korean Journal of Applied Statistics / v.27 no.5 / pp.717-729 / 2014
  • A projection pursuit classification tree uses, at each node, the 1-dimensional projection that best separates the classes. The projection coefficients contain information that distinguishes the two groups of classes from each other and can be used to calculate an importance measure for each variable in the classification. Variable importance measures have drawn increasing interest as data sizes grow, and this paper reviews them. We compared the performance of the projection pursuit classification tree with that of classification and regression trees (CART) and random forests. The projection pursuit classification tree is found to produce better performance in most cases, particularly with highly correlated variables, and its importance measure performs slightly better than that of random forests.
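One simple way to turn per-node projection coefficients into a per-variable importance score is a node-size-weighted average of absolute coefficients. This is a sketch of the general idea, not the paper's exact definition:

```python
import numpy as np

def pp_variable_importance(node_coefs, node_weights):
    """Aggregate the 1-D projection coefficients of each tree node into a
    per-variable importance score: the node-size-weighted mean of the
    absolute coefficients (an assumed aggregation rule)."""
    C = np.abs(np.asarray(node_coefs, dtype=float))   # (n_nodes, n_vars)
    w = np.asarray(node_weights, dtype=float)
    w = w / w.sum()                                   # normalize node sizes
    return w @ C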

Estimation of Hard-to-Measure Measurements in Anthropometric Surveys

  • Choi, Jong-Hoo;Kim, Ryu-Jin
    • Communications for Statistical Applications and Methods / v.9 no.1 / pp.213-220 / 2002
  • Anthropometric surveys are important as a basis for human engineering fields. In our experience, measurements of some body parts are difficult to obtain because respondents are reluctant to expose them. To overcome this difficulty, we propose a method for estimating such hard-to-measure measurements from easy-to-measure measurements that are closely related to them. A multiple regression model, a feedforward neural network (FNN) model, and a projection pursuit regression (PPR) model are used as analytical tools for this purpose. The proposed method is illustrated with real data from the 1992 Korea national anthropometric survey.

Kernel Adatron Algorithm for Support Vector Regression

  • Kyungha Seok;Changha Hwang
    • Communications for Statistical Applications and Methods / v.6 no.3 / pp.843-848 / 1999
  • The support vector machine (SVM) is a new and very promising classification and regression technique developed by Vapnik and his group at AT&T Bell Laboratories. However, it has failed to establish itself as a common machine learning tool, partly because SVM is not easy to implement and its standard implementation requires an optimization package for quadratic programming. In this paper we present a simple iterative Kernel Adatron algorithm for nonparametric regression that is easy to implement and guaranteed to converge to the optimal solution, and we compare it with neural networks and projection pursuit regression.
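For intuition, here is the classic Kernel Adatron in its simpler classification form: clipped gradient ascent on the SVM dual that needs no quadratic-programming package. The paper's contribution is a regression variant, which this sketch does not reproduce:

```python
import numpy as np

def kernel_adatron(K, y, eta=0.1, n_iter=200):
    """Kernel Adatron (classification form): sequentially push each dual
    variable toward margin 1 and clip at zero.  Converges for separable
    data when eta < 2 / max_i K[i, i]."""
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        for i in range(n):
            z = y[i] * np.sum(alpha * y * K[i])       # functional margin
            alpha[i] = max(0.0, alpha[i] + eta * (1.0 - z))
    return alpha

def decision(K_test, alpha, y):
    """Decision values for rows of a (test x train) kernel matrix."""
    return K_test @ (alpha * y)
```

The appeal, as the abstract notes, is that a few lines of iteration replace a QP solver while still reaching the optimal dual solution.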


Robust Neural Networks for Regression Analysis (회귀분석을 위한 로버스트 신경망)

  • 황창하;김상민;박희주
    • Communications for Statistical Applications and Methods / v.4 no.2 / pp.327-332 / 1997
  • A multilayer neural network is one method of nonparametric regression function estimation. The backpropagation algorithm is widely used to train multilayer neural networks. However, this algorithm is very sensitive to outliers and estimates an undesirable regression function when the data contain outliers. In this paper, we propose a robust backpropagation algorithm using a method often employed in statistical physics, and we compare it, through simulation, with projection pursuit regression (PPR), which is mathematically very similar to a neural network, and with the standard backpropagation algorithm.
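The core of a robust variant is to pass residuals through a bounded influence function before they enter the weight updates, so a single outlier cannot dominate. The sketch below uses Huber's psi on a plain linear model as a stand-in for the paper's statistical-physics-derived loss inside backpropagation:

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Bounded influence function: identity for small residuals,
    clipped at +-c for large ones (Huber's psi, used here as an
    assumed stand-in for the paper's robust loss derivative)."""
    return np.clip(r, -c, c)

def robust_gradient_step(w, X, y, lr=0.01, c=1.345):
    """One gradient step where the raw residual in the usual squared-loss
    gradient is replaced by its bounded version."""
    r = y - X @ w
    return w + lr * X.T @ huber_psi(r, c) / len(y)
```

With ordinary squared loss the gradient grows linearly in the residual, so one gross outlier drags the fit; with a bounded psi its influence is capped.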


Building Regression Models for Tire Design Factors (타이어 설계 인자들에 대한 회귀모형의 수립)

  • Park, Jeong-soo;Hwang, Hyun-sik;Cho, Wan Hyun
    • Journal of Korean Society for Quality Management / v.24 no.3 / pp.94-110 / 1996
  • Two regression models are built to explain tire performance (especially cornering coefficients) in terms of tire design and experimental factors. One is an ordinary regression model whose explanatory variables are selected by a stepwise method. The other is built with a modern nonparametric regression technique called projection pursuit regression. The two models are then compared and combined so that the relationship between tire performance and design factors is well characterized. The optimal experimental design issue and ideas for future research are also discussed.


Efficient Score Estimation and Adaptive Rank and M-estimators from Left-Truncated and Right-Censored Data

  • Chul-Ki Kim
    • Communications for Statistical Applications and Methods / v.3 no.3 / pp.113-123 / 1996
  • Data-dependent (adaptive) choices of asymptotically efficient score functions for rank estimators and M-estimators of regression parameters in a linear regression model with left-truncated and right-censored data are developed herein. The locally adaptive smoothing techniques of Muller and Wang (1990) and Uzunogullari and Wang (1992) provide good estimates of the hazard function h and its derivative h' from left-truncated and right-censored data. However, since we need to estimate h'/h for the asymptotically optimal choice of score functions, the naive estimator, which is simply the ratio of the estimated h' and h, turns out to have a few drawbacks. An alternative method that overcomes these shortcomings and also speeds up the algorithms is developed. In particular, we use a subroutine of the PPR (Projection Pursuit Regression) method coded by Friedman and Stuetzle (1981) to find the nonparametric derivative of log(h) for the problem of estimating h'/h.
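The advantage of estimating (log h)' directly, rather than as a ratio of separately estimated h' and h, can be seen in a toy version where a moving-average smoother stands in for the Friedman-Stuetzle PPR smoother:

```python
import numpy as np

def log_derivative(t, h_hat, span=5):
    """Estimate h'/h as the derivative of log(h), which avoids dividing
    two noisy estimates (unstable wherever h is small).  The moving
    average is an assumed stand-in for the PPR smoother."""
    g = np.log(np.maximum(h_hat, 1e-12))   # floor guards against log(0)
    k = np.ones(span) / span
    g_s = np.convolve(g, k, mode="same")   # smooth log-hazard
    return np.gradient(g_s, t)             # d/dt log h = h'/h
```

For an exponential hazard h(t) = exp(2t), the true h'/h is the constant 2, which the interior of the estimate recovers (the ends suffer the usual smoothing edge effects).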


LMS and LTS-type Alternatives to Classical Principal Component Analysis

  • Huh, Myung-Hoe;Lee, Yong-Goo
    • Communications for Statistical Applications and Methods / v.13 no.2 / pp.233-241 / 2006
  • Classical principal component analysis (PCA) can be formulated as finding the linear subspace that best accommodates multidimensional data points, in the sense that the sum of squared residual distances is minimized. As alternatives to such an LS (least squares) fitting approach, we produce LMS (least median of squares) and LTS (least trimmed squares)-type PCA by minimizing the median of the squared residual distances and the trimmed sum of squares, in a fashion similar to Rousseeuw (1984)'s alternative approaches to LS linear regression. The proposed methods adopt the data-driven optimization algorithm of Croux and Ruiz-Gazen (1996, 2005), which is conceptually simple and computationally practical. Numerical examples are given.
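The Croux and Ruiz-Gazen device restricts the search for a robust principal direction to candidates pointing from a robust center toward each observation. Below is a minimal sketch using MAD as the robust spread; the paper's LMS/LTS objectives would replace this scale:

```python
import numpy as np

def robust_first_pc(X):
    """First principal direction via a data-driven search: candidate unit
    directions point from the coordinatewise median to each observation,
    and the one maximizing a robust spread (MAD) of the projected data is
    kept.  This is a sketch of the device, not the paper's exact method."""
    center = np.median(X, axis=0)
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    A = Z[norms > 0] / norms[norms > 0, None]   # candidate unit directions
    proj = Z @ A.T                              # data projected on each one
    mad = np.median(np.abs(proj - np.median(proj, axis=0)), axis=0)
    return A[np.argmax(mad)]
```

Because only n candidate directions are examined, the search is conceptually simple and computationally cheap, and the robust scale keeps outliers from dictating the chosen direction.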