Search | Korea Science

Robust Interpolation Method for Adapting to Sparse Design in Nonparametric Regression (선형보간법에 의한 자료 희소성 해결방안의 문제와 대안)

Park, Dong-Ryeon
- The Korean Journal of Applied Statistics / v.20 no.3 / pp.561-571 / 2007
The local linear regression estimator is the most widely used nonparametric regression estimator and has a number of advantages over traditional kernel estimators. It is well known that the local linear estimator can produce erratic results in sparse regions of the realized design, and that the interpolation method of Hall and Turlach (1997) is a very efficient way to resolve this problem. However, it has never been pointed out that Hall and Turlach's interpolation method is very sensitive to outliers. In this paper, we propose a robust version of the interpolation method for adapting to sparse design. The finite-sample properties of the method are compared with those of Hall and Turlach's method in a simulation study.
https://doi.org/10.5351/KJAS.2007.20.3.561

Penalized quantile regression tree (벌점화 분위수 회귀나무모형에 대한 연구)

Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
- The Korean Journal of Applied Statistics / v.29 no.7 / pp.1361-1371 / 2016
Quantile regression provides a variety of useful statistical information for examining how covariates influence the conditional quantile functions of a response variable. However, traditional quantile regression, which assumes a linear model, is not appropriate when the relationship between the response and the covariates is nonlinear. Variable selection is also necessary for high-dimensional data or strongly correlated covariates. In this paper, we propose a penalized quantile regression tree model. The split rule of the proposed method is based on residual analysis, which has negligible bias in selecting a split variable and a reasonable computational cost. A simulation study and a real data analysis are presented to demonstrate the satisfactory performance and usefulness of the proposed method.
https://doi.org/10.5351/KJAS.2016.29.7.1361

A Bootstrap Test for Linear Relationship by Kernel Smoothing (희귀모형의 선형성에 대한 커널붓스트랩검정)

Baek, Jang-Sun;Kim, Min-Soo
- Journal of the Korean Data and Information Science Society / v.9 no.2 / pp.95-103 / 1998
Azzalini and Bowman proposed a pseudo-likelihood ratio test for checking a linear relationship using a kernel regression estimator when the errors of the regression model follow a normal distribution. We modify their method with the bootstrap technique to construct a new test, and examine the power of our test through simulation. Our method can be applied when the error distribution is not normal.

Fitting Distribution of Accident Frequency of Freeway Horizontal Curve Sections & Development of Negative Binomial Regression Models (고속도로 평면선형상 사고빈도분포 추정을 통한 음이항회귀모형 개발 (기하구조요인을 중심으로))

강민욱;도철웅;손봉수
- Journal of Korean Society of Transportation / v.20 no.7 / pp.197-204 / 2002
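To predict and prevent traffic accidents, it is appropriate to identify how accidents relate to the road geometric elements that can actually be controlled during the design process. That is, before a road is built, the designer should establish the relationship between geometric elements and accidents accurately from field data and reflect it in the design. To this end, identifying the frequency distribution of traffic accidents is the most basic task and should precede the development of accident prediction models. Accident counts are generally known to follow a negative binomial distribution because they exhibit overdispersion, in which the variance exceeds the mean. Therefore, before developing accident models, this paper examines the distribution of traffic accidents on horizontal curve sections through goodness-of-fit tests, focusing on the Honam Expressway, where road design elements at accident sites and other potential accident-related factors are comparatively well documented. The accident data are five years (1996-2000) of Honam Expressway records from the Korea Highway Corporation, arranged to suit the analysis, and accidents on reverse curve sections and single curve sections were analyzed using the section segmentation method for horizontal alignment presented by Kang and Son (2002). As expected, the goodness-of-fit analysis indicated that the negative binomial distribution is the most suitable probability distribution for describing accident counts, and negative binomial regression models were then developed using maximum likelihood estimation. The negative binomial regression models that applied the segmentation method had higher coefficients of determination than existing probabilistic regression models, and the geometric elements used in the models were vehicle exposure, curve radius, and change in superelevation per unit distance.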
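The overdispersion property mentioned in this abstract can be checked with a few lines of Python. This is only a sketch: the counts are made-up numbers, not data from the paper, and the moment-based dispersion estimate is a quick shortcut that assumes the common variance form Var = mu + alpha * mu^2, whereas the paper reports maximum likelihood estimation.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    return variance(counts) / mean(counts)


def nb_alpha_moments(counts):
    """Moment estimate of alpha, assuming Var = mu + alpha * mu**2.

    A quick shortcut for illustration; it is not maximum likelihood.
    """
    mu = mean(counts)
    return (variance(counts) - mu) / (mu * mu)


# Hypothetical accident counts per curve section (illustration only).
counts = [0, 0, 1, 0, 3, 0, 7, 1, 0, 2, 0, 5]

ratio = dispersion_ratio(counts)
alpha = nb_alpha_moments(counts)
print(f"variance/mean = {ratio:.2f}, alpha = {alpha:.2f}")
```

A ratio well above 1 suggests that a Poisson model would understate the variability of the counts, which is the usual motivation for a negative binomial model.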

An educational tool for binary logistic regression model using Excel VBA (엑셀 VBA를 이용한 이분형 로지스틱 회귀모형 교육도구 개발)

Park, Cheolyong;Choi, Hyun Seok
- Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.403-410 / 2014
Binary logistic regression analysis is a statistical technique that explains a binary response variable by quantitative or qualitative explanatory variables. In the binary logistic regression model, the probability that the response variable equals one of its two values, say 1, is explained as a transformation of a linear combination of explanatory variables. This is one of the big barriers that non-statisticians have to overcome in order to understand the model. In this study, an educational tool is developed using Excel VBA that explains the need for binary logistic regression analysis. More precisely, this tool explains the problems that arise when the probability that the response variable equals 1 is modeled as a linear combination of explanatory variables, and then shows how these problems can be solved through some transformations of the linear combination.
https://doi.org/10.7465/jkdi.2014.25.2.403
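The problem this abstract describes can be illustrated in a few lines. The sketch below is in Python rather than the Excel VBA used by the paper, and the coefficients `B0` and `B1` are hypothetical values chosen only to make the effect visible: a plain linear combination can fall outside [0, 1], while the inverse logit of the same combination cannot.

```python
import math


def linear_probability(x, b0, b1):
    """Model P(Y = 1) directly as a linear combination (may leave [0, 1])."""
    return b0 + b1 * x


def logistic_probability(x, b0, b1):
    """Pass the same linear combination through the inverse logit."""
    eta = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, for illustration only.
B0, B1 = -0.5, 0.4

for x in (-5, 0, 5):
    lin = linear_probability(x, B0, B1)
    logi = logistic_probability(x, B0, B1)
    print(f"x={x:>2}  linear={lin:+.3f}  logistic={logi:.3f}")
```

At x = 5 the linear form gives 1.5 and at x = -5 it gives -2.5, neither of which is a valid probability, whereas the logistic values stay strictly between 0 and 1.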

Locally Weighted Polynomial Forecasting Model (지역가중다항식을 이용한 예측모형)

Mun, Yeong-Il
- Journal of Korea Water Resources Association / v.33 no.1 / pp.31-38 / 2000
Relationships between hydrologic variables are often nonlinear. Usually the functional form of such a relationship is not known a priori. A multivariate, nonparametric regression methodology is provided here for approximating the underlying regression function using locally weighted polynomials. Locally weighted polynomials approximate the target function through a Taylor series expansion of the function in the neighborhood of the point of estimate. The utility of this nonparametric regression approach is demonstrated through an application to short-term forecasts of the biweekly Great Salt Lake volume.
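As a rough illustration of the idea, the following Python sketch fits a first-order (local linear) polynomial at a single point with one predictor. It assumes a Gaussian kernel and a fixed bandwidth, both of which are choices made here for illustration; the paper's multivariate method and its neighborhood selection are not reproduced.

```python
import math


def local_linear(x0, xs, ys, bandwidth):
    """Estimate m(x0) by weighted least squares on a line centered at x0."""
    # Gaussian kernel weights: observations near x0 count more.
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    d = [x - x0 for x in xs]  # first-order Taylor term (x - x0)

    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))

    denom = s0 * s2 - s1 * s1
    if denom == 0:
        raise ValueError("not enough distinct points near x0")
    # The intercept of the local line is the estimate at x0.
    return (s2 * t0 - s1 * t1) / denom


# Exactly linear toy data: a local linear fit should reproduce the line.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(local_linear(2.5, xs, ys, bandwidth=1.0))
```

Because the intercept solves the weighted normal equations, the estimate is exact whenever the data lie on a straight line; higher-order polynomials extend the same construction with more Taylor terms.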

A study on the multivariate sliced inverse regression (다변량 분할 역회귀모형에 관한 연구)

이용구;이덕기
- The Korean Journal of Applied Statistics / v.10 no.2 / pp.293-308 / 1997
Sliced inverse regression is a method for reducing the dimension of the explanatory variable X without going through any parametric or nonparametric model fitting process. This method explores the simplicity of the inverse view of regression; that is, instead of regressing the univariate output variable y against the multivariate X, we regress X against y. In this article, we propose bivariate sliced inverse regression, which regresses the multivariate X against the bivariate output variables $y_1, y_2$. Bivariate sliced inverse regression estimates the e.d.r. directions that satisfy two generalized regression models simultaneously. To apply bivariate sliced inverse regression, we decompose the output variable y into two variables: $\hat{y}$, obtained by projecting y onto the column space of X, and r, obtained by projecting y onto the space orthogonal to the column space of X. We then estimate the e.d.r. directions of the generalized regression model using the two variables simultaneously. As a result, bivariate sliced inverse regression, which considers $\hat{y}$ and r simultaneously, estimates the e.d.r. directions efficiently and stably when the regression model is linear, quadratic, or nonlinear.

Characteristics and Models of the Side-swipe Accident in the Case of Cheongju 4-legged Signalized Intersections (4지 신호교차로의 측면접촉사고 특성 및 사고모형 - 청주시를 사례로 -)

Park, Sang-Hyuk;Kim, Tae-Young;Park, Byung-Ho
- International Journal of Highway Engineering / v.11 no.4 / pp.41-47 / 2009
This study deals with side-swipe accidents at 4-legged signalized intersections in Cheongju. The objectives are to analyze the characteristics of the accidents and to develop related models. In doing so, the study gives particular emphasis to finding an appropriate modelling methodology. The main results are as follows. First, among side-swipe accidents, injury accidents were found to be twice as frequent as property-damage-only accidents. The accidents occurred more often inside the intersection. Almost all of the accidents involved automobiles and were caused by unsafe driving. Second, multiple linear regression models were found to be more statistically significant than multiple non-linear models. The best-fitting models were those with the number of accidents as the dependent variable. The factors of side-swipe accidents analyzed in this study were ADT, intersection area, right-turn-only lane, number of pedestrian crossings, speed limit of the main road, maximum grade, and number of signal phases.

Comparison of Data-based Real-Time Flood Forecasting Model (자료기반 실시간 홍수예측 모형의 비교·검토)

Choi, Hyun Gu;Han, Kun Yeun;Roh, Hong Sik;Park, Se Jin
- KSCE Journal of Civil and Environmental Engineering Research / v.33 no.5 / pp.1809-1827 / 2013
Recently, various measures have become necessary to prepare for extreme floods caused by climate change. Establishing a flood forecasting system is an important non-structural measure for flood preparation. The objective of this study is to develop a superior real-time flood forecasting model by comparing a Neuro-fuzzy model and a multiple linear regression model. Both models are built from the same input data and applied to various flood events in the Nakdong basin. The results show that the Neuro-fuzzy model produces more accurate flood forecasts than the multiple linear regression model. This study can contribute to establishing a high-accuracy flood information system that secures lead time in the Nakdong basin.
https://doi.org/10.12652/Ksce.2013.33.5.1809

Non-linear regression model considering all association thresholds for decision of association rule numbers (기본적인 연관평가기준 전부를 고려한 비선형 회귀모형에 의한 연관성 규칙 수의 결정)

Park, Hee Chang
- Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.267-275 / 2013
Among data mining techniques, the association rule is the most recently developed, and it finds the relevance between two items in a large database. It is applied directly in the field because it clearly quantifies the relationship between two or more items. To determine whether an association rule is meaningful, we use interestingness measures such as support, confidence, and lift. Interestingness measures are meaningful in that they provide statistical or logical grounds for pruning uninteresting rules. However, the thresholds for these measures are chosen by experience, and the number of useful rules is hard to estimate. If too many rules are generated, we cannot effectively extract the useful ones. In this paper, we designed a variety of non-linear regression equations, considering all association thresholds, that relate the number of rules to the three interestingness measures. We then diagnosed multicollinearity and autocorrelation problems, and used analysis of variance results and adjusted coefficients of determination to select the best model through numerical experiments.
https://doi.org/10.7465/jkdi.2013.24.2.267
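The three interestingness measures named in this abstract have standard definitions, which the following Python sketch computes for a single rule. The transactions and item names are hypothetical, and the function `rule_measures` is an illustrative helper, not something taken from the paper.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # contain antecedent
    n_c = sum(1 for t in transactions if c <= t)         # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    support = n_ac / n            # P(A and C)
    confidence = n_ac / n_a       # P(C | A)
    lift = confidence / (n_c / n) # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

support, confidence, lift = rule_measures(transactions, {"bread"}, {"milk"})
print(support, confidence, lift)
```

Counting how many candidate rules pass given minimum thresholds on these three values yields the kind of rule count that the abstract models with regression.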

