Integrated Search | Korea Science

Procedures for Detecting Multiple Outliers in Linear Regression Using R

Kwon, Soon-Sun;Lee, Gwi-Hyun;Park, Sung-Hyun
- Korean Statistical Society: Conference Proceedings
- /
- Proceedings of the Korean Statistical Society 2005 Autumn Conference
- /
- pp.13-17
- /
- 2005
In recent years, R has become widely used as a statistical computing system, and it is frequently updated by the many teams contributing to the R project. We are interested in methods for detecting multiple outliers and note that R does not provide such methods. In this talk, we review existing procedures for detecting multiple outliers and present more efficient procedures, implemented in R, that combine direct and indirect methods.

MULTIPLE OUTLIER DETECTION IN LOGISTIC REGRESSION BY USING INFLUENCE MATRIX

Lee, Gwi-Hyun;Park, Sung-Hyun
- Journal of the Korean Statistical Society
- /
- Vol. 36, No. 4
- /
- pp.457-469
- /
- 2007
Many procedures are available to identify a single outlier or an isolated influential point in linear regression and logistic regression. Detecting multiple outliers or influential points is more difficult, owing to masking and swamping problems. Multiple outlier detection methods for logistic regression have not yet been studied from the standpoint of direct procedures. In this paper we consider direct methods for logistic regression by extending the Peña and Yohai (1995) influence matrix algorithm. We define the influence matrix in logistic regression using Cook's distance, and test for multiple outliers using the mean shift model. To demonstrate the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data containing multiple outliers with masking and swamping.

A Comparison of Methods for the Detection of Outliers in Multivariate Data

Hadi, Ali-S.;Joo, Hye-Seon;Son, Mun-S.
- Communications for Statistical Applications and Methods
- /
- Vol. 3, No. 2
- /
- pp.53-67
- /
- 1996
Numerous classical as well as robust methods have been proposed in the literature for the detection of multiple outliers in multivariate data. The effectiveness and power of each of these methods have not been thoroughly investigated. In this paper we first reduce the vast number of outlier detection methods to a small number of viable ones. This reduction is based on previous work by other researchers and on some theoretical arguments. We then design and implement a Monte Carlo experiment for comparing these methods. The main goal of our study is to determine which methods are most powerful in detecting multiple outliers and in dealing with the masking and swamping problems. The results of the Monte Carlo study indicate that two of the methods seem to perform better than the others for the detection of multiple outliers in multivariate data.

Modeling of Strength of High Performance Concrete with Artificial Neural Network and Mahalanobis Distance Outlier Detection Method

홍정의
- Journal of the Society of Korea Industrial and Systems Engineering
- /
- Vol. 33, No. 4
- /
- pp.122-129
- /
- 2010
High-performance concrete (HPC) is a relatively new term in the concrete construction industry. Several studies have shown that concrete strength development is determined not only by the water-to-cement ratio but also by the content of other concrete ingredients. HPC is a highly complex material, which makes modeling its behavior a very difficult task. This paper aims to demonstrate the feasibility of using an artificial neural network (ANN) to predict the compressive strength of HPC. The Mahalanobis Distance (MD) outlier detection method is used to increase the prediction ability of the ANN, and the detailed procedure for calculating the MD is described. The effect of outliers is assessed by comparing artificial neural network training before and after their removal. The MD outlier detection method successfully removed the outliers and improved the training and prediction performance of the neural network.
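The Mahalanobis distance screening mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's procedure: the function names, the cutoff of 3.0, and the synthetic data are assumptions made only for illustration, and the distance is taken from the ordinary sample mean and covariance.

```python
import numpy as np

def mahalanobis_distances(X):
    """Return the Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates a near-singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # d_i^2 = (x_i - mean)^T S^{-1} (x_i - mean), evaluated for every row i.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(d2, 0.0))

def flag_outliers(X, threshold=3.0):
    """Boolean mask of rows whose distance exceeds a hypothetical cutoff."""
    return mahalanobis_distances(X) > threshold

# Synthetic data (an assumption): 200 well-behaved rows plus one planted outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [10.0, -10.0, 10.0]
mask = flag_outliers(X)
X_clean = X[~mask]  # rows that would be kept for model training
```

In a workflow like the one described, the rows in `X_clean` would then be passed to the neural network for training. Because a single sample mean and covariance are themselves affected by outliers, a robust estimate may be preferable when several outliers are present.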

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
- Communications for Statistical Applications and Methods
- /
- Vol. 26, No. 2
- /
- pp.149-161
- /
- 2019
In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible outlier candidates using properties of an intercept estimator in a difference-based regression model, and the outlier information is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. The method still needs further development to produce robust estimates, and we also plan to extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.
https://doi.org/10.29220/CSAM.2019.26.2.149

A Score test for Detection of Outliers in Nonlinear Regression

Kahng, Myung-Wook
- Journal of the Korean Statistical Society
- /
- Vol. 22, No. 2
- /
- pp.201-208
- /
- 1993
Given the specific mean shift outlier model, the score test for multiple outliers in nonlinear regression is discussed as an alternative to the likelihood ratio test. The geometric interpretation of the score statistic is also presented.
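For orientation, a mean shift outlier model and a score statistic of the kind referred to here can be written as follows. The notation is an assumption, not taken from the paper: $f$ is the nonlinear regression function, $I$ the index set of suspected outliers, $\delta_i$ the mean shift parameters, $U_\delta$ the score with respect to $\delta$, and $\mathcal{I}^{\delta\delta}$ the $\delta$-block of the inverse information matrix, both evaluated at the estimate $\hat\theta_0$ obtained under $H_0$.

```latex
y_i =
\begin{cases}
  f(x_i, \theta) + \delta_i + \varepsilon_i, & i \in I, \\
  f(x_i, \theta) + \varepsilon_i,            & i \notin I,
\end{cases}
\qquad
H_0 : \delta_i = 0 \ \text{for all } i \in I .

S = U_\delta(\hat\theta_0)^{\top}\,
    \mathcal{I}^{\delta\delta}(\hat\theta_0)\,
    U_\delta(\hat\theta_0)
```

Unlike the likelihood ratio test, the score statistic requires fitting the model only under $H_0$, that is, without the mean shift parameters.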

Multiple Imputation Reducing Outlier Effect using Weight Adjustment Methods

김진영;신기일
- The Korean Journal of Applied Statistics
- /
- Vol. 26, No. 4
- /
- pp.635-647
- /
- 2013
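Multiple imputation is the most commonly used method when missing values occur in sample surveys. Its performance depends on several factors and is particularly sensitive to outliers. In this study, we investigate a method that improves the performance of multiple imputation by using weight adjustment to reduce the influence of outliers. The final weights obtained from the weight adjustment method were used in the multiple imputation, which was carried out with SAS PROC MI. The superiority of the proposed method was confirmed through simulation studies and a real-data analysis using Monthly Labor Statistics data.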
https://doi.org/10.5351/KJAS.2013.26.4.635

A Multiple Imputation for Reducing Outlier Effect

김만겸;신기일
- The Korean Journal of Applied Statistics
- /
- Vol. 27, No. 7
- /
- pp.1229-1241
- /
- 2014
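When outliers and nonresponse occur together, imputation for nonresponse performs worse than when only nonresponse is present. In such cases, outliers should be detected first, and nonresponse imputation should be carried out only after the influence of the detected outliers has been reduced. In this paper, we study a method that improves the performance of nonresponse imputation by reducing the influence of outliers. To this end, we review the outlier detection method proposed by She and Owen (2011), as well as the weight adjustment and outlier replacement methods commonly used to reduce the influence of detected outliers. We also examine nonresponse imputation methods that incorporate outlier treatment, and assess the effect of reducing outlier influence through simulation studies and a case study.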
https://doi.org/10.5351/KJAS.2014.27.7.1229

Detecting Multiple Outliers Using the Gaps of Order Statistics

Kim, Hyun Chul
- Communications for Statistical Applications and Methods
- /
- Vol. 2, No. 2
- /
- pp.184-197
- /
- 1995
An objective, one-step procedure for detecting multiple outliers is proposed, based on the gaps of the order statistics. The procedure can serve as a routine outlier detection method in a statistical analysis computer program. It is applied to several examples, including the data selected by Kitagawa.
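The abstract does not state the detection rule, so the following is only a rough sketch of how gaps between order statistics could flag several outliers in one pass. The function name, the restriction to the upper tail, and the cutoff `ratio` (a multiple of the median gap) are all hypothetical and are not the paper's procedure.

```python
def gap_outliers(values, ratio=5.0):
    """Return upper-tail values separated from the bulk by an unusually large gap.

    `ratio` is a hypothetical cutoff expressed as a multiple of the median gap.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 3:
        return []
    # Gaps between consecutive order statistics x_(i+1) - x_(i).
    gaps = [xs[i + 1] - xs[i] for i in range(n - 1)]
    median_gap = sorted(gaps)[len(gaps) // 2]
    # Scan the upper half only; the first oversized gap splits off the outliers.
    for i in range(n // 2, n - 1):
        if gaps[i] > ratio * median_gap:
            return xs[i + 1:]
    return []

data = [1.0, 1.2, 1.1, 0.9, 1.3, 1.05, 0.95, 8.0, 8.5]
flagged = gap_outliers(data)  # the two large values are flagged together
```

Because every point beyond the large gap is flagged at once, a second outlier cannot hide the first, which is the masking problem that one-at-a-time tests are prone to.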

Compound Outlier Assessment and Verification for Multiple Field Monitoring Data

전제성
- Journal of the Korean Geoenvironmental Society
- /
- Vol. 19, No. 1
- /
- pp.5-14
- /
- 2018
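Monitoring data produced at construction sites contain anomalous values arising from a variety of causes. This study investigates a composite signal generation technique for effectively identifying anomalous data in time series, regression analysis using that signal, and the final judgment and evaluation of anomalous data. When assessing anomalous data across multiple large datasets, a composite signal weighted by the correlations among the datasets greatly improved the correlation with a specific dataset, which made effective identification of anomalous data possible. When the composite signal technique was applied to artificial error data containing deliberately injected anomalies, the accuracy of anomaly identification increased substantially, and the same result was confirmed for heterogeneous time series models. The accuracy of anomaly identification increased markedly as the number of datasets used for signal synthesis grew and as the characteristics of the time series models became more similar.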
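The idea of a correlation-weighted composite signal followed by regression can be sketched as below. This is an illustration under stated assumptions rather than the paper's algorithm: the function names, the standardization step, the straight-line regression, the MAD-based residual scale, and the cutoff `k` are all hypothetical choices.

```python
import numpy as np

def composite_signal(target, others):
    """Combine standardized companion series, weighted by correlation with target."""
    target = np.asarray(target, dtype=float)
    others = np.asarray(others, dtype=float)  # shape (m series, n samples)
    z = (others - others.mean(axis=1, keepdims=True)) / others.std(axis=1, keepdims=True)
    # Signed weights let negatively correlated series contribute with flipped sign.
    weights = np.array([np.corrcoef(target, row)[0, 1] for row in others])
    return weights @ z / np.abs(weights).sum()

def flag_anomalies(target, others, k=3.0):
    """Regress target on the composite signal and flag large residuals."""
    target = np.asarray(target, dtype=float)
    comp = composite_signal(target, others)
    slope, intercept = np.polyfit(comp, target, 1)
    resid = target - (slope * comp + intercept)
    centered = np.abs(resid - np.median(resid))
    scale = 1.4826 * np.median(centered)  # robust scale from the MAD
    return centered > k * scale

# Synthetic monitoring data (an assumption): three companion sensors and a target.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 6.0 * np.pi, 200)
base = np.sin(t)
others = np.stack([base + rng.normal(0.0, 0.05, t.size) for _ in range(3)])
target = 2.0 * base + 1.0 + rng.normal(0.0, 0.05, t.size)
target[50] += 5.0  # injected anomaly
mask = flag_anomalies(target, others)
```

Averaging several correlated series reduces the noise in the reference signal, which is consistent with the abstract's observation that accuracy improves as more datasets with similar behavior are combined.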
https://doi.org/10.14481/jkges.2018.19.1.5

Search results: 23 items; processing time: 0.03 seconds
