• Title/Summary/Keyword: outlier test

Impact of Outliers on the Statistical Measures of the Environmental Monitoring Data in Busan Coastal Sea (이상자료가 연안 환경자료의 통계 척도에 미치는 영향)

  • Cho, Hong-Yeon;Lee, Ki-Seop;Ahn, Soon-Mo
    • Ocean and Polar Research / v.38 no.2 / pp.149-159 / 2016
  • Statistical measures of coastal environmental data are used in a variety of statistical inferences, hypothesis tests, and data-driven models. If the measures are biased, the resulting estimates and models may also be biased, and the potential for bias is large when the data contain outliers, defined as extraordinarily large or small data values. This study suggests more robust statistical measures as alternatives to the commonly used ones and assesses the performance of these robust measures through a quantitative comparison with the typical measures of location, spread, and shape for environmental monitoring data from the Busan coastal sea. Outliers were detected using Rosner's test, which found that about 5-10% of the nutrient data contained outliers. After removal (zero-weighting) of the outliers, the relative changes in the mean and standard deviation between the before-removal and after-removal conditions were 13% and 33%, respectively. Skewness and kurtosis decreased by 1.36 and 8.11, respectively. By contrast, the relative changes for the robust counterparts of the mean and standard deviation were 3.7-10.5%, and the changes in robust skewness and kurtosis were only about 2-4% of those of the non-robust measures. The robust measures can therefore be regarded as outlier-resistant, given their relatively small changes before and after outlier removal.
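
The before/after comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not name the robust measures it uses, so the median and the scaled median absolute deviation (MAD) are assumed here as stand-ins for robust location and spread; the data values are hypothetical, and the outlier is removed by hand rather than by Rosner's test.

```python
import statistics

def scaled_mad(values):
    """Median absolute deviation, scaled by 1.4826 so that it is comparable
    to a standard deviation for normally distributed data."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])

def relative_change(before, after):
    """Percentage change of a measure once outliers are removed."""
    return abs(after - before) / abs(before) * 100.0

# Hypothetical nutrient concentrations; the last value is an artificial outlier.
raw = [2.1, 2.4, 2.2, 2.6, 2.3, 2.5, 2.2, 2.4, 9.8]
cleaned = raw[:-1]  # outlier removed by hand here, not by Rosner's test

changes = {
    "mean": relative_change(statistics.mean(raw), statistics.mean(cleaned)),
    "median": relative_change(statistics.median(raw), statistics.median(cleaned)),
    "stdev": relative_change(statistics.stdev(raw), statistics.stdev(cleaned)),
    "scaled MAD": relative_change(scaled_mad(raw), scaled_mad(cleaned)),
}
for name, pct in changes.items():
    print(f"{name}: {pct:.1f}% change after outlier removal")
```

In this toy data set the median and scaled MAD move far less than the mean and standard deviation, which is the qualitative pattern the abstract reports.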

A study on the Flood Frequency Analyzed in Consideration of Low Outliers. (Low Outliers를 고려한 홍수빈도분석에 관한 연구)

  • 이순혁;홍성표;박명근
    • Magazine of the Korean Society of Agricultural Engineers / v.30 no.4 / pp.62-70 / 1988
  • This study was conducted to address the unsuitable parameters and the uncertainty in design floods that can arise when low outliers fall well below the trend of the rest of the data. A reasonable design flood was derived by modifying the low outliers in a flood frequency analysis based on the Log Pearson Type III distribution. Three subwatersheds along the Geum River basin whose annual maximum series include low outliers were selected as study basins. The results were analyzed and summarized as follows. 1. The Log Pearson Type III distribution was confirmed as reasonable by the $\chi^2$ goodness-of-fit test at the Gong Ju, Gyu Am, and Og Cheon watersheds along the Geum River basin. 2. Probable flood flows for each watershed were derived from the flood frequency curve with outliers. 3. A weighted skew coefficient for each watershed was calculated to evaluate the frequency factor needed for the modification of low outliers. 4. It was confirmed that, at all watersheds, the adjusted frequency curve tends to lie lower than the curve obtained by deleting the low outliers. 5. Final probable flood flows were derived by modification, using the modified basic statistics for the three watersheds. 6. Comparing the frequency curve with modification against the one with outliers, the former gives higher probable flood flows for return periods within three years, and the latter gives higher flows for return periods over three years.
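
As a rough illustration of the Log Pearson Type III quantile computation this abstract refers to, the sketch below estimates a probable flood flow from the mean, standard deviation, and skew of the log-transformed annual peaks. It assumes the Wilson-Hilferty approximation for the frequency factor and uses the plain station skew; the weighted skew and the low-outlier modification described in the paper are not reproduced. The function names and the peak-flow values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def sample_skew(values):
    """Bias-corrected sample skew coefficient."""
    n = len(values)
    m = mean(values)
    s = stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)

def frequency_factor(skew, return_period):
    """Wilson-Hilferty approximation to the Pearson Type III frequency factor."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-8:
        return z  # zero skew reduces to the normal deviate
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)

def lp3_quantile(annual_peaks, return_period):
    """Flood flow for a given return period (years, must exceed 1)."""
    logs = [math.log10(q) for q in annual_peaks]
    k_t = frequency_factor(sample_skew(logs), return_period)
    return 10 ** (mean(logs) + k_t * stdev(logs))

# Hypothetical annual maximum flows (m^3/s), for illustration only.
peaks = [820, 1130, 960, 1540, 700, 1280, 2050, 890, 1010, 1400]
for t in (2, 10, 100):
    print(f"T = {t:>3} yr: {lp3_quantile(peaks, t):.0f} m^3/s")
```

A low outlier drags the log-space skew toward negative values, which changes the frequency factor and therefore the whole upper tail of the curve; that sensitivity is what motivates treating low outliers separately.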

Influence in Testing the Equality of Two Covariance Matrices (두개의 공분산 행렬의 동질성 검정에서의 영향치 분석)

  • Myung Geun Kim
    • The Korean Journal of Applied Statistics / v.7 no.2 / pp.213-224 / 1994
  • A diagnostic method for detecting outliers when testing the equality of two covariance matrices is developed using the influence curve approach. The method generalizes easily to more than two covariance matrices. A sample version of the influence measure for detecting outliers is considered, based on the empirical distribution functions. Its component terms include the well-known test statistic introduced by Wilks for detecting one outlier at a time and its generalization to the two-group case.

The Outlier-Filtering Algorithm for National Highway Continuous Traffic Counts Data (일반국도 상시조사 교통량 자료의 이상치 판정 알고리즘 개발)

  • Shin, Jae Myong;Lee, Sang Hyup;Kim, Hyun Suk
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.2 / pp.691-702 / 2013
  • In this study, a quantitative outlier-filtering algorithm was developed using a smoothing method based on the day-of-the-week traffic volume variation pattern. To test its effectiveness, the algorithm was used to identify outliers in the traffic volume data collected at 14 continuous traffic count sites on the national highways in 2010. The test results are satisfactory: the filtering rate is 98.2% for normal days and the mis-filtering rate is 8.0% for abnormal days. The algorithm can therefore be used for rough but quick filtering of outliers from collected traffic volume data.
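
The abstract does not give the details of the smoothing method, so the following is only a sketch of the general idea under stated assumptions: each day's count is compared with the median count for the same weekday, and days deviating by more than a fractional threshold are flagged. The function name, the 30% threshold, and the traffic volumes are all hypothetical.

```python
from statistics import median

def flag_outliers(daily_counts, threshold=0.3):
    """Return indices of days whose volume deviates from the median volume
    of the same weekday by more than `threshold` (as a fraction).

    daily_counts: list of (weekday, volume) pairs, weekday in 0..6.
    """
    by_weekday = {}
    for weekday, volume in daily_counts:
        by_weekday.setdefault(weekday, []).append(volume)
    # Per-weekday median acts as a crude day-of-the-week baseline.
    baseline = {day: median(vols) for day, vols in by_weekday.items()}
    return [
        i for i, (day, volume) in enumerate(daily_counts)
        if abs(volume - baseline[day]) > threshold * baseline[day]
    ]

# Four weeks of hypothetical counts following a fixed weekly pattern.
pattern = [1000, 1020, 1010, 1030, 1100, 800, 700]
counts = [(day, pattern[day]) for _week in range(4) for day in range(7)]
counts[12] = (5, 300)  # simulate a detector fault on one Saturday

flagged = flag_outliers(counts)
print(flagged)
```

Using the median rather than the mean as the baseline keeps a single faulty day from shifting the reference value it is compared against.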

The Mean Reverting Behavior of Inflation in the Philippines

  • CAMBA, Abraham C. Jr.;CAMBA, Aileen L.
    • The Journal of Asian Finance, Economics and Business / v.8 no.10 / pp.239-247 / 2021
  • Central bank authorities should carefully manage inflation rate uncertainties to achieve economic growth and development in both the short run and the long run. Since inflation is a key macroeconomic variable, a better understanding of its behavior is important. This paper therefore employs unit root tests with breakpoints to examine the mean-reverting behavior of the inflation rate in the Philippines using monthly data from 2002 to 2020. Empirically, the unit root breakpoint innovational and additive outlier tests favor the stationarity, or mean-reverting behavior, of inflation in the Philippines. The results of the standard unit root tests (ADF, PP, GLS-Dickey-Fuller, KPSS, and NP) also provide strong evidence of mean-reverting processes. The mean-reverting behavior of the inflation rate reveals that monetary policy under the inflation targeting framework has succeeded in reducing chronic inflation persistence in the Philippines. This research thus supports an inflation targeting policy that aims to maintain general price level stability for the Philippine economy's long-term growth and development prospects. The findings remain important for central bankers, both in providing a better understanding of the behavior of the inflation rate and in helping them formulate and implement policy reforms related to money, credit, and banking.

A Study of the Robust Degradation Model by Analyzing the Filament Lamp Degradation Data (헤드램프용 필라멘트 램프 가속열화데이터 분석을 통한 로버스트 열화모형 연구)

  • Sung, Ki-Woo
    • Transactions of the Korean Society of Automotive Engineers / v.20 no.6 / pp.132-139 / 2012
  • Durability and lifetime generally need to be tested when parts are developed with new technology. In this paper, accelerated degradation analysis methods are developed to test them. The study presents a robust model estimation method that is less affected by outliers in regression model estimation. In addition, the lifetime can be predicted from the degradation-stress relationship at each stress level.
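
The abstract does not say which robust estimator the paper uses. As an illustration of regression that is less affected by outliers, the sketch below uses the Theil-Sen estimator (median of pairwise slopes), which is a swapped-in technique and not necessarily the author's method. The degradation readings and the 80% failure threshold are hypothetical.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Robust line fit: slope is the median of all pairwise slopes,
    intercept is the median of the residual offsets."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept

# Hypothetical relative luminous flux (%) versus test hours;
# the reading at 300 h is an artificial outlier.
hours = [0, 100, 200, 300, 400, 500]
flux = [100.0, 98.0, 96.0, 70.0, 92.0, 90.0]

slope, intercept = theil_sen(hours, flux)
# Hypothetical failure criterion: flux drops to 80% of its initial value.
predicted_life = (80.0 - intercept) / slope
print(f"slope = {slope:.4f} %/h, predicted life = {predicted_life:.0f} h")
```

An ordinary least-squares fit to the same data would be pulled toward the outlying reading, whereas the median of pairwise slopes is determined by the consistent majority of points.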

Outlier tests on potential outliers (잠재적 이상치군에 대한 검정)

  • Seo, Han Son
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.159-167 / 2017
  • Observations identified as potential outliers are usually tested for real outliers; however, some outlier detection methods skip a formal test or perform a test using simulated p-values. We introduce test procedures for outliers by testing subsets of potential outliers rather than by testing individual observations of potential outliers to avoid masking or swamping effects. Examples to illustrate methods and a Monte Carlo study to compare the power of the various methods are presented.

Estimation of Design Rainfall Using 3 Parameter Probability Distributions (3변수 확률분포에 의한 설계강우량 추정)

  • Lee, Soon Hyuk;Maeng, Sung Jin;Ryoo, Kyong Sik
    • Proceedings of the Korea Water Resources Association Conference / 2004.05b / pp.595-598 / 2004
  • This research seeks to derive design rainfalls using L-moments, with tests of the homogeneity, independence, and outliers of the annual maximum daily rainfall data at 38 rainfall stations in Korea. To select the appropriate distribution for the annual maximum daily rainfall data at each station, the Generalized Extreme Value (GEV), Generalized Logistic (GLO), Generalized Pareto (GPA), Generalized Normal (GNO), and Pearson Type 3 (PT3) probability distributions were applied, and their aptness was judged using an L-moment ratio diagram and the Kolmogorov-Smirnov (K-S) test. Parameters of the appropriate distributions were estimated from the observed annual maximum daily rainfall and from rainfall simulated using Monte Carlo techniques. Design rainfalls were finally derived from the GEV distribution, which proved to be more appropriate than the other distributions.
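
The L-moment ratios plotted on an L-moment ratio diagram can be computed from a sample as sketched below, using the standard unbiased probability-weighted-moment estimators. The function name and the rainfall values are hypothetical, and fitting the candidate distributions (GEV, GLO, GPA, GNO, PT3) is not shown.

```python
def sample_l_moments(data):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis,
    from unbiased probability-weighted moments b0..b3."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")
    b0 = b1 = b2 = b3 = 0.0
    for i, xi in enumerate(x):  # i is the 0-based rank in the sorted sample
        b0 += xi
        b1 += xi * i / (n - 1)
        b2 += xi * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += xi * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = (b / n for b in (b0, b1, b2, b3))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2

# Hypothetical annual maximum daily rainfall (mm) at one station.
rainfall = [112.0, 95.5, 140.2, 88.0, 201.3, 123.4, 99.1, 156.8, 107.6, 131.0]
l1, l2, t3, t4 = sample_l_moments(rainfall)
print(f"L-CV = {l2 / l1:.3f}, L-skewness = {t3:.3f}, L-kurtosis = {t4:.3f}")
```

Each station contributes one (L-skewness, L-kurtosis) point to the diagram, and a candidate distribution is judged by how close its theoretical curve passes to those points.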

Frequency Analysis of Extreme Rainfall Using 3 Parameter Probability Distributions (3변수 확률분포형에 의한 극치강우의 빈도분석)

  • Kim, Byeong-Jun;Maeng, Sung-Jin;Ryoo, Kyong-Sik;Lee, Soon-Hyuk
    • Journal of The Korean Society of Agricultural Engineers / v.46 no.3 / pp.31-42 / 2004
  • This research seeks to derive design rainfalls using L-moments, with tests of the homogeneity, independence, and outliers of the annual maximum daily rainfall data at 38 rainfall stations in Korea. To select the appropriate distribution for the annual maximum daily rainfall data at each station, the Generalized Extreme Value (GEV), Generalized Logistic (GLO), Generalized Pareto (GPA), Generalized Normal (GNO), and Pearson Type 3 (PT3) probability distributions were applied, and their aptness was judged using an L-moment ratio diagram and the Kolmogorov-Smirnov (K-S) test. Parameters of the appropriate distributions were estimated from the observed annual maximum daily rainfall and from rainfall simulated using Monte Carlo techniques. Design rainfalls were finally derived from the GEV distribution, which proved to be more appropriate than the other distributions.

A Study on Applications of Regression Diagnostic Method to Technometrics, and the Statistical Quality Control

  • Kim, Soon-Kwi
    • Journal of Korean Society for Quality Management / v.21 no.1 / pp.55-64 / 1993
  • This article is concerned with procedures for detecting one or more outliers or influential observations in a linear regression model. A test procedure based on recursive residuals is proposed and developed. The power of the test procedure to identify one or more outliers is investigated through simulation, as is its dependence on the number and configuration of the outliers.
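
Recursive residuals themselves follow a standard definition: each observation is predicted from a fit to the preceding observations only, and the prediction error is scaled by its leverage. The sketch below computes them for a simple linear regression with intercept; the paper's test procedure built on these residuals is not reproduced, and the data are hypothetical.

```python
import math

def recursive_residuals(x, y):
    """Recursive residuals for y = a + b*x.

    For each r >= 2 (0-based), fit least squares to the first r points,
    predict point r, and scale the error by sqrt(1 + leverage).
    The first two x values must differ so the initial fit is defined.
    """
    residuals = []
    for r in range(2, len(x)):
        xs, ys = x[:r], y[:r]
        x_bar = sum(xs) / r
        y_bar = sum(ys) / r
        sxx = sum((u - x_bar) ** 2 for u in xs)
        sxy = sum((u - x_bar) * (v - y_bar) for u, v in zip(xs, ys))
        slope = sxy / sxx
        intercept = y_bar - slope * x_bar
        prediction = intercept + slope * x[r]
        leverage = 1.0 / r + (x[r] - x_bar) ** 2 / sxx
        residuals.append((y[r] - prediction) / math.sqrt(1.0 + leverage))
    return residuals

# Hypothetical data on the line y = 2x + 1, with one shifted observation.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 14.0, 11.0]  # y[4] is 5 units too high

w = recursive_residuals(x, y)
print([round(v, 3) for v in w])
```

Because each residual uses only earlier observations, an outlier shows up as a large value at its own position before it has had a chance to distort the fit, although it does affect the residuals that follow it.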