• Title/Summary/Keyword: outlier's test

Method of Processing the Outliers and Missing Values of Field Data to Improve RAM Analysis Accuracy (RAM 분석 정확도 향상을 위한 야전운용 데이터의 이상값과 결측값 처리 방안)

  • Kim, In Seok;Jung, Won
    • Journal of Applied Reliability / v.17 no.3 / pp.264-271 / 2017
  • Purpose: Field operation data contain missing values and outliers arising from various causes in the data collection process, so caution is required when using RAM analysis results based on such data. The purpose of this study is to present a method that minimizes RAM analysis error for field data and thereby improves accuracy. Methods: Statistical methods are presented for processing the outliers and missing values of field operation data, and the RAM analysis results before and after applying the methods are compared. Results: After processing, the estimated availability is 6.8 to 23.5% lower than before processing, indicating that the treatment of missing values and outliers greatly affects the RAM analysis result. Conclusion: RAM analysis of the OO weapon system was performed, and suggestions for improving RAM analysis were presented through a comparison of the new and current methods. Analyzing data without appropriate treatment of erroneous values may produce incorrect conclusions that lead to inappropriate decisions and actions.

Probabilistic Distribution and Variability of Geotechnical Properties with Randomness Characteristic (무작위성을 보이는 지반정수의 확률분포 및 변동성)

  • Kim, Dong-Hee;Lee, Ju-Hyoung;Lee, Woo-Jin
    • Journal of the Korean Geotechnical Society / v.25 no.11 / pp.87-103 / 2009
  • To determine a reliable probabilistic distribution model of geotechnical properties, the following steps have to be performed in sequence: outlier and randomness tests on the analysis data, parameter estimation for the probabilistic distribution model, and goodness-of-fit tests for the model parameters and the distribution model. In this paper, probabilistic distribution models of the geotechnical properties of the Songdo area in Incheon are estimated by this procedure. The coefficient of variation (COV), which represents the variability of geotechnical properties, is also determined for several properties. Reliable probabilistic distribution models and COVs of geotechnical properties can be used in probability-based design procedures and for a reasonable choice of design values in deterministic design methods.
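
The coefficient of variation mentioned in this abstract is the standard deviation divided by the mean. The following is a minimal sketch of that calculation only. The function name and the sample values are hypothetical and are not taken from the paper, and the use of the sample (n-1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("COV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical property values, for illustration only (not data from the paper).
samples = [18.0, 20.0, 22.0]
cov = coefficient_of_variation(samples)
print(f"COV = {cov:.3f}")
```

A dimensionless COV makes the variability of properties with different units comparable, which is presumably why it is reported per property.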

Evaluation of Major Taper Equation Models for Developing a Stem Volume Table of Cryptomeria japonica in Jeju Island (제주도 삼나무 수간재적표 개발을 위한 주요 수간곡선식 비교)

  • Hyun-Soo, Kim;Su-Young, Jung;Kwang-Soo, Lee
    • Journal of Environmental Science International / v.31 no.11 / pp.941-950 / 2022
  • This study was conducted to provide data and stem information for establishing a local volume table of Cryptomeria japonica in Jeju Island. Stem analysis was performed on 26 trees, selected as two average trees from each of 13 plots of C. japonica stands, in 2021 and 2022. During the analysis stage, one outlier tree was rejected, and a total of 260 observations of specific stem heights from 25 trees were used. Of the seven major taper equation models applied for parameter estimation and statistical verification, the Muhairwe 1999 model showed the best fit and was selected as the optimal model. Stem shape-related estimates were obtained from the selected model, and sectional measurements according to the Smalian formula, applied at intervals of 10 cm along the height of the stem, were used to develop a volume table. A paired t-test comparing the C. japonica volumes obtained in the present study with those from the current yield table by NIFoS (2020) revealed significant differences (p<0.05), highlighting the necessity of a local volume table for C. japonica in Jeju Island.
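
Smalian's formula estimates the volume of a stem section as its length times the average of the cross-sectional areas at its two ends. The sketch below sums such sections for diameters taken at a fixed interval (the abstract mentions 10 cm). The function names and the diameters are hypothetical, and the sketch ignores any special treatment of the stem tip, which the abstract does not describe.

```python
import math


def cross_section_area(diameter_m):
    """Area of a circular cross-section with the given diameter, in square metres."""
    return math.pi * diameter_m ** 2 / 4.0


def smalian_volume(diameters_m, section_length_m):
    """Sum Smalian section volumes for diameters measured at equal spacing.

    Each section volume is length * (lower area + upper area) / 2.
    """
    total = 0.0
    for d_low, d_up in zip(diameters_m, diameters_m[1:]):
        total += section_length_m * (
            cross_section_area(d_low) + cross_section_area(d_up)
        ) / 2.0
    return total


# Hypothetical diameters (m) at 0.10 m spacing, for illustration only.
diameters = [0.30, 0.29, 0.28, 0.26]
print(f"volume = {smalian_volume(diameters, 0.10):.5f} m^3")
```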

Analysis on Characteristics of Variation in Flood Flow by Changing Order of Probability Weighted Moments (확률가중모멘트의 차수 변화에 따른 홍수량 변동 특성 분석)

  • Maeng, Seung-Jin;Hwang, Ju-Ha
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.5 / pp.1009-1019 / 2009
  • In this research, the characteristics of South Korea's design floods were examined by deriving appropriate design floods from observed flood data in selected main watersheds of the nation. Nineteen watersheds in Korea were selected. The characteristics of annual rainfall were analyzed using a moving average method. It was decided to perform the frequency analysis on the annual maximum floods, with each succeeding year used as a reference year. For the 19 watersheds, basic statistics were calculated and tests of independence, homogeneity, and outliers were performed for each period of the annual maximum flood series. Based on the LH-moment ratio diagram and the Kolmogorov-Smirnov (K-S) test, the GEV distribution was found to be adequate among the applied Gumbel (GUM), Generalized Extreme Value (GEV), Generalized Logistic (GLO), and Generalized Pareto (GPA) distributions. Parameters of the GEV distribution were estimated by the L-, L1-, L2-, L3-, and L4-moment methods, which correspond to changing the order of the probability weighted moments. Design floods for each watershed and each period of the annual maximum flood series were derived from the GEV distribution. From the variation rates used in this research, it was concluded that the point at which design conditions should change, so that hydraulic structures properly reflect the nation's recent climate changes brought about by global warming, is around the year 2002.
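
As background for the moment methods named in this abstract, the sketch below computes unbiased sample probability weighted moments b_r and the first three ordinary L-moments from them. It covers only the plain L-moment case. The higher-order L1 to L4 variants and the GEV parameter fitting used in the paper are not reproduced here, and the function names and flood values are hypothetical.

```python
def sample_pwm(data, r):
    """Unbiased sample probability weighted moment b_r of the sorted data."""
    x = sorted(data)
    n = len(x)
    if n <= r:
        raise ValueError("need more than r observations")
    total = 0.0
    for i, xi in enumerate(x, start=1):
        # Weight (i-1)(i-2)...(i-r) / ((n-1)(n-2)...(n-r)); equals 1 when r = 0.
        weight = 1.0
        for k in range(1, r + 1):
            weight *= (i - k) / (n - k)
        total += weight * xi
    return total / n


def l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3)."""
    b0, b1, b2 = (sample_pwm(data, r) for r in range(3))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3


# Hypothetical annual maximum floods (m^3/s), for illustration only.
floods = [410.0, 520.0, 335.0, 780.0, 605.0, 450.0]
print(l_moments(floods))
```

Here l1 is the sample mean, l2 measures scale, and the ratio l3 / l2 is the L-skewness plotted on ordinary L-moment ratio diagrams.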

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce provides customers with many advantageous purchase opportunities. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list to the user. Thus, the product recommender system in an online shopping store is known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect the user's preferences cause disappointment and waste the user's time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user's preferences precisely. The research data were collected from a real-world online shopping store that deals in products from famous art galleries and museums in Korea. The data initially contained 5759 transactions, of which 3167 remained after deletion of null data. We transform the categorical variables into dummy variables and exclude outlier data. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. We perform these data mining techniques using SAS E-Miner software. We partition the datasets into two sets, modeling and validation, for the logistic regression and decision trees, and into three sets, training, test, and validation, for the artificial neural network model. The validation dataset is the same for all experiments. 
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is the abbreviation of "Bootstrap Aggregation"; it combines outputs from several machine learning techniques to raise the performance and stability of prediction or classification. This technique is a special form of the averaging method. Bumping is the abbreviation of "Bootstrap Umbrella of Model Parameter," and it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group. For the "Poster" product group, the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support associations to 5%, the maximum number of items in an association to 4, and the minimum confidence for rule generation to 10%. This study also excludes extracted association rules with a lift value below 1. We finally obtain fifteen association rules by excluding duplicate rules. Among the fifteen association rules, eleven contain associations between products in the "Office Supplies" product group, one includes the association between the "Office Supplies" and "Fashion" product groups, and the other three contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access. 
In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The survey participants are 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction using a five-point Likert scale. This study also performs a "Paired Sample T-test" on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also show that the proposed system may be useful in a real-world online shopping store.
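
The Support, Confidence, and Lift measures used in the market basket analysis step of this abstract can be illustrated with a short sketch. The thresholds (5% support, 10% confidence, lift of at least 1) come from the abstract. The function names and the baskets are hypothetical, and the authors used SAS E-Miner rather than code like this.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_c = sum(1 for t in transactions if c <= set(t))
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def passes_thresholds(support, confidence, lift,
                      min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds stated in the abstract."""
    return support >= min_support and confidence >= min_confidence and lift >= min_lift


# Hypothetical baskets, for illustration only.
baskets = [{"pen", "notebook"}, {"pen", "notebook"}, {"pen"}, {"scarf"}]
s, conf, lift = rule_metrics(baskets, {"pen"}, {"notebook"})
print(s, conf, lift, passes_thresholds(s, conf, lift))
```

A lift above 1 means the two item sets are bought together more often than independence would predict, which is why rules below that value are discarded.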

Establishment of National Quality Control System for Analytical Laboratory of Pesticide Products by Proficiency Testing (농약 이화학시험 분석기관의 숙련도시험을 통한 정도관리체계 확립 연구)

  • Chang, Hee-Ra;Park, Hyo-Kyung;Lim, Youngjoo;Kim, Kwang-Ho;Kim, Chan Sub;Kim, Kyun
    • The Korean Journal of Pesticide Science / v.16 no.4 / pp.350-356 / 2012
  • Proficiency testing and the validation of analytical methods are included in quality assurance schemes for analytical chemistry laboratories to monitor a laboratory's performance and to produce consistently reliable data. This study assessed the applicability of a proficiency testing scheme proposed for domestic analytical laboratories of pesticide products. The validation of the analytical methods, and the stability and homogeneity of the formulated pesticide products (emulsifiable concentrate) of emamectin benzoate and lufenuron, were confirmed for the proficiency testing. Among the z-scores of the 33 participating laboratories for emamectin benzoate, 2 laboratories (6.0%) were outliers, 2 laboratories had z-scores outside the range from -3 to 3, designated "unacceptable", and 3 laboratories (9.0%) had z-scores in the ranges -3 to -2 and 2 to 3, designated "questionable". For lufenuron, three laboratories (9.0%) had z-scores designated "questionable". Additional proficiency testing for various product types will be needed to establish the quality control scheme.
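
The z-score bands quoted in this abstract can be sketched as a small classifier. The "questionable" and "unacceptable" ranges follow the abstract. The "satisfactory" label for |z| up to 2, the handling of the exact boundary values, and the function names are assumptions, and the abstract does not say how the assigned value or the standard deviation were obtained.

```python
def z_score(result, assigned_value, sigma):
    """z = (laboratory result - assigned value) / standard deviation for proficiency."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (result - assigned_value) / sigma


def classify(z):
    """Map a z-score to a performance label."""
    magnitude = abs(z)
    if magnitude > 3:
        return "unacceptable"   # outside the range from -3 to 3
    if magnitude > 2:
        return "questionable"   # between 2 and 3 in absolute value
    return "satisfactory"       # assumed label for the remaining range


# Hypothetical laboratory results, for illustration only.
for result in (10.1, 11.2, 8.2):
    z = z_score(result, assigned_value=10.0, sigma=0.5)
    print(result, round(z, 2), classify(z))
```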