• Title/Summary/Keyword: kolmogorov-smirnov test

Regional Frequency Analysis for Rainfall using L-Moment (L-모멘트법에 의한 강우의 지역빈도분석)

  • Koh, Deuk-Koo;Choo, Tai-Ho;Maeng, Seung-Jin;Trivedi, Chanda
    • The Journal of the Korea Contents Association / v.8 no.3 / pp.252-263 / 2008
  • This study was conducted to derive the optimal regionalization of precipitation data, classified into climatologically and geographically homogeneous regions covering all of Korea except Cheju and Ulreung islands. A total of 65 rain gauges were used for the regional analysis of precipitation. Annual maximum series for the consecutive durations of 1, 3, 6, 12, 24, 36, 48 and 72 hr were used for various statistical analyses. The K-means clustering method was used to identify homogeneous regions, and five homogeneous regions for precipitation were classified. Using the L-moment ratios and the Kolmogorov-Smirnov test, the underlying regional probability distribution was identified to be the generalized extreme value (GEV) distribution among the applied distributions. The regional and at-site parameters of the GEV distribution were estimated by L-moments, which are linear combinations of the probability weighted moments. The regional and at-site analyses for the design rainfall were tested by Monte Carlo simulation. Relative root-mean-square error (RRMSE), relative bias (RBIAS) and relative reduction (RR) in RRMSE were computed and compared with those resulting from at-site Monte Carlo simulation. All show that the regional analysis procedure can substantially reduce the RRMSE, RBIAS and RR in RRMSE in the prediction of design rainfall. Consequently, optimal design rainfalls following the regions and consecutive durations were derived by the regional frequency analysis.
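
The abstract above estimates distribution parameters from L-moments built out of probability weighted moments (PWMs). As a rough illustration only, and not the authors' code, the first three sample L-moments and the L-moment ratios could be computed as follows. The function names and the sample values are hypothetical; the unbiased PWM estimators used here are the standard textbook ones.

```python
def sample_l_moments(values):
    """Return (l1, l2, l3) estimated from unbiased probability weighted moments."""
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    # Unbiased PWM estimators b0, b1, b2 (the rank i is 0-based here).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    # L-moments are linear combinations of the PWMs.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3


def l_moment_ratios(values):
    """Return (L-CV, L-skewness) = (l2 / l1, l3 / l2)."""
    l1, l2, l3 = sample_l_moments(values)
    return l2 / l1, l3 / l2


# Hypothetical annual maximum values, for illustration only.
annual_max = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_l_moments(annual_max))
print(l_moment_ratios(annual_max))
```

For a symmetric sample such as the one above, the third L-moment and therefore the L-skewness are zero, which is a quick sanity check on the estimators.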

Median Control Chart for Nonnormally Distributed Processes (비정규분포공정에서 매디안특수관리도의 모형설계와 적용연구)

  • 신용백
    • Journal of the Korean Professional Engineers Association / v.20 no.3 / pp.15-25 / 1987
  • Statistical control charts are useful tools to monitor and control manufacturing processes and are widely used in most Korean industries. Many Korean companies, however, do not always obtain the desired results from the traditional Shewhart control charts such as the $\bar{X}$-chart. This is partly because the quality characteristics of the process are not distributed normally but are skewed due to intermittent production, small lot sizes, etc. In the Shewhart $\bar{X}$-chart, which is the most widely used one in Korea, such skewed distributions cause the plots to lie below or above the central line or outside the control limits although no assignable causes can be found. To overcome such shortcomings in nonnormally distributed processes, a distribution-free type of confidence interval can be used, which should be based on order statistics. This thesis is concerned with the design of a control chart based on the sample median, which is easy to use in practical situations and whose properties for nonnormal distributions may be easily analyzed. Control limits and central lines are given for the more famous nonnormal distributions, such as the Gamma, Beta, Lognormal, Weibull, Pareto, and Truncated-normal distributions. The robustness of the proposed median control chart is compared with that of the $\bar{X}$-chart; the former tends to be superior to the latter as the probability distribution of the process becomes more skewed. The average run length to detect the assignable cause is also compared when the process has a Normal or a Gamma distribution, for which the properties of X are easy to verify; the proposed chart is slightly worse than the $\bar{X}$-chart for normally distributed products but much better for Gamma-distributed products. Average run lengths for the other distributions are also computed. To use the proposed control chart, the probability distribution of the process should be known or estimated. If that is not possible, the results of the robustness comparison lead us to use the proposed median control chart based on a normal distribution. To estimate the distribution of the process, Sturges' formula is used to graph the histogram, and the methods of probability plotting, the $\chi^2$ goodness-of-fit test and the Kolmogorov-Smirnov test are discussed with real case examples. A comparison of the proposed median chart and the $\bar{X}$-chart was also performed with these examples, and the median chart turned out to be superior to the $\bar{X}$-chart.
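
The abstract mentions using Sturges' formula to choose the histogram classes before estimating the process distribution. A minimal sketch of that step is given below, assuming the common form k = 1 + log2(n) rounded up and equal-width classes. The function names are hypothetical and the thesis may use a different rounding rule.

```python
import math


def sturges_bins(n):
    """Number of histogram classes from Sturges' formula, k = 1 + log2(n), rounded up."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(1 + math.log2(n))


def histogram_counts(values):
    """Count observations in equal-width classes chosen by Sturges' formula."""
    k = sturges_bins(len(values))
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against all-equal data
    counts = [0] * k
    for v in values:
        # The maximum value is assigned to the last class.
        idx = min(int((v - lo) / width), k - 1)
        counts[idx] += 1
    return counts


print(sturges_bins(100))
print(histogram_counts(list(range(8))))
```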

Effect of frontal facial type and sex on preferred chin projection

  • Choi, Jin-Young;Kim, Taeyun;Kim, Hyung-Mo;Lee, Sang-Hoon;Cho, Il-sik;Baek, Seung-Hak
    • The Korean Journal of Orthodontics / v.47 no.2 / pp.108-117 / 2017
  • Objective: To investigate the effects of frontal facial type (FFT) and sex on preferred chin projection (CP) in three-dimensional (3D) facial images. Methods: Six 3D facial images were acquired using a 3D facial scanner (euryprosopic [Eury-FFT], mesoprosopic [Meso-FFT], and leptoprosopic [Lepto-FFT] for each sex). After normal CP in each 3D facial image was set to $10^{\circ}$ of the facial profile angle (glabella-subnasale-pogonion), CPs were morphed by gradations of $2^{\circ}$ from normal (moderately protrusive [$6^{\circ}$], slightly protrusive [$8^{\circ}$], slightly retrusive [$12^{\circ}$], and moderately retrusive [$14^{\circ}$]). Seventy-five dental students (48 men and 27 women) were asked to rate the CPs ($6^{\circ}$, $8^{\circ}$, $10^{\circ}$, $12^{\circ}$, and $14^{\circ}$) from the most to least preferred in each 3D image. Statistical analyses included the Kolmogorov-Smirnov test, Kruskal-Wallis test, and Bonferroni correction. Results: No significant difference was observed in the distribution of preferred CP in the same FFT between male and female evaluators. In Meso-FFT, the normal CP was the most preferred without any sex difference. However, in Eury-FFT, the slightly protrusive CP was favored in male 3D images, but the normal CP was preferred in female 3D images. In Lepto-FFT, the normal CP was favored in male 3D images, whereas the slightly retrusive CP was favored in female 3D images. The mean preferred CP angle differed significantly according to FFT (Eury-FFT: male, $8.7^{\circ}$, female, $9.9^{\circ}$; Meso-FFT: male, $9.8^{\circ}$, female, $10.7^{\circ}$; Lepto-FFT: male, $10.8^{\circ}$, female, $11.4^{\circ}$; p < 0.001). Conclusions: Our findings might serve as guidelines for setting the preferred CP according to FFT and sex.

A comparative study of established z score models for coronary artery diameters in 181 healthy Korean children

  • Ryu, Kyungguk;Yu, Jeong Jin;Jun, Hyun Ok;Shin, Eun Jung;Heo, Young Hee;Baek, Jae Suk;Kim, Young-Hwue;Ko, Jae-Kon
    • Clinical and Experimental Pediatrics / v.60 no.11 / pp.373-378 / 2017
  • Purpose: The aim of this study was to investigate the statistical properties of four previously developed pediatric coronary artery z score models in healthy Korean children. Methods: The study subjects were 181 healthy Korean children, whose age ranged from 1 month to 15 years. The diameter of each coronary artery was measured using 2-dimensional echocardiography and converted to the z score in the four models (McCrindle, Olivieri, Dallaire, and Japanese model). Descriptive statistical analyses and 1-sample t tests were performed. Results: All calculated z scores had P values of ${\geq}0.050$ using the Kolmogorov-Smirnov test. The one sample t test showed that the mean z scores did not converge to zero except in 1 model, and the mean right coronary artery (RCA) z score was less than zero in all 4 models. The smaller RCA diameter in this study could be associated with the more distal measuring point used to avoid the conal branch. The percentage of subjects with extreme z score values (${\geq}2.0$ and ${\geq}2.5$) for the left main coronary artery (LMCA) seems to be higher in the Dallaire (4.9% and 3.3%) and Japanese models (7.1% and 3.8%). Conclusion: All 4 models showed statistical feasibility of normal distribution. More precise instructions would be needed for the measurement of the RCA. The higher percentage of extreme z scores for the LMCA is compatible with the basic understanding of anatomic variation in the LMCA.
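
The analysis described above checks whether the z scores are compatible with a normal distribution (Kolmogorov-Smirnov test) and whether their mean differs from zero (one-sample t test). The sketch below shows how the two statistics could be computed with the Python standard library only. It assumes the reference distribution is the standard normal with fully specified parameters, uses the asymptotic Kolmogorov series for the p value (rough for small samples), and all function names are hypothetical. It is not the software used in the study.

```python
import math
import statistics


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(values, mu=0.0, sigma=1.0):
    """One-sample KS statistic D against a fully specified normal distribution."""
    x = sorted(values)
    n = len(x)
    d = 0.0
    for i, v in enumerate(x, start=1):
        f = normal_cdf(v, mu, sigma)
        # Compare F with the empirical CDF just after and just before v.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_asymptotic_p(d, n, terms=100):
    """Asymptotic p value from the Kolmogorov series (an approximation)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series is numerically unreliable here; p is essentially 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(values)
    return (statistics.mean(values) - mu0) / (statistics.stdev(values) / math.sqrt(n))


z_scores = [-1.0, 0.0, 1.0]  # hypothetical z scores, for illustration only
d = ks_statistic_normal(z_scores)
print(d, ks_asymptotic_p(d, len(z_scores)), one_sample_t(z_scores))
```

If the mean and standard deviation were instead estimated from the same sample, the plain KS p value would be too large and a Lilliefors-type correction would be the usual remedy.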

Median Control Chart for Nonnormally Distributed Processes (비정규분포공정에서 메디안특수관리도 통용모형설정에 관한 실증적 연구(요약))

  • 신용백
    • Journal of Korean Society of Industrial and Systems Engineering / v.10 no.16 / pp.101-106 / 1987
  • Statistical control charts are useful tools to monitor and control manufacturing processes and are widely used in most Korean industries. Many Korean companies, however, do not always obtain the desired results from the traditional Shewhart control charts such as the $\bar{X}$-chart. This is partly because the quality characteristics of the process are not distributed normally but are skewed due to intermittent production, small lot sizes, etc. In the Shewhart $\bar{X}$-chart, which is the most widely used one in Korea, such skewed distributions cause the plots to lie below or above the central line or outside the control limits although no assignable causes can be found. To overcome such shortcomings in nonnormally distributed processes, a distribution-free type of confidence interval can be used, which should be based on order statistics. This thesis is concerned with the design of a control chart based on the sample median, which is easy to use in practical situations and whose properties for nonnormal distributions may be easily analyzed. Control limits and central lines are given for the more famous nonnormal distributions, such as the Gamma, Beta, Lognormal, Weibull, Pareto, and Truncated-normal distributions. The robustness of the proposed median control chart is compared with that of the $\bar{X}$-chart; the former tends to be superior to the latter as the probability distribution of the process becomes more skewed. The average run length to detect the assignable cause is also compared when the process has a Normal or a Gamma distribution, for which the properties of X are easy to verify; the proposed chart is slightly worse than the $\bar{X}$-chart for normally distributed products but much better for Gamma-distributed products. Average run lengths for the other distributions are also computed. To use the proposed control chart, the probability distribution of the process should be known or estimated. If that is not possible, the results of the robustness comparison lead us to use the proposed median control chart based on a normal distribution. To estimate the distribution of the process, Sturges' formula is used to graph the histogram, and the methods of probability plotting, the $\chi^2$ goodness-of-fit test and the Kolmogorov-Smirnov test are discussed with real case examples. A comparison of the proposed median chart and the $\bar{X}$-chart was also performed with these examples, and the median chart turned out to be superior to the $\bar{X}$-chart.

Quantitative Estimation Method for ML Model Performance Change, Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An;Hoon-Suk Lee;Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model in the business service stage. As a result, the operational department cannot manage the performance of the model effectively. Academically, various studies have been conducted on concept drift detection methods to determine whether the model status is appropriate. The operational department wants to know the performance of the operating model quantitatively, but concept drift detection can only identify the state of the model in relation to the data; it cannot estimate the quantitative performance of the model. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision through the statistics of concept drift. The proposed model induces artificial drift in sampling data extracted from the training data, measures the precision on the sampling data, creates a dataset of drift and precision, and learns from it. Then, the difference between the actual precision and the predicted precision is compared on the test data to correct the error of the performance prediction model. The proposed PPM was applied to two models that can be used in real business, a loan underwriting model and a credit card fraud detection model, and it was confirmed that the precision was effectively predicted.

Stochastic investigation on three-dimensional diffusion of chloride ions in concrete

  • Ye Tian;Yifei Zhu;Guoyi Zhang;Zhonggou Chen;Huiping Feng;Nanguo Jin;Xianyu Jin;Hongxiao Wu;Yinzhe Shao;Yu Liu;Dongming Yan;Zheng Zhou;Shenshan Wang;Zhiqiang Zhang
    • Computers and Concrete / v.32 no.3 / pp.247-261 / 2023
  • Due to the non-uniform distribution of the meso-structure, the diffusion of chloride ions in concrete shows the characteristics of randomness and fuzziness, which leads to the non-uniform distribution of chloride ions and the non-uniform corrosion of steel rebar in concrete. This phenomenon is considered the main reason for the uncertainty in the bearing capacity deterioration of reinforced concrete structures. In order to analyze and predict the durability of reinforced concrete structures in a chloride environment, the random features of chloride ion transport in concrete were studied in this research based on the in situ meso-structure of concrete. Based on X-ray CT technology, the spatial distributions of coarse aggregates and pores were recognized and extracted from a cylindrical concrete specimen. Considering the influence of the ITZ, the in situ meso-structure of the concrete specimen was reconstructed to conduct a numerical simulation of the diffusion of chloride ions in concrete, which was verified through electronic microprobe technology. Then a stochastic study was performed to investigate the distribution of chloride ion concentration in space and time. The research indicates that the influence of coarse aggregate on chloride ion diffusion is the combined action of tortuosity and the ITZ effect. The spatial distribution of coarse aggregates and pores is the main reason for the non-uniform distribution of chloride ions on both spatial and time scales. The chloride ion concentration at a certain time and the time to reach a certain concentration both satisfy the Lognormal distribution, as accepted by the Kolmogorov-Smirnov test and the Chi-square test. This research provides an efficient method for obtaining mass stochastic data from limited but representative samples, which lays a solid foundation for the investigation of the service properties of reinforced concrete structures.

Uncertainty Assessment of Emission Factors for Pinus densiflora using Monte Carlo Simulation Technique (몬테 카를로 시뮬레이션을 이용한 소나무 탄소배출계수의 불확도 평가)

  • Pyo, Jung Kee;Son, Yeong Mo;Jang, Gwang Min;Lee, Young Jin
    • Journal of Korean Society of Forest Science / v.102 no.4 / pp.477-483 / 2013
  • The purpose of this study was to calculate the uncertainty of emission factors from collected data and to evaluate the applicability of the Monte Carlo simulation technique. To estimate the distribution of the emission factors (such as basic wood density, biomass expansion factor, and root-to-shoot ratio), four probability density functions (normal, lognormal, gamma, and Weibull) were used. The two-sample Kolmogorov-Smirnov test and cumulative density figures were used to select the optimal probability density function. For Pinus densiflora in the Gangwon region, the basic wood density followed the gamma distribution, the biomass expansion factor the lognormal distribution, and the root-to-shoot ratio the normal distribution; for Pinus densiflora in the central region, the basic wood density followed the normal distribution, the biomass expansion factor the gamma distribution, and the root-to-shoot ratio the gamma distribution. The uncertainty of the emission factors was +62.1% (upper) and -52.6% (lower) for Pinus densiflora in the Gangwon region, and +43.9% (upper) and -34.5% (lower) for Pinus densiflora in the central region.
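
The two-sample Kolmogorov-Smirnov statistic mentioned above is the largest vertical gap between two empirical distribution functions. The following sketch computes it with plain Python. The function name and the sample values are hypothetical, and how the study built the two samples (for example, observed values versus values simulated from a fitted distribution) is not stated in the abstract.

```python
def ks_two_sample(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x so ties are handled together.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted its ECDF equals 1 and the gap can only shrink.
    return d


observed = [0.41, 0.44, 0.47, 0.52]   # hypothetical measurements
simulated = [0.40, 0.45, 0.48, 0.50]  # hypothetical draws from a candidate distribution
print(ks_two_sample(observed, simulated))
```

A smaller statistic indicates closer agreement, so candidate distributions can be ranked by this value when each is compared against the same observed sample.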

A study on collecting representative food samples for the 10th Korean standard foods composition table (국가표준식품성분 데이터베이스 대표시료 선정을 위한 표본설계)

  • Kim, Jinheum;Hwang, Hae-Won;Cho, Yu Jung;Park, Jinwoo
    • The Korean Journal of Applied Statistics / v.33 no.2 / pp.215-228 / 2020
  • Under Article 19, Paragraph 1 of the Food Industry Promotion Act, the Rural Development Administration renews the Korean foods composition table every five years. Ahead of the publication of the tenth revision of the Korean foods composition table in 2021, this paper suggests methods for collecting representative samples of 182 highly consumed foods in Korea. Food markets are categorized by their distribution channels, which are supermarkets and local markets. Eight samples are collected from each category by applying the stratified multi-stage sampling of the National Food and Nutrient Analysis Program (NFNAP). The NFNAP was implemented in 1997 as a collaborative food composition research effort between the National Institutes of Health (NIH) and the US Department of Agriculture (USDA) to secure reliable estimates for the nutrient content of food and beverages consumed by the US population. The supermarkets selected for collecting representative food samples are Emart Kayang, Homeplus Siheung, Lottemart Dongducheon, Emart Suwon, Lottemart Dunsan, Lottemart Yeosu, Emart Ulsan, and Hanaroclub Ulsan. The selected local markets are Doksandongusijang in Geumcheon-gu and Pungnapsijang in Songpa-gu, Seoul, Ilsansijang in Ilsanseo-gu, Goyang, Unamsijang in Buk-gu, Gwangju, Beopdongsijang in Daedeok-gu, Daejeon, Bongnaesijang in Yeongdo-gu and Jwadongjaeraesijang in Haeundae-gu, Busan, and Jungangsijang in Jinhae-gu, Changwon.

Estimation of Drought Rainfall by Regional Frequency Analysis Using L and LH-Moments (II) - On the method of LH-moments - (L 및 LH-모멘트법과 지역빈도분석에 의한 가뭄우량의 추정 (II)- LH-모멘트법을 중심으로 -)

  • Lee, Soon-Hyuk;Yoon, Seong-Soo;Maeng, Sung-Jin;Ryoo, Kyong-Sik;Joo, Ho-Kil;Park, Jin-Seon
    • Journal of The Korean Society of Agricultural Engineers / v.46 no.5 / pp.27-39 / 2004
  • In the first part of this study, five regions that are homogeneous in topographical and geographical aspects, excluding Jeju and Ulreung islands in Korea, were identified by the K-means clustering method. A total of 57 rain gauges were used for the regional frequency analysis with minimum rainfall series for the consecutive durations. The Generalized Extreme Value distribution was confirmed as the optimal one among the applied distributions. Drought rainfalls following the return periods were estimated by at-site and regional frequency analysis using the L-moments method. It was confirmed that the design drought rainfalls estimated by the regional frequency analysis were more appropriate than those estimated by the at-site frequency analysis. In the second part of this study, the LH-moment ratio diagram and the Kolmogorov-Smirnov test were applied to the Gumbel (GUM), Generalized Extreme Value (GEV), Generalized Logistic (GLO) and Generalized Pareto (GPA) distributions to select the optimal probability distribution. Design drought rainfalls were estimated by both at-site and regional frequency analysis using LH-moments and the GEV distribution, which was confirmed as the optimal one among the applied distributions. Design rainfalls were estimated by at-site and regional frequency analysis using LH-moments with the observed data and with simulated data generated by Monte Carlo techniques. Design drought rainfalls derived by regional frequency analysis using the L1, L2, L3 and L4-moments (LH-moments) method have shown higher reliability than those of at-site frequency analysis in view of RRMSE (Relative Root-Mean-Square Error), RBIAS (Relative Bias) and RR (Relative Reduction) for the estimated design drought rainfalls. Relative efficiency was calculated to judge the relative merits and demerits of the design drought rainfalls derived by regional frequency analysis using L-moments and L1, L2, L3 and L4-moments, applied in the first and second reports of this study, respectively. Consequently, design drought rainfalls derived by regional frequency analysis using L-moments were shown to be more reliable than those using LH-moments. Finally, design drought rainfalls for the five classified homogeneous regions following the various consecutive durations were derived by regional frequency analysis using L-moments, which was confirmed as the more reliable method through this study. Maps of the design drought rainfalls for the five classified homogeneous regions following the various consecutive durations were produced by the inverse distance weighting method and Arc-View, a GIS tool.
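
The reliability measures named above (RBIAS, RRMSE and RR) compare simulated quantile estimates with a reference value. The sketch below uses the commonly quoted definitions: the mean relative error, the root of the mean squared relative error, and the fractional drop in RRMSE when moving from at-site to regional analysis. The abstract does not give the formulas, so these definitions, the function names and the numbers are assumptions for illustration.

```python
import math


def rbias(estimates, true_value):
    """Relative bias: mean of (estimate - true) / true over the simulations."""
    return sum((q - true_value) / true_value for q in estimates) / len(estimates)


def rrmse(estimates, true_value):
    """Relative root-mean-square error over the simulations."""
    mean_sq = sum(((q - true_value) / true_value) ** 2 for q in estimates) / len(estimates)
    return math.sqrt(mean_sq)


def relative_reduction(rrmse_at_site, rrmse_regional):
    """Assumed definition of RR: fractional reduction in RRMSE from at-site to regional."""
    return (rrmse_at_site - rrmse_regional) / rrmse_at_site


# Hypothetical Monte Carlo estimates of one design quantile whose reference value is 100.
at_site = [80.0, 120.0]
regional = [90.0, 110.0]
print(rbias(regional, 100.0), rrmse(regional, 100.0))
print(relative_reduction(rrmse(at_site, 100.0), rrmse(regional, 100.0)))
```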