• Title/Summary/Keyword: Normal distribution

Skew Normal Boxplot and Outliers

  • Huh, Myung-Hoe;Lee, Yong-Goo
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.4
    • /
    • pp.591-595
    • /
    • 2012
  • We frequently use Tukey's boxplot to identify outliers in a batch of observations of a continuous variable. In doing so, we implicitly assume that the underlying distribution belongs to the family of normal distributions. Such a practice is often superficial and improper, since in reality many variables exhibit skewness. In this short paper, we build a modified boxplot and set up an outlier identification procedure by assuming that the observations are generated from the skew normal distribution (Azzalini, 1985), which is an extension of the normal distribution. The statistical performance of the proposed procedure is examined with simulated datasets.
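
For reference, the conventional Tukey rule that this abstract criticizes can be sketched as follows. This is a minimal illustration of the standard fences (Q1 - 1.5 IQR, Q3 + 1.5 IQR), not the authors' skew-normal modification; the function names and the sample data are hypothetical.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    # "inclusive" treats the data as containing its own minimum and maximum.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return the observations that fall outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical right-skewed batch: the symmetric fences flag only the upper tail.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
print(tukey_fences(sample))
print(flag_outliers(sample))
```

Because both fences sit the same multiple of the IQR away from the quartiles, a right-skewed batch tends to have more points flagged in its long tail, which is the behavior a skewness-aware boxplot is meant to correct.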

Estimating Discriminatory Power with Non-normality and a Small Number of Defaults

  • Hong, C.S.;Kim, H.J.;Lee, J.L.
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.5
    • /
    • pp.803-811
    • /
    • 2012
  • For credit evaluation models, we extend the study of discriminatory power based on the AUC obtained from a ROC curve when the number of defaults is small and the distribution functions of the defaults and non-defaults are normal distributions. Since distribution functions do not satisfy normality in the real world, the distribution functions of the defaults and non-defaults are assumed to be normal mixture distributions, based on results showing that a normal mixture can fit non-normal data better than other distribution estimation methods. Using several AUC statistics, the discriminatory power under these circumstances is explored and compared with that under normal distributions.
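
As a point of reference for the normal baseline mentioned above: when the scores of the two groups are independent normal variables, the AUC has the closed form Φ((μ_n − μ_d) / sqrt(σ_n² + σ_d²)). The sketch below assumes that higher scores indicate non-defaults; the function and parameter names are hypothetical, and the paper's normal-mixture extension is not reproduced here.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binormal_auc(mu_nondefault, sigma_nondefault, mu_default, sigma_default):
    """AUC = P(non-default score > default score) for independent normal scores.

    Assumption: higher scores indicate better credit quality (non-default).
    """
    # The score difference is normal with this mean and standard deviation.
    z = (mu_nondefault - mu_default) / math.hypot(sigma_nondefault, sigma_default)
    return normal_cdf(z)

# Identical score distributions give no discriminatory power (AUC = 0.5).
print(binormal_auc(0.0, 1.0, 0.0, 1.0))
# A one-standard-deviation separation in means raises the AUC above 0.5.
print(binormal_auc(1.0, 1.0, 0.0, 1.0))
```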

Reliability Analysis of the Non-normal Probability Problem for Limited Area using Convolution Technique (컨볼루션 기법을 이용한 영역이 제한된 비정규 확률문제의 신뢰성 해석)

  • Lee, Hyunman;Kim, Taegon;Choi, Won;Suh, Kyo;Lee, JeongJae
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.55 no.5
    • /
    • pp.49-58
    • /
    • 2013
  • Appropriate random variables and probability density functions based on statistical analysis should be defined to carry out reliability analysis. Most studies have focused only on normal distributions or have assumed that variables showing non-normal characteristics follow normal distributions. In this study, the reliability problem with non-normal probability distributions was handled using the convolution method for the case in which the integration domains of the variables are limited to a finite range. The results were compared with the traditional method (linear transformation of the normal distribution) and the Monte Carlo simulation method to verify that the approach was in good agreement with the characteristics of probability density functions with peaked shapes. However, the reproducibility was slightly reduced in the tails of the density function.

A Study on the Normal Values of Lead Exposure Indices (연폭로 지표들의 정상치에 관한 연구)

  • Shin, Hai-Rim;Kim, Joon-Youn
    • Journal of Preventive Medicine and Public Health
    • /
    • v.19 no.2 s.20
    • /
    • pp.167-176
    • /
    • 1986
  • To determine the normal values of several parameters relevant to lead exposure, a study was carried out from April 1 to June 30, 1986 on 258 healthy Korean adults with no apparent lead exposure. The lead indices examined were blood lead (PbB), hemoglobin (Hb), zinc protoporphyrin in blood (ZPP), delta-aminolevulinic acid dehydratase (ALAD) activity in blood, coproporphyrin in urine (CPU), and delta-aminolevulinic acid in urine (ALAU). 1) The mean value of PbB was $17.17 \pm 7.87\ \mu\mathrm{g}/100\,\mathrm{ml}$, and there was no statistically significant difference by age or sex. The distribution of PbB fitted the log-normal distribution ($\chi^2=7.38$, $p>0.1$). 2) The mean value of Hb in males ($15.17 \pm 1.56\ \mathrm{g}/100\,\mathrm{ml}$) was higher than in females ($13.22 \pm 1.51\ \mathrm{g}/100\,\mathrm{ml}$) ($p<0.01$). The distribution of Hb fitted the normal distribution ($\chi^2=9.40$, $p>0.1$). 3) The mean value of ZPP was $32.61 \pm 8.78\ \mu\mathrm{g}/100\,\mathrm{ml}$, and there was no statistically significant difference by age or sex. The distribution of ZPP fitted the normal distribution ($\chi^2=13.93$, $p>0.05$). The correlations of ZPP with ALAD ($r=-0.229$) and with CPU ($r=0.183$) were both statistically significant. 4) The mean value of ALAD was $30.20 \pm 10.96\ \mu\mathrm{mol}$ ALA/min/L of R.B.C., and there was no statistically significant difference by age or sex. The distribution of ALAD activity did not fit the normal distribution. The correlation between ALAD and PbB ($r=-0.219$) was statistically significant. 5) The mean value of CPU was $36.10 \pm 24.54\ \mu\mathrm{g}/\mathrm{L}$, and there was no statistically significant difference by age or sex. The distribution of CPU did not fit the normal distribution. The correlations of CPU with PbB ($r=0.185$) and with ZPP ($r=0.183$) were both statistically significant. 6) The mean value of ALAU was $1.94 \pm 0.96\ \mathrm{mg}/\mathrm{L}$, and there was no statistically significant difference by age or sex. The distribution of ALAU fitted the normal distribution ($\chi^2=9.76$, $p>0.1$).

A Robust Process Capability Index based on EDF Expected Loss (EDF 기대손실에 기초한 로버스트 공정능력지수)

  • 임태진;송현석
    • Journal of Korean Society for Quality Management
    • /
    • v.31 no.1
    • /
    • pp.109-122
    • /
    • 2003
  • This paper presents a robust process capability index (PCI) based on the expected loss derived from the empirical distribution function (EDF). We propose the EDF expected loss in order to develop a PCI that does not depend on the underlying process distribution. The EDF expected loss depends only on the sample data, so the PCI based on it is robust and does not require complex calculations. The inverted normal loss function (INLF) is employed to overcome the drawback of the quadratic loss, which may increase without bound outside the specification limits. A comprehensive simulation study was performed under various process distributions to compare the accuracy and precision of the proposed PCI with those of the PCI based on the expected loss derived from the normal distribution. The proposed PCI turned out to be more accurate than the normal PCI in most cases, especially when the process distribution has high kurtosis or skewness. It is expected that the proposed PCI can be utilized in real processes where the true distribution family may not be known.
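
The idea of an EDF expected loss can be illustrated with a short sketch. It assumes the commonly cited form of the inverted normal loss, L(y) = K [1 − exp(−(y − T)² / (2γ²))], and takes the EDF expected loss to be the plain sample average of that loss. The abstract gives neither the paper's exact parameterization nor the mapping from expected loss to a PCI, so all names and parameters below are hypothetical.

```python
import math

def inverted_normal_loss(y, target, gamma, max_loss=1.0):
    """Bounded loss: 0 at the target, approaching max_loss far from it.

    Assumed form: max_loss * (1 - exp(-(y - target)^2 / (2 * gamma^2))).
    """
    return max_loss * (1.0 - math.exp(-((y - target) ** 2) / (2.0 * gamma ** 2)))

def edf_expected_loss(sample, target, gamma, max_loss=1.0):
    """Expected loss under the empirical distribution: the sample mean of the loss."""
    return sum(inverted_normal_loss(y, target, gamma, max_loss) for y in sample) / len(sample)

# Hypothetical measurements around a target of 10.0 with shape parameter gamma = 2.0.
measurements = [9.6, 10.1, 10.4, 9.9, 12.5]
print(edf_expected_loss(measurements, target=10.0, gamma=2.0))
```

Unlike the quadratic loss, this loss never exceeds `max_loss`, so a single extreme observation cannot dominate the average.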

Voice Activity Detection employing the Generalized Normal-Laplace Distribution (일반화된 정규-라플라스 분포를 이용한 음성검출기)

  • Kim, Sang-Kyun;Kwon, Jang-Woo;Lee, Sangmin
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.3
    • /
    • pp.294-299
    • /
    • 2014
  • In this paper, we propose a novel algorithm, based on the generalized normal-Laplace (GNL) distribution, to improve the performance of voice activity detection (VAD). In our algorithm, the probability density function (PDF) of the noisy speech signal is represented by the GNL distribution, and the variances of the speech and noise under the GNL distribution are estimated using higher-order moments. Experimental results show that the proposed algorithm yields better results than conventional VAD algorithms.

On the maximum likelihood estimation for a normal distribution under random censoring

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.6
    • /
    • pp.647-658
    • /
    • 2018
  • In this paper, we study statistical inference for the maximum likelihood estimation of a normal distribution when the data are randomly censored. Likelihood equations are derived assuming that the censoring distribution does not involve any parameters of interest. The maximum likelihood estimators (MLEs) of the censored normal distribution do not have an explicit form, so the likelihood equations must be solved iteratively. We consider a simple method to derive an explicit form of the approximate MLEs with no iterations by expanding the nonlinear parts of the likelihood equations in Taylor series around some suitable points. The points are closely related to Kaplan-Meier estimators. By using the same method, the observed Fisher information is also approximated to obtain asymptotic variances of the estimators. An illustrative example is presented, and a simulation study is conducted to compare the performances of the estimators. In addition to having an explicit form, the approximate MLEs are as efficient as the MLEs in terms of variance.

Improve the Performance of Semi-Supervised Side-channel Analysis Using HWFilter Method

  • Hong Zhang;Lang Li;Di Li
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.3
    • /
    • pp.738-754
    • /
    • 2024
  • Side-channel analysis (SCA) is a cryptanalytic technique that exploits physical leakages, such as power consumption or electromagnetic emanations, from cryptographic devices to extract secret keys used in cryptographic algorithms. Recent studies have shown that training SCA models with semi-supervised learning can effectively overcome the problem of few labeled power traces. However, the process of training SCA models using semi-supervised learning generates many pseudo-labels. The performance of the SCA model can be reduced by some of these pseudo-labels. To solve this issue, we propose the HWFilter method to improve semi-supervised SCA. This method uses a Hamming Weight Pseudo-label Filter (HWPF) to filter the pseudo-labels generated by the semi-supervised SCA model, which enhances the model's performance. Furthermore, we introduce a normal distribution method for constructing the HWPF. In the normal distribution method, the Hamming weights (HWs) of power traces can be obtained from the normal distribution of power points. These HWs are filtered and combined into a HWPF. The HWFilter was tested using the ASCADv1 database and the AES_HD dataset. The experimental results demonstrate that the HWFilter method can significantly enhance the performance of semi-supervised SCA models. In the ASCADv1 database, the model with HWFilter requires only 33 power traces to recover the key. In the AES_HD dataset, the model with HWFilter outperforms the current best semi-supervised SCA model by 12%.

A Study on the Analysis of Traffic Distribution and Traffic Pattern on Traffic Route using ND-K-S (ND-K-S를 적용한 항로 통항분포와 통항패턴 분석에 관한 연구)

  • Kim, Jong-Kwan
    • Journal of Navigation and Port Research
    • /
    • v.42 no.6
    • /
    • pp.446-452
    • /
    • 2018
  • A traffic route is an area with a high risk of accidents because of the flow of heavy traffic. Despite this concern, most traffic-related studies focus solely on traffic distribution, so studies investigating the characteristics of ships' routes and traffic patterns are needed. In this study, the traffic distribution and pattern in 3 major traffic routes were analyzed over 3 days. Based on the prevailing traffic conditions, the route was divided into 10 gate lines, and the ships passing through the lines were classified as small, medium, or large. The ND-K-S (normal distribution, kurtosis, and skewness) test was carried out on the traffic distribution at each gate line of each traffic route. The ND test showed that large vessels follow a normal distribution, medium-sized vessels follow a normal distribution on one-way routes only, and small vessels do not follow a normal distribution. According to the K-S test, the normal traffic pattern differs significantly between two-way and one-way routes. The K test shows that on a one-way route vessels use a wide range of the route, whereas on a two-way route they concentrate on one side of the route. The S test shows that on a one-way route vessels follow a normal traffic pattern about the center line, whereas on a two-way route they are shifted to the right side of the route. Because this study was carried out in only 3 ports, further investigation across various routes and conditions is needed in future studies.
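
The kurtosis and skewness components of such an analysis can be illustrated with moment-based estimates. The abstract does not describe the exact ND-K-S procedure, so this is only a sketch of the standard sample skewness and excess kurtosis, applied to hypothetical lateral offsets of vessels from the route center line.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    Uses population (biased) moments: m3 / m2^1.5 and m4 / m2^2 - 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical offsets (in meters) from the center line at one gate line;
# positive values are to the right of the center line.
offsets = [-40, -10, 0, 15, 20, 35, 60, 120]
skew, kurt = skewness_and_excess_kurtosis(offsets)
print(skew, kurt)
```

Under this sign convention, positive skewness indicates a longer tail on the right side of the route, and negative excess kurtosis indicates a flatter, more spread-out distribution than the normal.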

Study on Statistical Distributions for the Mechanical Properties of Thinning Crop-Trees from Pinus koraiensis (잣나무 간벌재(間伐材)의 기계적(機械的) 성질(性質)에 대(對)한 이론적(理論的) 통계(統計) 분포(分布) 연구(硏究))

  • Cha, Jae-Kyung
    • Journal of the Korean Wood Science and Technology
    • /
    • v.21 no.4
    • /
    • pp.55-59
    • /
    • 1993
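  • Thinned Korean pine (Pinus koraiensis) lumber, harvested mainly in the mid-western region of Korea, was randomly sampled and purchased from a sawmill in Gwangju, Gyeonggi-do. Bending strength tests were conducted according to the standard test method. For the Young's modulus and bending strength measured on each clear (defect-free) specimen, the theoretical statistical distributions (normal, log-normal, and Weibull) were calculated and compared. The Weibull distribution fit both the bending Young's modulus and the bending strength, and the log-normal distribution was suitable for describing the Young's modulus. For the bending strength, the normal distribution fit better than the log-normal distribution.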
