• Title/Summary/Keyword: linear probability models

Mutual Information and Redundancy for Categorical Data

  • Hong, Chong-Sun;Kim, Beom-Jun
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.2
    • /
    • pp.297-307
    • /
    • 2006
  • Most methods for describing the relationship among random variables require specific probability distributions and assumptions about the variables. The mutual information, based on entropy, measures the dependency among random variables without any such assumptions; the redundancy, an analogous version of the mutual information, has also been proposed. In this paper, the redundancy and mutual information are extended to multi-dimensional categorical data. It is found that the redundancy for categorical data can be expressed as a function of the generalized likelihood ratio statistic under several kinds of independence log-linear models, so the redundancy can also be used to analyze contingency tables. Whereas the generalized likelihood ratio statistic for testing the goodness-of-fit of log-linear models is sensitive to the sample size, the redundancy for categorical data does not depend on the sample size but only on the cell probabilities themselves.
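For a two-way contingency table, the link the abstract describes can be made concrete: the likelihood-ratio statistic for independence satisfies G² = 2N·I, where I is the mutual information of the empirical cell probabilities, so G² scales with the sample size N while I does not. A minimal Python sketch (the table counts are hypothetical):

```python
import numpy as np

def mutual_information(table):
    """Mutual information (nats) of the empirical joint distribution
    of a two-way contingency table of counts."""
    p = table / table.sum()
    px = p.sum(axis=1, keepdims=True)   # row marginals
    py = p.sum(axis=0, keepdims=True)   # column marginals
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

table = np.array([[30.0, 10.0], [10.0, 30.0]])
mi = mutual_information(table)
g2 = 2.0 * table.sum() * mi          # likelihood-ratio statistic G^2

# Doubling every cell doubles G^2 but leaves the mutual information unchanged:
mi_doubled = mutual_information(2 * table)
```

Doubling the sample while keeping the cell proportions fixed doubles G², illustrating the sample-size sensitivity that the redundancy avoids.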

Slope Stability Analysis Considering Multi Failure Mode (다중파괴모드를 고려한 사면안정해석)

  • Kim, Hyun-Ki;Kim, Soo-Sam
    • Journal of the Korean Society for Railway
    • /
    • v.14 no.1
    • /
    • pp.24-30
    • /
    • 2011
  • Conventional slope stability analysis focuses on calculating the minimum factor of safety or the maximum probability of failure. To reduce the inherent uncertainty of soil properties and analytical models, and to reflect various analytical models and their failure shapes in slope stability analysis, a method considering the simultaneous failure probability of multiple failure modes was proposed. Linear programming, recently introduced in system reliability analysis, was used to calculate the simultaneous failure probability, allowing system reliability analysis to be executed for various analytical models. An application to an embankment shows that the system stability of the embankment can be calculated quantitatively by this method.
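The linear-programming approach to system reliability mentioned above can be sketched as an LP over the probabilities of the mutually exclusive joint outcomes of the failure modes: given the individual mode probabilities, the LP yields the tightest possible bounds on the system (union) failure probability. The mode probabilities below are hypothetical, and `scipy.optimize.linprog` is assumed available:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

p_mode = [0.05, 0.04, 0.03]     # hypothetical failure-mode probabilities
n = len(p_mode)

# LP variables: probabilities of the 2^n mutually exclusive outcome patterns.
patterns = list(itertools.product([0, 1], repeat=n))

A_eq = [[1.0] * len(patterns)]                      # total probability = 1
b_eq = [1.0]
for i in range(n):                                  # match each marginal P(E_i)
    A_eq.append([float(pat[i]) for pat in patterns])
    b_eq.append(p_mode[i])

# Objective: P(system failure) = P(at least one mode occurs).
c = np.array([1.0 if any(pat) else 0.0 for pat in patterns])

lower = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
upper = -linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
```

With only marginal information the bounds reduce to max pᵢ and min(1, Σpᵢ); adding pairwise joint probabilities as extra equality rows tightens them, which is how the LP method is used in system reliability analysis.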

Rectifying Inspection of Linear Cost Model with a Constraint and a $\alpha$-Optimal Acceptance Sampling (제약조건과 사전확률이 고려된 선형비용모형의 수정검사정책)

  • 이도경;이근희
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.14 no.24
    • /
    • pp.1-5
    • /
    • 1991
  • Various linear cost models have been proposed for determining a sampling plan by attributes. This paper is concerned with such a sampling cost model when the probability that the number of nonconforming items is smaller than the break-even quality level is known. In addition, a constraint based on the AOQL is considered. Under these conditions, an optimal sampling plan which minimizes the average cost per lot is suggested.
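The kind of attributes-sampling cost model the abstract refers to can be sketched by searching for the plan (n, c) that minimizes the expected cost per lot. The cost coefficients, lot size, and process fraction nonconforming below are hypothetical placeholders, not the paper's model:

```python
import math

def accept_prob(n, c, p):
    """P(accept): at most c nonconforming items in a random sample of n."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def avg_cost(n, c, p, inspect=1.0, defect_cost=25.0, reject_cost=60.0, lot=500):
    """Hypothetical linear cost model: inspection cost, cost of defects
    remaining in an accepted lot, and the cost of rejecting (screening) the lot."""
    pa = accept_prob(n, c, p)
    return n * inspect + pa * defect_cost * p * (lot - n) + (1 - pa) * reject_cost

# Grid search for the plan minimizing the average cost per lot at p = 2%:
best_n, best_c = min(((n, c) for n in range(1, 80) for c in range(0, 5)),
                     key=lambda nc: avg_cost(nc[0], nc[1], 0.02))
```

An AOQL constraint, as in the paper, would simply exclude from the grid any plan whose average outgoing quality limit exceeds the specified bound.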

Optimum Design of a Simple Slope considering Multi Failure Mode (다중 파괴모드를 고려한 단순 사면의 최적 설계)

  • Kim, Hyun-Ki;Shin, Min-Ho;Choi, Chan-Yong
    • Journal of the Korean Society of Hazard Mitigation
    • /
    • v.10 no.6
    • /
    • pp.73-80
    • /
    • 2010
  • Conventional slope stability analysis focuses on calculating the minimum factor of safety or the maximum probability of failure. To reduce the inherent uncertainty of soil properties and analytical models, and to reflect various analytical models and their failure shapes in slope stability analysis, a method considering the simultaneous failure probability of multiple failure modes was proposed. Linear programming, recently introduced in system reliability analysis, was used to calculate the simultaneous failure probability, allowing system reliability analysis to be executed for various analytical models. The optimum angle of a simple slope is determined for multiple failure modes using linear programming. Because various failure shapes and modes are considered jointly, it is possible to secure additional safety by using the simultaneous failure probability.

Unbiasedness or Statistical Efficiency: Comparison between One-stage Tobit of MLE and Two-step Tobit of OLS

  • Park, Sun-Young
    • International Journal of Human Ecology
    • /
    • v.4 no.2
    • /
    • pp.77-87
    • /
    • 2003
  • This paper constructs statistical and econometric models on the basis of economic theory in order to discuss statistical efficiency and unbiasedness, including the problem of correcting for sample selection bias. The comparative analytical tools were the one-stage Tobit estimated by maximum likelihood and Heckman's two-step Tobit estimated by ordinary least squares. Regarding the adequacy of the model for analyzing demand and choice, the results showed no substantial difference in explanatory variables between the first-stage selection model and the second-stage linear probability model. Since lambda, the self-selectivity correction factor, is not statistically significant in the Type II Tobit, there is no self-selectivity in the Type II Tobit model, indicating that the Type I Tobit model, which is the less complicated statistical method, would explain the demand and choice better than the Type II model.
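The second-stage correction in Heckman's two-step Tobit hinges on the inverse Mills ratio computed from the first-stage probit index; a minimal sketch (standard normal pdf over cdf):

```python
import math

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z): the selection-correction term entered as an
    extra regressor in the second-stage OLS of Heckman's two-step estimator."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf
```

If the OLS coefficient on this term is statistically insignificant, as for the lambda reported in the abstract, there is no evidence of self-selectivity and the simpler Type I Tobit is preferred.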

Application of multiple linear regression and artificial neural network models to forecast long-term precipitation in the Geum River basin (다중회귀모형과 인공신경망모형을 이용한 금강권역 강수량 장기예측)

  • Kim, Chul-Gyum;Lee, Jeongwoo;Lee, Jeong Eun;Kim, Hyeonjun
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.10
    • /
    • pp.723-736
    • /
    • 2022
  • In this study, monthly precipitation forecasting models that can predict up to 12 months in advance were constructed for the Geum River basin, and two statistical techniques, multiple linear regression (MLR) and artificial neural network (ANN), were applied to the model construction. As predictor candidates, a total of 47 climate indices were used, including 39 global climate patterns provided by the National Oceanic and Atmospheric Administration (NOAA) and 8 meteorological factors for the basin. Forecast models were constructed using the climate indices with high correlation, identified by analyzing the teleconnection between the monthly precipitation and each climate index over the past 40 years relative to the forecast month. In the goodness-of-fit test for the average value of the forecasts of each month for 1991 to 2021, the MLR models showed a percent bias (PBIAS) of -3.3 to -0.1%, a Nash-Sutcliffe efficiency (NSE) of 0.45 to 0.50, and a Pearson correlation coefficient (r) of 0.69 to 0.70, whereas the ANN models showed a PBIAS of -5.0 to +0.5%, an NSE of 0.35 to 0.47, and an r of 0.64 to 0.70. The mean values predicted by the MLR models were closer to the observations than those of the ANN models. The probability of including the observations within the forecast range for each month was 57.5 to 83.6% (average 72.9%) for the MLR models and 71.5 to 88.7% (average 81.1%) for the ANN models, so the ANN models performed better on this criterion. The tercile probability by month was 25.9 to 41.9% (average 34.6%) for the MLR models and 30.3 to 39.1% (average 34.7%) for the ANN models; both models showed long-term predictability of monthly precipitation, with tercile probabilities averaging above 33.3%. In conclusion, the difference in predictability between the two models was relatively small. However, judging from the hit rate for the prediction range and the tercile probability, the monthly variation in predictability was smaller for the ANN models.
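The goodness-of-fit measures quoted above can be computed as follows; the sign convention for PBIAS varies between references, and the precipitation values are hypothetical:

```python
import numpy as np

def pbias(obs, sim):
    """Percent bias; negative means under-forecasting under this convention."""
    return 100.0 * (sim - obs).sum() / obs.sum()

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean forecast."""
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])

obs = np.array([120.0, 85.0, 60.0, 210.0])   # hypothetical monthly precipitation (mm)
sim = np.array([110.0, 90.0, 70.0, 190.0])
```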

Complex Segregation Analysis of Categorical Traits in Farm Animals: Comparison of Linear and Threshold Models

  • Kadarmideen, Haja N.;Ilahi, H.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.18 no.8
    • /
    • pp.1088-1097
    • /
    • 2005
  • The main objectives of this study were to investigate the accuracy, bias and power of linear and threshold model segregation analysis methods for the detection of major genes affecting categorical traits in farm animals. Maximum Likelihood Linear Model (MLLM), Bayesian Linear Model (BALM) and Bayesian Threshold Model (BATM) methods were applied to simulated data on normal, categorical and binary scales, as well as to disease data in pigs. Data simulated on the underlying normally distributed liability (NDL) were used to create the categorical and binary data. The MLLM method was applied to data on all scales (normal, categorical and binary), and the BATM method was developed and applied only to binary data. The MLLM analyses underestimated parameters for binary as well as categorical traits compared to normal traits, with the bias being very severe for binary traits. The accuracy of major-gene and polygene parameter estimates was also very low for binary data compared with categorical data; the latter gave results similar to normal data. When the disease incidence (on the binary scale) is close to 50%, segregation analysis has greater accuracy and less bias than for diseases with rare incidence. NDL data were always better than categorical data. Under the MLLM method, the test statistics for categorical and binary data were consistently and unusually high (while the opposite is expected, owing to the loss of information in categorical data), indicating high false discovery rates for major genes if linear models are applied to categorical traits. With Bayesian segregation analysis, the 95% highest probability density regions of the major-gene variances were checked to see whether they included zero (the boundary parameter); by the nature of this difference between the likelihood and Bayesian approaches, the Bayesian methods are likely to be more reliable for categorical data. The BATM segregation analysis of binary data also showed a significant advantage over MLLM in terms of accuracy. 
Based on these results, threshold models are recommended when the trait distributions are discontinuous. Further, segregation analysis could be used in an initial scan of the data for evidence of major genes before embarking on molecular genome mapping.
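The attenuation that makes linear models unsuitable for binary traits can be illustrated by simulating a normally distributed liability with a major-gene effect and dichotomizing it at a threshold; the effect size and threshold below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
gene = rng.binomial(1, 0.5, n)                # major-gene genotype (0/1)
liability = 0.8 * gene + rng.normal(size=n)   # normally distributed liability
trait = (liability > 0.4).astype(float)       # observed binary trait

# Estimated gene effect on each scale (difference of genotype-group means):
effect_liability = liability[gene == 1].mean() - liability[gene == 0].mean()
effect_binary = trait[gene == 1].mean() - trait[gene == 0].mean()
```

The gene effect estimated on the binary scale is markedly smaller than on the liability scale, which is the kind of underestimation the abstract reports for linear-model analysis of binary traits.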

Evaluation of seismic fragility models for cut-and-cover railway tunnels (개착식 철도 터널 구조물의 기존 지진취약도 모델 적합성 평가)

  • Yang, Seunghoon;Kwak, Dongyoup
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.24 no.1
    • /
    • pp.1-13
    • /
    • 2022
  • A weighted linear combination of seismic fragility models previously developed for cut-and-cover railway tunnels is presented, and the appropriateness of the combined model is evaluated. The seismic fragility function is expressed as a cumulative probability function of the lognormal distribution based on the peak ground acceleration. Model uncertainty can be reduced by combining independently developed models; equal weight is applied to the four models. The new seismic fragility function was developed for each damage level by determining the median and standard deviation, which are the model metrics. Compared with fragility curves developed for bored tunnels, cut-and-cover tunnels for the high-speed railway system show a similar level of fragility. We postulate that this is due to the high seismic design standard for high-speed railway tunnels.
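The lognormal fragility function and the equal-weight combination of four models can be sketched as follows; the medians (in g) and log-standard deviations are hypothetical placeholders, not the fitted values from the paper:

```python
import math

def fragility(pga, median, beta):
    """Lognormal fragility: P(damage state reached | PGA), i.e. the lognormal
    CDF with the given median and log-standard deviation beta."""
    return 0.5 * (1.0 + math.erf(math.log(pga / median) / (beta * math.sqrt(2.0))))

# Equal-weight (0.25 each) combination of four candidate models:
models = [(0.45, 0.6), (0.50, 0.5), (0.55, 0.7), (0.40, 0.6)]
p_combined = sum(fragility(0.5, m, b) for m, b in models) / len(models)
```

The pointwise average of the four curves is the weighted linear combination itself; the paper then summarizes the combined curve by refitting a single median and standard deviation per damage level.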

Prediction of Future Sea Surface Temperature around the Korean Peninsular based on Statistical Downscaling (통계적 축소법을 이용한 한반도 인근해역의 미래 표층수온 추정)

  • Ham, Hee-Jung;Kim, Sang-Su;Yoon, Woo-Seok
    • Journal of Industrial Technology
    • /
    • v.31 no.B
    • /
    • pp.107-112
    • /
    • 2011
  • Recently, climate change due to global warming has become an important issue worldwide, and damage caused by climate change adversely affects human life. Changes in sea surface temperature (SST) are associated with natural disasters such as typhoons and El Niño. We therefore predicted daily future SST using a statistical downscaling method and the CGCM 3.1 A1B scenario. Nine points around the Korean Peninsula were selected for predicting future SST, and regression models were built using multiple linear regression. CGCM 3.1 outputs were simulated with the regression models, and by comparing probability density functions, box plots, and summary statistics to evaluate the suitability of the regression models, it was validated that the models were properly constructed.
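The downscaling transfer function here is a multiple linear regression from large-scale GCM predictors to the local SST; a self-contained sketch with synthetic data standing in for the CGCM 3.1 predictors and the observed series:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: 3 large-scale predictors and an observed local SST series.
X = rng.normal(size=(365, 3))
beta_true = np.array([0.8, -0.3, 0.5])
sst_obs = 15.0 + X @ beta_true + rng.normal(0.0, 0.2, 365)

# Fit the multiple linear regression (intercept + slopes) by least squares:
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, sst_obs, rcond=None)
sst_downscaled = A @ coef
```

In practice the fitted coefficients are then applied to the GCM scenario output to produce the future SST series, whose distribution is compared against observations as described above.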

Wedge Failure Probability Analysis for Rock Slope Based on Non-linear Shear Strength of Discontinuity (불연속면의 비선형 전단강도를 이용한 암반사면 쐐기파괴 확률 해석)

  • 윤우현;천병식
    • Journal of the Korean Geotechnical Society
    • /
    • v.19 no.6
    • /
    • pp.151-160
    • /
    • 2003
  • The stability of the designed rock slope is analysed based on two shear strength models. Besides the deterministic analysis, a probabilistic approach based on Monte Carlo simulation is proposed to deal with the uncertain characteristics of the discontinuities, and the results obtained from the two models are compared. To characterize the discontinuities, BIPS and DOM scanline survey data and direct shear test data are used, and the chi-square test is used to determine the probability distribution functions. The rock slope is evaluated as stable in the deterministic analysis, but in the probabilistic analysis the probability of failure is more than 5%, so the rock slope is considered unstable. Between the shear strength models, the probability of failure based on the Mohr-Coulomb model (linear model) is higher than that of the Barton model, which is consistent with the Mohr-Coulomb model being more sensitive to block size than the Barton model. In fact, there is no reliable way to estimate the unit cohesion of the Mohr-Coulomb model except for back analysis, and in the case of small block failures in the slope, the Mohr-Coulomb model may overestimate the factor of safety. Thus, the Barton model, whose parameters are easily acquired from geological surveys, is more reasonable for assessing the stability of the studied slope. The selection of a proper shear strength model is therefore an important factor in slope failure analysis.
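A Monte Carlo failure-probability estimate with the Barton (non-linear) shear strength model can be sketched as below; the stress state and the distributions of the joint parameters are hypothetical, chosen only to show the mechanics of estimating P(FS < 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
sigma_n = 0.2    # normal stress on the sliding plane, MPa (hypothetical)
tau_req = 0.15   # shear stress required for equilibrium, MPa (hypothetical)

# Barton model: tau = sigma_n * tan(phi_r + JRC * log10(JCS / sigma_n))
phi_r = rng.normal(28.0, 2.0, n)    # residual friction angle (deg)
jrc   = rng.normal(8.0, 1.5, n)     # joint roughness coefficient
jcs   = rng.normal(50.0, 5.0, n)    # joint wall compressive strength (MPa)
tau_avail = sigma_n * np.tan(np.radians(phi_r + jrc * np.log10(jcs / sigma_n)))

pf = np.mean(tau_avail < tau_req)   # probability of failure, P(FS < 1)
```

A Mohr-Coulomb variant would replace the mobilized friction angle with c + sigma_n·tan(phi), requiring a unit cohesion c that, as noted above, is hard to estimate except by back analysis.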