• Title/Summary/Keyword: contingency table


Forecasting of Seasonal Inflow to Reservoir Using Multiple Linear Regression (다중선형회귀분석에 의한 계절별 저수지 유입량 예측)

  • Kang, Jaewon
    • Journal of Environmental Science International
    • /
    • v.22 no.8
    • /
    • pp.953-963
    • /
    • 2013
  • Reliable long-term streamflow forecasting is invaluable for water resource planning and management, which allocates water supply according to the demand of water users. Seasonal inflow to Andong dam is forecast and assessed using statistical methods based on hydrometeorological data. The predictors used to forecast seasonal inflow to Andong dam are selected from the southern oscillation index, sea surface temperature, and 500 hPa geopotential height data in the northern hemisphere. Predictors are selected in two steps: primary predictor sets are obtained first, and the final predictors are then determined from those sets. The primary predictor sets for each season are identified using cross correlation and mutual information. The final predictors are identified using partial cross correlation and partial mutual information. Three predictors are selected for each season. The values are determined using a bootstrapping technique considering a specific significance level for predictor selection. Seasonal inflow forecasting is performed by multiple linear regression analysis using the selected predictors for each season, and the forecast results obtained with cross validation are assessed. Multiple linear regression analysis is performed using SAS. The results of the multiple linear regression analysis are assessed by mean squared error and mean absolute error. A contingency table is also established and assessed by the Heidke skill score. The assessment reveals that the forecasts by multiple linear regression analysis are better than the reference forecasts.
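
As an illustration of the last assessment step, the following is a minimal sketch of how a Heidke skill score can be computed from a forecast/observation contingency table, using the standard definition in which the reference forecast is random chance with the same marginal totals. The function name and the counts are hypothetical and are not taken from the paper.

```python
def heidke_skill_score(table):
    """Heidke skill score for a square forecast/observation contingency table.

    table[i][j] = number of cases forecast in category i and observed in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Proportion of forecasts that fall on the diagonal (correct category).
    proportion_correct = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Proportion correct expected by chance, given the marginal totals.
    expected_correct = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
    return (proportion_correct - expected_correct) / (1 - expected_correct)


# Hypothetical counts for below-normal / near-normal / above-normal inflow.
example_table = [
    [5, 2, 1],
    [2, 6, 2],
    [1, 2, 5],
]
print(round(heidke_skill_score(example_table), 3))
```

A score of 1 indicates a perfect forecast, 0 indicates no improvement over chance, and negative values indicate forecasts worse than chance.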

Goodness-of-fit tests for a proportional odds model

  • Lee, Hyun Yung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.6
    • /
    • pp.1465-1475
    • /
    • 2013
  • Chi-square type test statistics are the most commonly used tests of goodness-of-fit for multinomial logistic regression models, whose data may be grouped (binomial) or ungrouped (binary) according to covariate pattern. Such statistics are not a satisfactory gauge, however, because the ungrouped Pearson chi-square statistic does not follow the chi-square distribution well and is not a satisfactory measure of fit in itself. Currently, goodness-of-fit in the ordinal setting is often assessed using the Pearson chi-square statistic and deviance tests. These tests involve creating a contingency table in which rows consist of all possible cross-classifications of the model covariates, and columns consist of the levels of the ordinal response. I examined goodness-of-fit tests for a proportional odds logistic regression model, the most commonly used regression model for an ordinal response variable. Using a simulation study, I investigated the distribution and power properties of this test and compared these with those of three other goodness-of-fit tests. The new test had lower power than the existing tests; however, it was able to detect a greater number of the different types of lack of fit considered in this study. I illustrated the ability of the tests to detect lack of fit using a study of aftercare decisions for psychiatrically hospitalized adolescents.
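
The Pearson chi-square and deviance statistics mentioned above can be sketched as follows for a table whose rows are covariate patterns and whose columns are ordinal response levels. This is only an illustration under stated assumptions: the expected counts would normally come from a fitted proportional odds model, whereas here both the observed and the expected counts are hypothetical numbers, and the function name is not from the paper.

```python
import math


def pearson_and_deviance(observed, expected):
    """Return (Pearson X^2, deviance G^2) for observed and expected cell counts."""
    x2 = 0.0
    g2 = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            x2 += (o - e) ** 2 / e
            if o > 0:  # a cell with zero observed count contributes 0 to G^2
                g2 += 2.0 * o * math.log(o / e)
    return x2, g2


# Hypothetical table: 3 covariate patterns (rows) x 3 ordinal levels (columns).
observed = [[12, 5, 3], [6, 8, 6], [2, 7, 11]]
# Hypothetical fitted counts with the same row totals as the observed table.
expected = [[11.0, 6.0, 3.0], [6.5, 7.5, 6.0], [2.5, 6.5, 11.0]]

x2, g2 = pearson_and_deviance(observed, expected)
print(round(x2, 3), round(g2, 3))
```

When many cells have small expected counts, as happens with ungrouped data, the chi-square reference distribution for these statistics becomes unreliable, which is the concern the abstract raises.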

Multifactor-Dimensionality Reduction in the Presence of Missing Observations

  • Chung, Yu-Jin;Lee, Seung-Yeoun;Park, Tae-Sung
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2005.11a
    • /
    • pp.31-36
    • /
    • 2005
  • Identifying and characterizing susceptibility genes for common complex multifactorial diseases is a challenging task, in which the effect of a single genetic variation is likely to depend on other genetic variations (gene-gene interaction) and environmental factors (gene-environment interaction). To address this issue, multifactor dimensionality reduction (MDR) has been proposed and implemented by Ritchie et al. (2001), Moore et al. (2002), Hahn et al. (2003) and Ritchie et al. (2003). With MDR, multilocus genotypes effectively reduce the dimension of genotype predictors from n to one, which improves the identification of polymorphism combinations associated with disease risk. However, MDR cannot handle missing observations appropriately, because a missing observation is treated as an additional genotype category. This approach may suffer from a sparseness problem: when high-order interactions are considered, an additional missing category makes the contingency table cells sparser. We propose a new MDR approach with minimum loss of sample size by considering missing data over all possible multifactor classes. We evaluate the proposed MDR by using the prediction errors and cross validation consistency.
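
The dimension-reduction step described above can be sketched roughly as follows. This is a simplified illustration of the conventional MDR labelling step, including the treatment of a missing value as an extra genotype category that the abstract criticizes; it is not the new approach proposed in the paper. The function name, the threshold of 1.0, and the data are assumptions made for the example, and the error reported is a training error rather than a cross-validated prediction error.

```python
from collections import Counter


def mdr_reduce(genotypes, status, threshold=1.0):
    """Label each multilocus genotype cell as high or low risk.

    genotypes: list of tuples, one multilocus genotype per subject
               (None marks a missing value and forms its own category).
    status:    list of 1 (case) or 0 (control), one per subject.
    Returns (set of high-risk cells, training classification error).
    """
    cases, controls = Counter(), Counter()
    for cell, is_case in zip(genotypes, status):
        (cases if is_case else controls)[cell] += 1
    # A cell is high risk when its case/control ratio reaches the threshold.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] >= threshold * controls[cell]
    }
    # The multilocus genotype is now a single binary predictor.
    predictions = [1 if cell in high_risk else 0 for cell in genotypes]
    errors = sum(p != s for p, s in zip(predictions, status))
    return high_risk, errors / len(status)


# Hypothetical two-locus data; the last subject has a missing first locus.
genotypes = [(0, 1), (0, 1), (0, 1), (1, 1), (1, 1), (None, 1)]
status = [1, 1, 0, 0, 0, 1]
high_risk, error = mdr_reduce(genotypes, status)
print(sorted(high_risk, key=str), round(error, 3))
```

Each missing value adds another level to a locus, so the number of cells grows quickly with the interaction order while the sample size stays fixed, which is the sparseness problem noted in the abstract.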


The Effect of Meteorological Information on Business Decision-Making with a Value Score Model (가치스코어 모형을 이용한 기상정보의 기업 의사결정에 미치는 영향 평가)

  • Lee, Ki-Kwang;Lee, Joong-Woo
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.30 no.2
    • /
    • pp.89-98
    • /
    • 2007
  • In this paper, the economic value of weather forecasts is evaluated for profit-oriented enterprise decision-making situations. Value is estimated in terms of the monetary profits (or benefits) resulting from the forecast user's decision under a specific payoff structure, which is represented by a profit/loss ratio model combined with a decision function and a value score (VS). The forecast user makes a business-related decision based on the probabilistic forecast, the user's subjective reliability of the forecasts, and the payoff structure specific to the user's business environment. The VS curve for a meteorological forecast is specified as a function of the various profit/loss ratios, providing the scaled economic value relative to the value of a perfect forecast. The proposed valuation method based on the profit/loss ratio model and the VS is applied to hypothetical sets of forecasts and verified for site-specific probability of precipitation forecasts with 12-hour and 24-hour lead times, which are generated by the Korea Meteorological Administration (KMA). The application results show that forecast information with a shorter lead time can provide decision-makers with great benefits and that there are ranges of profit/loss ratios in which high subjective reliability of the given forecast is preferred.

An Application of Canonical Analysis on the Distribution of Lichens in Mt. Duckyuoo (덕유산 지의식물 분포에 대한 정준분석법의 적용연구)

  • Park, Seung Tai
    • The Korean Journal of Ecology
    • /
    • v.9 no.3
    • /
    • pp.135-147
    • /
    • 1986
  • Simplifying complex data and searching for trends in them, under an assumed relationship between predictor variables and object variables, is one of the primary objectives of ecological research. This study aimed to apply canonical analysis, consisting of canonical correlation analysis and canonical variate analysis, to lichen vegetation and several environmental variables: elevation, height on ground, exposure side, and cover values. Data were collected from the Duckyoo National Park in August 1985. Lichen species were ranked by equivocation information theory with cover values. Canonical correlation analysis was applied to one data set containing both environmental variables and lichen family. In order to make two sets of data matrices, the scale of position vector ordination was calculated from the vector scalar product for lichen species. Canonical variate analysis was applied to rearranged data made by interval class codes for environmental variables. The sharpness values were calculated from the frequencies of contingency tables, and the dispersion profiles of each species in classes of environmental variables were designed to extract component values based on the decomposition of expected frequencies in the contingency table. The canonical correlation analysis gave a first canonical correlation of 0.815 (89%) and a second of 0.083 (11%). The significance test showed that the hypothesis of joint mutuality of canonical correlation is accepted (P>0.05). The relation between the canonical scores of the vegetation variables and those of the environmental variables indicated a linear tendency.


Asymptotic Inference on the Odds Ratio via Saddlepoint Method (안부점근사를 이용한 승산비에 대한 점근적 추론)

  • Na, Jong-Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.10 no.1
    • /
    • pp.29-36
    • /
    • 1999
  • We propose a new method of asymptotic inference on the odds ratio (or cross-product ratio) in a $2 \times 2$ contingency table. Saddlepoint approximations to the conditional tail probability are used in this procedure. We assess the accuracy of the suggested method by comparing it with the exact one. Obtaining the exact values requires very complicated calculations involving the cumulative probabilities of the non-central hypergeometric distribution. The suggested method is very accurate even for small or moderate sample sizes, and it is simple and easy to use. An example with real data is also considered.
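
For context, the exact calculation that the saddlepoint method is compared against can be sketched as below. Conditional on the table margins, the upper-left cell follows Fisher's non-central hypergeometric distribution with the odds ratio as its parameter, and the tail probability is a sum over that distribution. This sketch shows only the exact computation, not the saddlepoint approximation from the paper; the function names and the counts are hypothetical.

```python
from math import comb


def sample_odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def exact_upper_tail(a, b, c, d, psi=1.0):
    """P(X >= a) for the upper-left cell X, conditional on all margins.

    psi is the assumed odds ratio; psi = 1 gives the ordinary
    hypergeometric tail used in Fisher's exact test.
    """
    n1, n2, m = a + b, c + d, a + c  # row totals and first-column total
    lo, hi = max(0, m - n2), min(n1, m)  # support of X given the margins
    weights = {
        x: comb(n1, x) * comb(n2, m - x) * psi**x for x in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return sum(w for x, w in weights.items() if x >= a) / total


# Hypothetical 2x2 table.
print(sample_odds_ratio(3, 1, 1, 3))
print(exact_upper_tail(3, 1, 1, 3, psi=1.0))
```

The sum runs over every value the cell can take given the margins, which is why the exact approach becomes laborious for larger tables and motivates an approximation.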


Small diagnostic scale for internet addiction (인터넷 중독 자가진단 소형척도 개발)

  • Oh, Kwang-Sik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.6
    • /
    • pp.1203-1209
    • /
    • 2010
  • Internet addiction is a serious social problem in the information society. The purpose of this study is to develop a small diagnostic scale in order to detect internet addiction easily. The reliability and validity of the K-scale and the Kimberly Young scale are investigated. Five small diagnostic scales are suggested by factor analysis and regression. These small scales are compared using the correlation coefficient, the chi-square test, and the gamma value of concordance in a contingency table. In view of reliability and validity, we suggest a small diagnostic scale. The results of this study may be useful for self-diagnosis of internet addiction.
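
The gamma value of concordance mentioned above is commonly computed as Goodman and Kruskal's gamma, (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs in an ordered contingency table. The following is a minimal sketch under that assumption; the function name and the counts are hypothetical and do not come from the study.

```python
def goodman_kruskal_gamma(table):
    """Goodman-Kruskal gamma for a table with ordered rows and columns."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair cell (i, j) with every cell in a later row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # both variables higher: concordant pairs
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # one higher, one lower: discordant pairs
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical cross-classification of two scales (low / medium / high risk).
example = [
    [20, 5, 1],
    [4, 15, 6],
    [1, 5, 18],
]
print(round(goodman_kruskal_gamma(example), 3))
```

Pairs tied on either variable are ignored, so gamma ranges from -1 to 1 and equals 1 when the two classifications never disagree in order.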

Quantitative Assessment of Input and Integrated Information in GIS-based Multi-source Spatial Data Integration: A Case Study for Mineral Potential Mapping

  • Kwon, Byung-Doo;Chi, Kwang-Hoon;Lee, Ki-Won;Park, No-Wook
    • Journal of the Korean earth science society
    • /
    • v.25 no.1
    • /
    • pp.10-21
    • /
    • 2004
  • Spatial data integration has recently come to be regarded as an important task in various geoscientific applications of GIS. Although much research has been reported in the literature, quantitative assessment of the spatial interrelationship between input data layers and an integrated layer has not been fully considered and is still at the development stage. To address this, we propose methodologies that account for the spatial interrelationship and spatial patterns in the spatial integration task, namely a multi-buffer zone analysis and a statistical analysis based on a contingency table. The multi-buffer zone analysis, the main part of our work, was applied to reveal the spatial pattern around geological source primitives, and the statistical analysis was performed to extract information for the assessment of an integrated layer. Mineral potential mapping using multi-source geoscience data sets from Ogdong, Korea, is used to illustrate the methodology.

Investigation for KSK 9403: 2004 Recognition and Mother's Preference of Female Children's Apparel (여자 아동복 구입시 어머니의 선호도 및 KSK 9403: 2004 호칭 치수 인지도 조사)

  • Koo, Hee-Kyung
    • Journal of the Korea Fashion and Costume Design Association
    • /
    • v.9 no.3
    • /
    • pp.87-97
    • /
    • 2007
  • This study investigates mothers' recognition of KS sizes and their preferences for female children's apparel. The survey was conducted with 150 mothers living in Seoul, randomly selected with respect to age, number of daughters, education, and income level. For statistical analysis and evaluation of the survey data, frequencies and percentages were examined using contingency tables. The findings of this study are as follows: 1. Mothers' preferences when purchasing girls' garments show significant differences according to subject characteristics such as age, number of daughters, education, and income level. 2. Mothers' recognition of the KSK 9403: 2004 sizing system for girls' garments does not show significant differences according to subject characteristics. Most mothers know only part of the KS size specifications because the KS sizing systems are complex. The KS sizing systems must therefore be simplified and respecified so that mothers can understand them easily when purchasing girls' garments. In summary, this paper investigates mothers' preferences and their recognition of the KS sizing system for girls' garments.


The Reliability and Validity of Patient-Generated Subjective Global Assessment (PG-SGA) in Stroke Patients (뇌졸중 환자에서 '환자 주도적 총체적 영양사정' 도구의 신뢰도 및 타당도 평가)

  • Yoo, Sung-Hee;Oh, Eui-Guem;Youn, Mi-Jung
    • Korean Journal of Adult Nursing
    • /
    • v.21 no.6
    • /
    • pp.559-569
    • /
    • 2009
  • Purpose: This study examined the reliability and validity of the Patient-Generated Subjective Global Assessment (PG-SGA) as a nutritional measurement for stroke patients. Methods: This was a methodological study performed from May 6 to June 10, 2009 at a tertiary university hospital in Seoul. For the reliability of PG-SGA, inter-rater reliability was used. For concurrent validity, BMI and biomarkers were compared between PG-SGA 0 ~ 8 and $\geq 9$. In addition, the sensitivity, specificity, and predictive value of PG-SGA compared with SGA were calculated using a contingency table. For predictive validity, hospital days, complications, and readmission within 1 month after discharge were compared between PG-SGA 0 ~ 8 and $\geq 9$. Results: The correlation of PG-SGA scores between two observers was 0.83, and the kappa value for the agreement on severe malnutrition was 0.78 (all $p_s$ < .001). The scored PG-SGA showed high sensitivity and specificity (100% and 96.7%, respectively). Severely undernourished patients (PG-SGA $\geq 9$) had significantly lower TLC, protein, albumin, and prealbumin (all $p_s$ < .01) compared with non-undernourished patients (PG-SGA 0 ~ 8). Also, in severely undernourished patients, complications and readmission (all $p_s$ = 0.01) occurred more often, and hospital stays (p = .013) were significantly longer. Conclusion: PG-SGA is a reliable and valid measurement for assessing the nutritional status of stroke patients.
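
The validity measures named in the Methods can be read directly off a 2x2 contingency table that cross-classifies the tool under evaluation against the reference standard. The sketch below illustrates the arithmetic; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Validity measures from a 2x2 table of test result vs. reference standard.

    tp: test positive, reference positive    fp: test positive, reference negative
    fn: test negative, reference positive    tn: test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # reference positives detected by the test
        "specificity": tn / (tn + fp),  # reference negatives cleared by the test
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_measures(tp=18, fp=3, fn=2, tn=27).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the columns of the table, whereas the predictive values also depend on how common the condition is in the sample.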
