• Title/Summary/Keyword: Type II error (제2종 오류)


Developing of Exact Tests for Order-Restrictions in Categorical Data (범주형 자료에서 순서화된 대립가설 검정을 위한 정확검정의 개발)

  • Nam, Jusun;Kang, Seung-Ho
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.4
    • /
    • pp.595-610
    • /
    • 2013
  • Testing of an order-restricted alternative hypothesis in 2×k contingency tables can be applied in various fields of medicine, sociology, and business administration. Most testing methods have been developed based on large-sample theory. For small or unbalanced sample sizes, however, the Type I error rate of a testing method based on large-sample theory can differ greatly from the nominal 5% level. In this paper, an exact testing method is introduced for testing order-restricted alternative hypotheses in categorical data, particularly for small samples or extremely unbalanced data. Power and exact p-values are calculated.
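The small-sample distortion described above can be checked directly by simulation. The following is a minimal sketch, not the authors' exact test: it estimates the empirical Type I error of a standard large-sample one-sided trend test (Cochran-Armitage) in a tiny, unbalanced 2×3 table. The group sizes, scores, and null success probability are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage_z(counts, totals, scores):
    """Large-sample Cochran-Armitage trend statistic for a 2xk table."""
    p = counts.sum() / totals.sum()
    num = np.sum(scores * (counts - totals * p))
    var = p * (1 - p) * (np.sum(totals * scores**2)
                         - np.sum(totals * scores) ** 2 / totals.sum())
    return num / np.sqrt(var)

def empirical_type1_error(totals, scores, p0, alpha=0.05, n_sim=20000, seed=0):
    """Monte Carlo estimate of the one-sided rejection rate under H0."""
    rng = np.random.default_rng(seed)
    crit = norm.ppf(1 - alpha)
    reject = valid = 0
    for _ in range(n_sim):
        counts = rng.binomial(totals, p0)
        if counts.sum() in (0, totals.sum()):
            continue  # degenerate table: statistic undefined
        valid += 1
        if cochran_armitage_z(counts, totals, scores) > crit:
            reject += 1
    return reject / valid

totals = np.array([3, 3, 4])   # tiny, unbalanced groups (illustrative)
scores = np.array([0, 1, 2])
rate = empirical_type1_error(totals, scores, p0=0.3)
print(f"empirical Type I error ~ {rate:.3f} vs nominal 0.05")
```

Because the table is discrete and tiny, the empirical rate typically lands well away from 5%, which is the motivation for an exact test.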

Comparison of Single Imputation Methods in 2×2 Cross-Over Design with Missing Observations (2×2 교차계획법에서 결측치가 있을 때의 결측치 처리 방법 비교에 관한 연구)

  • Jo, Bobae;Kim, Dongjae
    • The Korean Journal of Applied Statistics
    • /
    • v.28 no.3
    • /
    • pp.529-540
    • /
    • 2015
  • A cross-over design is frequently used in clinical trials, especially in bioequivalence tests with a parametric method, to compare two treatments. In cross-over designs, missing values frequently occur in the second period. Usually, subjects with missing values are removed before analysis; however, this can be unsuitable in clinical trials with a small sample size. In this paper, we compare single imputation methods for a 2×2 cross-over design when missing values exist in the second period. In addition, parametric and nonparametric methods are compared after applying the single imputation methods. A Monte Carlo simulation study compares the type I error and power of the methods.
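A minimal sketch of this setting, using mean imputation as one simple single-imputation choice (the paper compares several methods; the effect size, sample size, and missingness rate here are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated paired responses: each subject has a period-1 and period-2 value.
n = 12
period1 = rng.normal(10.0, 2.0, size=n)
period2 = period1 + rng.normal(0.5, 1.0, size=n)  # small treatment effect

# Introduce missingness in the second period only, as in the paper's setting.
missing = rng.random(n) < 0.25
missing[0] = True  # ensure at least one missing value for the illustration
period2_obs = period2.copy()
period2_obs[missing] = np.nan

def mean_impute(x):
    """Single imputation: replace NaNs with the mean of the observed values."""
    out = x.copy()
    out[np.isnan(out)] = np.nanmean(out)
    return out

period2_imp = mean_impute(period2_obs)

# Parametric vs nonparametric comparison after imputation.
t_stat, t_p = stats.ttest_rel(period1, period2_imp)
w_stat, w_p = stats.wilcoxon(period1, period2_imp)
print(f"paired t-test p = {t_p:.3f}, Wilcoxon p = {w_p:.3f}")
```

Repeating this over many simulated datasets, with different imputation rules plugged into the imputation step, yields the type I error and power comparison the abstract describes.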

A Comparison of Reduction of Dental Plaque Control and Oral Malodor according to Hardness of Detergent Food (일부 청정식품의 경도 차이에 따른 치면세균막 제거 및 구취감소 효과 비교)

  • Kim, Min-Ji
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.9
    • /
    • pp.324-330
    • /
    • 2017
  • The aim of this study was to compare dental plaque control and reduction of oral malodor according to the hardness of detergent foods. The subjects were 1 male (5.0%) and 19 females (95.0%), with an average age of 20.8 years. The study was conducted from March 6 to April 24, 2014. The detergent foods selected for this experiment were cucumber, cabbage, and tomato. The data were analyzed using SPSS: the PHP index, plaque rate, H₂S, (CH₃)₂S, oral gas, and expiration gas were analyzed by nonparametric statistics with comparison of means, whereas the before-and-after effects of detergent food ingestion were analyzed by paired t-test. For all detergent foods, comparison of dental plaque control before and after ingestion showed statistically significant differences: in the PHP index for cucumber, in the PHP index and plaque rate for tomato, and in the plaque rate for cabbage.

A simulation comparison on the analysing methods of Likert type data (모의실험에 의한 리커트형 설문분석 방법의 비교)

  • Kim, Hyun Chul;Choi, Seung Kyoung;Choi, Dong Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.2
    • /
    • pp.373-380
    • /
    • 2016
  • Even though Likert-type data are on an ordinal scale, many researchers treat them as interval-scale and apply parametric methods. In this research, simulations were used to find a proper analysis of Likert-type data. The locations and response distributions of five-point Likert-type samples drawn from diverse distributions were evaluated. To estimate the samples' locations, we considered a parametric and a nonparametric method: the t-test and the Mann-Whitney test, respectively. In addition, to test the response distributions, we employed the chi-squared test and the Kolmogorov-Smirnov test. We assessed the performance of the four aforementioned methods by comparing their Type I error rates and statistical power.
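A simulation of this kind can be sketched as follows. The four tests named in the abstract are compared on five-point Likert samples; the category-probability vectors and sample sizes are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def simulate_power(p_a, p_b, n=30, n_sim=1000, alpha=0.05):
    """Estimate rejection rates of four tests on 5-point Likert samples
    drawn from category-probability vectors p_a and p_b."""
    levels = np.arange(1, 6)
    hits = {"t": 0, "mw": 0, "chi2": 0, "ks": 0}
    for _ in range(n_sim):
        a = rng.choice(levels, size=n, p=p_a)
        b = rng.choice(levels, size=n, p=p_b)
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits["t"] += 1
        if stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha:
            hits["mw"] += 1
        obs = np.array([np.bincount(a, minlength=6)[1:],
                        np.bincount(b, minlength=6)[1:]])
        keep = obs.sum(axis=0) > 0  # drop empty categories before chi-square
        _, chi2_p, _, _ = stats.chi2_contingency(obs[:, keep])
        if chi2_p < alpha:
            hits["chi2"] += 1
        if stats.ks_2samp(a, b).pvalue < alpha:
            hits["ks"] += 1
    return {k: v / n_sim for k, v in hits.items()}

same = [0.1, 0.2, 0.4, 0.2, 0.1]
shifted = [0.05, 0.1, 0.3, 0.3, 0.25]
print("Type I error:", simulate_power(same, same))
print("power:", simulate_power(same, shifted))
```

Identical distributions give the Type I error rates; a shifted distribution gives the power, so the location tests (t, Mann-Whitney) and the distribution tests (chi-squared, Kolmogorov-Smirnov) can be compared on both criteria.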

A Bibliographical and Literary Research on the Xinxu(新序) of the Published edition in Joseon (조선간본(朝鮮刊本) 『유향신서(劉向新序)』의 서지·문헌 연구)

  • You, Sueng-hyun;Min, Kuan-dong
    • Cross-Cultural Studies
    • /
    • v.51
    • /
    • pp.257-257
    • /
    • 2018
  • The Xinxu (新序) was published in Korea by 1492. Among the extant editions, those whose details can be confirmed are in the collections of Keimyung University, the Korean Studies Central Research Institute, Kyonggi University, Hujodang (後彫堂), and the National Assembly Library of Japan. Of the Keimyung University copies, the precious-book copy is a 'first published book', while the old-book copy is a 'later published book' in which the leaves covering pages 69-70 and 71-72 of the first published book were re-engraved. The copies in the Korean Studies Central Research Institute and Kyonggi University collections are first published books, whereas in the Hujodang and National Assembly Library of Japan copies the leaves covering pages 9-10, 63-64, 87-88, and 107-108 belong to the 'later published book'. A comparison of the editions indicates that the earlier edition was printed twice and, as the four extant copies confirm, the later edition three times. In this paper, the extant editions were studied bibliographically and the features of the Korean edition were presented. First, the layout was examined: in principle the text is set in 11 lines of 18 characters, but the actual character counts per leaf are shown in a table. In the Joseon edition, blank spaces appear in the original text. Erroneous characters in the Joseon book were identified, and the reasons for the errors were explained in detail.

Statistical methods for testing tumor heterogeneity (종양 이질성을 검정을 위한 통계적 방법론 연구)

  • Lee, Dong Neuck;Lim, Changwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.3
    • /
    • pp.331-348
    • /
    • 2019
  • Understanding tumor heterogeneity due to differences in the growth patterns and rates of change of metastatic tumors is important for understanding the sensitivity of tumor cells to drugs and finding appropriate therapies. It is often possible to test for differences in population means using a t-test or ANOVA when the groups of samples are distinct. However, these statistical methods cannot be used when the groups are not distinguished, as in the data covered in this paper. Statistical methods have been studied to test heterogeneity between samples; the minimum combination t-test is one of them. In this paper, we propose a maximum combination t-test that considers combinations bisecting the data at different ratios. We also propose a method based on the idea that testing the heterogeneity of a sample is equivalent to testing whether the optimal number of clusters is one in cluster analysis. Based on a simulation study, we verified that the proposed methods, the maximum combination t-test and the gap statistic, have better type I error rates and power than the previously proposed method, and we obtained corresponding results through a real data analysis.
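The idea of maximizing a two-sample statistic over bisections of an unlabeled sample can be sketched as follows. This is an illustrative reconstruction of a "maximum combination t-test" statistic, not the authors' exact procedure, and the calibration of its null distribution is left out.

```python
import numpy as np
from itertools import combinations
from scipy import stats

def max_combination_t(x, min_size=2):
    """Largest absolute two-sample t-statistic over all ways to split the
    (unlabeled) sample x into two groups of at least min_size each."""
    x = np.asarray(x)
    n = len(x)
    idx = set(range(n))
    best = 0.0
    for k in range(min_size, n - min_size + 1):
        for group in combinations(range(n), k):
            a = x[list(group)]
            b = x[list(idx - set(group))]
            t = abs(stats.ttest_ind(a, b).statistic)
            best = max(best, t)
    return best

rng = np.random.default_rng(4)
homogeneous = rng.normal(0.0, 1.0, 8)
heterogeneous = np.concatenate([rng.normal(0.0, 0.2, 4),
                                rng.normal(5.0, 0.2, 4)])
print("homogeneous sample:  ", max_combination_t(homogeneous))
print("heterogeneous sample:", max_combination_t(heterogeneous))
```

A clearly bimodal sample yields a far larger maximum statistic than a homogeneous one. The search is exponential in the sample size, so this brute-force form is only practical for small samples.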

Improved Preservation Methods for Big and Old Trees in South Korea (우리 나라의 노거수자원(老巨樹資源) 보호관리실태(保護管理室態) 및 개선방안(改善方案))

  • Park, Chong-Min;Seo, Byun-Soo;Lee, Cheong-Taek
    • Journal of Korean Society of Forest Science
    • /
    • v.89 no.3
    • /
    • pp.440-451
    • /
    • 2000
  • This study was conducted to provide essential data and a relevant management proposal for conserving and maintaining big and old trees in a rational way. For the field survey, 77 big and old trees protected by law in Chollabuk-do, Korea were investigated. The results are summarized as follows: 1. To conserve and manage big and old trees, valuable trees have been designated as natural monument trees or protection-needed trees; there are 141 individuals of 37 species designated as natural monuments and 10,049 individuals of 102 species designated as protection-needed trees. 2. About 70% of the management budget for natural monument trees came from the national budget, whereas 98% of that for protection-needed trees came from local budgets. 3. Standardized sign boards and sign stones for natural monument trees were well placed, and other protection facilities such as fences, branch supports, and branch holdings were established; by contrast, management of protection-needed trees was deficient overall. 4. Problems in the designation and management of protection-needed trees include an insufficient management budget, various development activities, land-ownership issues, misjudgment of tree age and species identification, unsatisfactory sign board placement, insufficient surgery for damaged trees, pavement over tree root systems, and environmental pollution around the trees. 5. To improve the existing management of big and old trees, the following measures were suggested: development of practical criteria for natural monument and protection-needed trees, nationwide surveys of big and old tree resources, securing a national budget, securing sufficient space for tree growth, specialization of management systems, extended practice of tree-form management, establishment of permanent standard signs, and consideration of the opinions of village residents.
Prolyl Endopeptidase-inhibiting Isoflavonoids from Puerariae Flos and Some Revision of their ¹³C-NMR Assignment (갈화의 Prolyl Endopeptidase 저해 활성 Isoflavonoid 및 이들의 ¹³C-NMR Assignment)

  • Kim, Kyung-Bum;Kim, Sang-In;Kim, Jong-Sik;Song, Kyung-Sik
    • Applied Biological Chemistry
    • /
    • v.42 no.4
    • /
    • pp.351-355
    • /
    • 1999
  • In order to find anti-dementia drugs from natural products, prolyl endopeptidase (PEP) inhibitors were purified from Puerariae Flos by consecutive solvent partitioning followed by silica gel, Sephadex LH-20, and HPLC chromatography. Four isoflavonoid inhibitors were isolated and identified as tectorigenin, genistein, 5,7-dihydroxy-4',6-dimethoxyisoflavone, and 5-hydroxy-6,7,4'-trimethoxyisoflavone by instrumental analyses including ¹H-, ¹³C-, and 2D-NMR and MS. Their IC₅₀ values against PEP were 5.30 ppm (17.7 μM), 10.39 ppm (38.5 μM), 13.92 ppm (44.3 μM), and 20.61 ppm (62.8 μM), respectively. Some previous errors in the ¹³C-NMR assignments were corrected by careful examination of HMBC and HMQC data.
Problem Structuring in IT Policy: Boundary Analysis of IT Policy Problems (경계분석을 통한 정책문제 정의에 관한 연구 - 언론보도에 나타난 IT 정책문제 탐색을 중심으로 -)

  • Park, Chisung;Nam, Ki Bum
Korean Policy Studies Review (한국정책학회보)
    • /
    • v.21 no.4
    • /
    • pp.199-228
    • /
    • 2012
  • Policy problems are complex due to the diverse participants and relationships in policy processes. Defining the right problem in the first place is important, because a Type III error is likely to occur if rival hypotheses are not eliminated when defining the problem. This study applies the boundary analysis suggested by Dunn to structure IT policy problems in Korea. The time frame covers the five years of the Lee administration, and data were collected from four newspapers. Using content analysis, the study first elicits a total of 2,614 policy problems from 1,908 stakeholders. After removing duplicate problems, 369 problems from 323 stakeholders are identified as the boundary of the IT policy problem. Among them, failures in government policies are weighted as the most serious problems in the IT policy field. However, many significant problems raised by stakeholders date back more than a decade; these are intrinsic problems initially caused by market distortions in the IT industry. Therefore, when interpreting the results of problem structuring, we should be cautious not to overemphasize the most conspicuous problem as the only problem in the policy field.
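The deduplication step behind a Dunn-style boundary estimate can be pictured with a toy sketch: accumulate each stakeholder's stated problems and watch for the point at which the count of distinct problems stops growing. All stakeholder and problem names here are hypothetical.

```python
def boundary_curve(problem_lists):
    """Cumulative count of distinct problems as stakeholders are added;
    the curve flattening suggests the problem boundary has been reached."""
    seen, curve = set(), []
    for problems in problem_lists:
        seen.update(problems)
        curve.append(len(seen))
    return curve

stakeholders = [
    {"budget cuts", "vendor lock-in"},
    {"vendor lock-in", "skills shortage"},
    {"budget cuts", "market distortion"},
    {"market distortion"},            # no new problems: curve flattens
]
print(boundary_curve(stakeholders))   # → [2, 3, 4, 4]
```

In the study's terms, 2,614 raised problems collapse to 369 distinct ones; the flattening of this cumulative curve is what justifies treating those 369 as the boundary.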

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.83-102
    • /
    • 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, providing a great opportunity to the public with respect to disclosure of high-quality data within public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and the company is strongly committed to backing export companies with various systems. Nevertheless, there are still few realized business models based on big-data analyses. In this situation, this paper aims to develop a new business model for ex-ante prediction of the likelihood of credit guarantee insurance accidents. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare performance among predictive models including logistic regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), based on multiple discriminant analysis and widely used in both research and practice to this day; it uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduces a logit model to complement some limitations of the previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also examined the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model, and Kim and Kim (2001) utilize artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely using diverse models such as Random Forest or SVM. One major distinction of our research from previous work is that we focus on the predicted probability of default for each sample case, not only on the classification accuracy of each model over the entire sample. Most predictive models in this paper achieve a classification accuracy of about 70% on the entire sample: the LightGBM model shows the highest accuracy, 71.1%, and the logit model the lowest, 69%. However, these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type II error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy by splitting the predicted probability of default into ten equal intervals. Examined by interval, the logit model has the highest accuracy, 100%, for predicted default probabilities of 0-10%, but a relatively lower accuracy of 61.5% for predicted default probabilities of 90-100%.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN show more desirable results: they achieve higher accuracy for both the 0-10% and 90-100% intervals of predicted default probability, but lower accuracy around the 50% interval. Regarding the distribution of samples across predicted probabilities, both the LightGBM and XGBoost models place relatively large numbers of samples in the 0-10% and 90-100% intervals. Although the Random Forest model has an advantage in classification accuracy, it places few cases in those intervals; LightGBM or XGBoost could therefore be more desirable models, since they classify large numbers of cases into the two extreme intervals even allowing for their relatively low classification accuracy. Considering the importance of type II error and total prediction accuracy, XGBoost and DNN show superior performance, followed by Random Forest and LightGBM, while logistic regression performs worst. Nevertheless, each predictive model has a comparative advantage under particular evaluation standards; for instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, a more comprehensive ensemble that combines multiple classification models with majority voting could maximize overall performance.
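The interval-wise evaluation described above can be sketched generically: bin predicted default probabilities into ten equal intervals and compute classification accuracy within each bin. The toy data below are hypothetical, well-calibrated probabilities, not KSURE data, but they reproduce the qualitative pattern of high accuracy in the extreme bins and near-chance accuracy around 50%.

```python
import numpy as np

def accuracy_by_decile(p_pred, y_true, n_bins=10, threshold=0.5):
    """Classification accuracy within equal-width bins of predicted default
    probability: list of (bin_low, bin_high, n_cases, accuracy) tuples."""
    p_pred, y_true = np.asarray(p_pred), np.asarray(y_true)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (p_pred >= lo) & (p_pred < hi) if hi < 1.0 else (p_pred >= lo)
        if mask.sum() == 0:
            rows.append((lo, hi, 0, np.nan))
            continue
        pred_label = (p_pred[mask] >= threshold).astype(int)
        acc = (pred_label == y_true[mask]).mean()
        rows.append((lo, hi, int(mask.sum()), acc))
    return rows

rng = np.random.default_rng(3)
# Hypothetical toy data: uniformly spread, well-calibrated probabilities.
p = rng.random(5000)
y = (rng.random(5000) < p).astype(int)
rows = accuracy_by_decile(p, y)
for lo, hi, n, acc in rows:
    print(f"[{lo:.1f}, {hi:.1f}): n = {n:4d}, accuracy = {acc:.2f}")
```

Comparing such tables across models, together with how many cases each model pushes into the extreme bins, is the basis for the paper's argument that raw overall accuracy alone understates the differences between the models.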