• Title/Summary/Keyword: statistical error (통계 오류)

Note on the Equality of Variances in Two Sample t-Test (두 집단 평균 차이 검정에서 분산의 동질성에 관한 소고)

  • Kim, Sang-Cheol;Lim, Jo-Han
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.1
    • /
    • pp.79-88
    • /
    • 2010
  • Introductory statistics courses present two tests for the equality of two population means, chosen according to whether the population variances are homogeneous. In practice, however, the variances are also unknown, and practitioners often test their homogeneity before running the two-sample t-test. The same is true of many popular statistical packages such as SAS and SPSS. In this paper, we study the type I error of this two-stage procedure and propose a procedure that controls it at a given significance level.
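
The two tests mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook pooled-variance (Student) statistic for equal variances and the Welch statistic with Welch-Satterthwaite degrees of freedom for unequal variances. The function names `pooled_t` and `welch_t` are hypothetical, and the sketch does not reproduce the paper's two-stage procedure or its proposed correction.

```python
import math
import statistics


def pooled_t(x, y):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    diff = statistics.mean(x) - statistics.mean(y)
    t = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom (unequal variances)."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    diff = statistics.mean(x) - statistics.mean(y)
    t = diff / math.sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 6, 8, 10]
    print(pooled_t(x, y))
    print(welch_t(x, y))
```

With equal sample sizes the two statistics coincide and only the degrees of freedom differ. A two-stage procedure would choose between the two functions based on a preliminary test of variance homogeneity.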

Labeled Statistical Korean Dependency Parsing with Global and Local Information (전역 및 지역 정보를 이용한 SVM 기반 한국어 문장 구조 및 격 레이블 분석)

  • Lim, Soojong;Lee, Changki;Jang, Myung-Gil;Ra, DongRyul
    • Annual Conference on Human and Language Technology
    • /
    • 2009.10a
    • /
    • pp.207-212
    • /
    • 2009
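  • We propose a method for analyzing the structure and case labels of Korean sentences, based on global and local statistical information models obtained with an SVM model. The proposed method uses local dependency information during partial parsing with a backward beam search algorithm, and then applies the global information model to the resulting candidate sentence structures to determine the optimal sentence structure and case labels. The proposed method minimizes the errors that can arise when only one of the local or global models is used. In experiments on the Korean dependency parsing corpus of the Knowledge DB project, the method achieved 79.1% accuracy in sentence structure and case label analysis, which is 1.2% and 3.3% higher than using only global or only local information, respectively, and it ran about 76 times or more faster than using global information alone. As future work, we expect further performance gains from subdividing the statistical information into units such as heads and phrase chunks.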

An Automatic Korean Word Spacing System for Devices with Low Computing Power (저사양 기기를 위한 한국어 자동 띄어쓰기 시스템)

  • Song, Yeong-Kil;Kim, Hark-Soo
    • The KIPS Transactions:PartB
    • /
    • v.16B no.4
    • /
    • pp.333-340
    • /
    • 2009
  • Most previous automatic word spacing systems are unsuitable for mobile devices with relatively low computing power because they require many system resources. We propose an automatic word spacing system for such devices that needs only reasonable memory usage and simple numerical computations. The proposed system is a two-step model that consists of a statistical system and a rule-based system. To reduce memory usage, the statistical system first corrects word spacing errors using a modified hidden Markov model based on character unigrams. Then, to increase accuracy, the rule-based system re-corrects miscorrected word spaces using lexical rules based on character bigrams or longer n-grams. In the experiments, the proposed system showed a relatively high accuracy of 94.14% despite a small memory usage of about 1MB.

Selection of the economically optimal parameters in the EWMA control chart (지수가중이동평균관리도의 경제적 최적모수의 선정)

  • 박창순;원태연
    • The Korean Journal of Applied Statistics
    • /
    • v.9 no.1
    • /
    • pp.91-109
    • /
    • 1996
  • The exponentially weighted moving average (EWMA) control chart has recently been used widely for process monitoring and process adjustment, but there have not been many studies on the selection of its parameters. The design of a control chart can be classified into the statistical design and the economic design. The purpose of the economic design is to minimize the cost function in which all the possible costs occurring during the process are probability given the Type I error probability. In this paper, the optimal parameters of the EWMA chart are selected for the economic design as well as for the statistical design. The optimal parameters for the economic design differ significantly from those of the statistical design, and in particular the weight is always larger than that used in the statistical design. In the economic design, we divide the model into the single assignable cause model and the multiple assignable causes model according to number of which is used as the average context of the multiple assignable causes, it shows that the selection of the parameters may be misleading when the multiple assignable causes exist in practice.
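
As a rough illustration of the chart whose parameters this abstract discusses, the sketch below computes the standard EWMA statistic and its time-varying control limits. The symbols `lam` (the weight), `L` (the control-limit width), `mu0`, and `sigma` follow common textbook notation and are assumptions, not the paper's notation. The sketch does not implement the paper's cost function or its economic design.

```python
import math


def ewma(xs, lam, start):
    """EWMA statistics z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = start."""
    z = start
    out = []
    for x in xs:
        z = lam * x + (1 - lam) * z
        out.append(z)
    return out


def ewma_limits(t, lam, L, mu0, sigma):
    """Lower and upper control limits at time t (t = 1, 2, ...)."""
    # The variance of z_t grows with t toward sigma^2 * lam / (2 - lam).
    half = L * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
    return mu0 - half, mu0 + half


if __name__ == "__main__":
    data = [0.3, -0.1, 0.8, 1.2, 1.5]
    zs = ewma(data, lam=0.2, start=0.0)
    for t, z in enumerate(zs, start=1):
        lo, hi = ewma_limits(t, lam=0.2, L=3.0, mu0=0.0, sigma=1.0)
        print(t, round(z, 4), lo <= z <= hi)
```

Setting `lam` to 1 removes the smoothing and reduces the chart to a Shewhart chart with limits `mu0 ± L * sigma`, which is one way to see why the choice of weight matters for the design.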

An Implementation of a Lightweight Spacing-Error Correction System for Korean (한국어 경량형 띄어쓰기 교정 시스템의 구현)

  • Song, Yeong-Kil;Kim, Hark-Soo
    • The Journal of Korean Association of Computer Education
    • /
    • v.12 no.2
    • /
    • pp.87-96
    • /
    • 2009
  • We propose a Korean spacing-error correction system that requires little memory even though it combines rule-based and statistical methods. In addition, to make the proposed model robust to mobile colloquial sentences, in which spelling errors and omissions of function words occur frequently, we propose a method that automatically transforms a typical colloquial corpus into a mobile colloquial corpus. The proposed system uses statistical information on syllable unigrams to increase coverage of new syllable patterns. It then applies error correction rules over n-grams of two or more syllables to increase accuracy. In experiments on fake mobile colloquial sentences, the proposed system showed a relatively high accuracy of 92.10% (93.80% on a typical colloquial corpus, 94.07% on a typical balanced corpus) in spite of a small memory usage of about 1MB.

The Effects of the Prescribed Instructional Strategy for Reducing Students' Connecting Errors in Learning Chemistry Concepts with Multiple External Representations (다중 표상을 활용한 화학 개념 학습에서 학생들의 연계 오류 감소를 위한 처방적인 교수 전략의 효과)

  • Kang, Hun-Sik;Kim, You-Jung;Noh, Tae-Hee
    • Journal of The Korean Association For Science Education
    • /
    • v.28 no.6
    • /
    • pp.675-684
    • /
    • 2008
  • This study investigated the effects of the prescribed instructional strategy for reducing students' connecting errors in learning chemistry concepts with multiple external representations by students' field independence-dependence. Seventh graders (N=126) at a coed middle school were assigned to control and treatment groups. The students learned "Boyle's Law" and "Charles's Law" for two class periods. Results revealed that the students in the treatment group scored significantly higher than those in the control group in a conception test. The scores of the treatment group were significantly higher than those of the control group in a motivational learning test, especially in 'attention' of the test. However, there was no significant interaction between the instruction and students' field independence-dependence in the two tests. Most students in the treatment group perceived the instruction positively in cognitive and motivational aspects.

Comparative analysis of official demographics (공식인구통계들에 대한 비교 분석)

  • 김종태
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.1
    • /
    • pp.99-108
    • /
    • 2017
  • There are three official demographics of the Republic of Korea: the population census, population projections, and resident population. Among these, the population projections are estimated from population census statistics, and the census is conducted every five years. This study compared and analyzed the future population statistics and the resident population statistics. In order to detect errors in the census process, we surveyed the outliers in the demographic data. Based on these, we aimed to verify the reliability of the official demographics. The resident registration demographics tended to increase as age increased from 0 to 12 years, although a population should decrease as age increases. In the population projections, the population increased as age rose from 18 to 28, as if a new population had appeared. Also, between 2009 and 2010 in the resident population, and between 2010 and 2011 in the population projections, there was an anomaly in which the population at all ages grew with age, as if a new population had been added. Both official demographics need more accurate verification. Increasing the reliability of the elderly population statistics will make the establishment of administrative policies more efficient.

Effect of abutment superimposition process of dental model scanner on final virtual model (치과용 모형 스캐너의 지대치 중첩 과정이 최종 가상 모형에 미치는 영향)

  • Yu, Beom-Young;Son, Keunbada;Lee, Kyu-Bok
    • The Journal of Korean Academy of Prosthodontics
    • /
    • v.57 no.3
    • /
    • pp.203-210
    • /
    • 2019
  • Purpose: The purpose of this study was to verify the effect of the abutment superimposition process on the final virtual model when scanning single and 3-unit bridge models with a dental model scanner. Materials and methods: A gypsum model for single and 3-unit bridges was manufactured for evaluation, and working casts with removable dies were made using the Pindex system. A dental model scanner (3Shape E1 scanner) was used to obtain the CAD reference model (CRM) and the CAD test model (CTM). The CRM was scanned with the divided abutments left in place in the working cast. The CTM was then scanned with the divided abutments separated and was superimposed on the CRM (n=20). Finally, three-dimensional analysis software (Geomagic control X) was used to analyze the root mean square (RMS), and the Mann-Whitney U test was used for statistical analysis ($\alpha = .05$). Results: The mean RMS was $10.93\,\mu\mathrm{m}$ for the single full crown preparation abutment and $6.9\,\mu\mathrm{m}$ for the 3-unit bridge preparation abutments. The difference between the two groups was statistically significant (P<.001). In addition, the mean positive and negative errors were $9.83\,\mu\mathrm{m}$ and $-6.79\,\mu\mathrm{m}$ for the single abutment and $6.22\,\mu\mathrm{m}$ and $-3.3\,\mu\mathrm{m}$ for the 3-unit bridge abutments, respectively. Both the mean positive and the mean negative errors were statistically significantly lower in the 3-unit bridge abutments (P<.001). Conclusion: Although the number of abutments increased during the scan process of the working cast with removable dies, the error due to the superimposition of abutments did not increase. The error was significantly higher in single abutments, but it remained within the range of clinically acceptable scan accuracy.
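
The error measures reported in this abstract can be sketched generically as follows. The sketch assumes the usual definition of RMS over signed point-to-surface deviations and takes the positive and negative errors to be the means of the positive and negative deviations. How the analysis software computes these values internally is not stated in the abstract, and the function names are hypothetical.

```python
import math


def rms(deviations):
    """Root mean square of signed deviations (for example, in micrometers)."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def mean_signed_errors(deviations):
    """Mean of the positive deviations and mean of the negative deviations."""
    pos = [d for d in deviations if d > 0]
    neg = [d for d in deviations if d < 0]
    mean_pos = sum(pos) / len(pos) if pos else 0.0
    mean_neg = sum(neg) / len(neg) if neg else 0.0
    return mean_pos, mean_neg


if __name__ == "__main__":
    # Made-up deviations for illustration only, not data from the study.
    sample = [3.0, -4.0, 1.0]
    print(rms(sample))
    print(mean_signed_errors(sample))
```

Because RMS squares each deviation, positive and negative deviations cannot cancel, which is why it is reported alongside the signed means.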

Analysis of the Statistical Methods used in Scientific Research published in The Korean Journal of Culinary Research (한국조리학회지에 게재된 학술적 연구의 통계적 기법 분석)

  • Rha, Young-Ah;Na, Tae-Kyun
    • Culinary science and hospitality research
    • /
    • v.21 no.6
    • /
    • pp.49-62
    • /
    • 2015
  • Given that statistical analysis is an essential component of foodservice-related research, the purpose of this review is to analyze trends in the statistical methods applied to foodservice-related research. To achieve this objective, this study carried out a content analysis of 251 of the 415 research articles published in The Korean Journal of Culinary Research (TKJCR) from January 2010 to December 2013. A total of 164 research articles, covering natural science research, qualitative research, and articles written in English, were excluded from the scope of this study. The results are as follows. First, 269 research articles applied quantitative research methods and only 10 applied qualitative research methods among the 279 research articles based on social science research methods. Second, 20 articles (8.0%) among the 251 did not specify the statistical methods or computer programs used for statistical analysis. Third, 228 articles (90.8%) used the SPSS program for data analysis. Fourth, in terms of frequency of use, frequency analysis was used most, followed in order by reliability analysis, exploratory factor analysis, correlation analysis, regression analysis, structural equation modeling, confirmatory factor analysis, t-test, variance analysis, and cross-tabulation analysis. However, 3 of the 56 research articles that used a t-test did not report a t-value, and 10 of the 64 articles that used ANOVA and found a significant difference in between-group means did not conduct a post-hoc test. Therefore, researchers interested in foodservice fields need to keep in mind that choosing and applying the correct statistical technique determines both the value and the success or failure of a study. To enhance the value and success of a study, the proper statistical technique must be used efficiently in order to prevent statistical errors.

Statistical methods for testing tumor heterogeneity (종양 이질성을 검정을 위한 통계적 방법론 연구)

  • Lee, Dong Neuck;Lim, Changwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.3
    • /
    • pp.331-348
    • /
    • 2019
  • Understanding tumor heterogeneity, which arises from differences in the growth pattern and rate of change of metastatic tumors, is important for understanding the sensitivity of tumor cells to drugs and for finding appropriate therapies. When the group membership of the N samples is known, differences in population means can often be tested with a t-test or ANOVA. However, these statistical methods cannot be used when the groups are not distinguished, as is the case for the data covered in this paper. Statistical methods have been studied to test heterogeneity between samples, and the minimum combination t-test method is one of them. In this paper, we propose a maximum combination t-test method that takes into account combinations that bisect the data at different ratios. We also propose a method based on the idea that examining the heterogeneity of a sample is equivalent to testing whether the optimal number of clusters in a cluster analysis is one. Based on a simulation study, we verified that the proposed methods, the maximum combination t-test method and the gap statistic, have better type I error and power than the previously proposed method, and we obtained results from a real data analysis.