• Title/Summary/Keyword: minimal training sample


Bayesian Testing for the Equality of K-Exponential Populations (K개 지수분포의 상등에 관한 베이지안 다중검정)

  • Moon, Kyoung-Ae; Kim, Dal-Ho
    • Journal of the Korean Data and Information Science Society / v.12 no.1 / pp.41-50 / 2001
  • We propose a Bayesian test for the equality of the means of K exponential populations. Specifically, we use the intrinsic Bayes factors suggested by Berger and Pericchi (1996, 1998), based on noninformative priors for the parameters, and we investigate the usefulness of the proposed Bayesian testing procedures via simulations.

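The entry above hinges on intrinsic Bayes factors, which make an improper-prior Bayes factor well defined by averaging a correction over minimal training samples (here, one observation per population). The sketch below is a minimal illustration of that idea for the K-exponential-means problem, assuming the reference prior π(λ) ∝ 1/λ and the arithmetic averaging variant; the function names are hypothetical and this is not the authors' exact formulation.

```python
import itertools
import numpy as np
from scipy.special import gammaln

def log_marginal_exponential(samples):
    """log marginal likelihood of one exponential sample under pi(lambda) ∝ 1/lambda:
    integral of lambda^(n-1) * exp(-lambda * S) dlambda = Gamma(n) / S^n."""
    n, s = len(samples), float(np.sum(samples))
    return gammaln(n) - n * np.log(s)

def log_bf10(groups):
    """log Bayes factor of H1 (separate means) vs H0 (common mean), improper-prior version."""
    pooled = np.concatenate(groups)
    return sum(log_marginal_exponential(g) for g in groups) - log_marginal_exponential(pooled)

def arithmetic_intrinsic_bf10(groups, max_training=2000, rng=None):
    """Arithmetic intrinsic Bayes factor: correct the improper-prior BF by averaging
    BF01 over minimal training samples (one observation drawn from each group)."""
    rng = rng or np.random.default_rng(0)
    combos = list(itertools.product(*groups))
    if len(combos) > max_training:                     # subsample when full enumeration is too large
        idx = rng.choice(len(combos), size=max_training, replace=False)
        combos = [combos[i] for i in idx]
    log_bf01_train = [-log_bf10([np.array([x]) for x in combo]) for combo in combos]
    # average BF01 over training samples on the log scale via log-sum-exp
    correction = np.logaddexp.reduce(log_bf01_train) - np.log(len(log_bf01_train))
    return log_bf10(groups) + correction

# toy usage: three exponential samples, one with a different mean
rng = np.random.default_rng(1)
groups = [rng.exponential(scale=s, size=20) for s in (1.0, 1.0, 2.5)]
print("log AIBF(H1 vs H0):", arithmetic_intrinsic_bf10(groups, rng=rng))
```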

Bayesian Testing for the Equality of Two Lognormal Populations with the fractional Bayes factor (부분 베이즈요인을 이용한 로그정규분포의 상등에 관한 베이지안검정)

  • Moon, Kyoung-Ae; Kim, Dal-Ho
    • Journal of the Korean Data and Information Science Society / v.12 no.1 / pp.51-59 / 2001
  • We propose a Bayesian test for the equality of two lognormal population means. Specifically, we use the fractional Bayes factor suggested by O'Hagan (1995), based on noninformative priors for the parameters. In order to investigate the usefulness of the proposed Bayesian testing procedure, we compare it with classical tests via both real data analysis and simulations.

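O'Hagan's fractional Bayes factor, named in the entry above, removes the arbitrary constants of improper priors by spending a fraction b of the likelihood as "training". The sketch below illustrates it under simplifying assumptions that differ from the paper's exact setting: a common variance on the log scale (so that equality of lognormal means reduces to equality of normal means of the log data), the reference prior π(μ₁, μ₂, σ) ∝ 1/σ, and b set to a minimal-sample fraction; the function names are hypothetical.

```python
import numpy as np
from scipy.special import gammaln

def log_fractional_marginal(log_groups, b):
    """log of  integral pi(theta) * L(theta)^b dtheta  for normal groups with separate
    means, a common variance, and the reference prior pi(mu_1,...,mu_k, sigma) ∝ 1/sigma."""
    k = len(log_groups)
    n = np.array([len(g) for g in log_groups], dtype=float)
    N = n.sum()
    S = sum(float(np.sum((g - g.mean()) ** 2)) for g in log_groups)   # pooled within-group SS
    return (-(N * b / 2) * np.log(2 * np.pi)
            + 0.5 * np.sum(np.log(2 * np.pi / (b * n)))               # from the mean integrals
            + np.log(0.5)
            + gammaln((N * b - k) / 2)
            - ((N * b - k) / 2) * np.log(b * S / 2))                  # from the sigma integral

def log_fbf01(x1, x2, b=None):
    """log fractional Bayes factor of H0: equal log-scale means vs H1: different means."""
    y1, y2 = np.log(np.asarray(x1)), np.log(np.asarray(x2))
    N = len(y1) + len(y2)
    b = b if b is not None else 3.0 / N          # assumed fraction: (minimal sample size) / N
    lm1 = log_fractional_marginal([y1, y2], 1.0) - log_fractional_marginal([y1, y2], b)
    pooled = [np.concatenate([y1, y2])]
    lm0 = log_fractional_marginal(pooled, 1.0) - log_fractional_marginal(pooled, b)
    return lm0 - lm1

# toy usage with two lognormal samples sharing the same mean
rng = np.random.default_rng(2)
x1 = rng.lognormal(mean=0.0, sigma=0.5, size=30)
x2 = rng.lognormal(mean=0.0, sigma=0.5, size=30)
print("log FBF(H0 vs H1):", log_fbf01(x1, x2))
```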

Comparison the Diagnostic Value of Dilatation and Curettage Versus Endometrial Biopsy by Pipelle - a Clinical Trial

  • Sanam, Moradan; Majid, Mir Mohammad Khani
    • Asian Pacific Journal of Cancer Prevention / v.16 no.12 / pp.4971-4975 / 2015
  • Background: Several methods have been presented for the evaluation of the endometrium in patients with abnormal uterine bleeding, including minimally invasive and invasive approaches such as diagnostic curettage and endometrial biopsy by Pipelle. Many studies have compared the two methods, diagnostic curettage and outpatient endometrial biopsy. This investigation compared sampling adequacy, endometrial histopathology, failure rates, duration, and costs between in-hospital diagnostic curettage and endometrial biopsy. Materials and Methods: This single-blind clinical trial was performed on 130 patients older than 35 years who were referred to Amir training hospital in 2013 for elective diagnostic curettage because of abnormal uterine bleeding. For all patients eligible for the study, an endometrial sample was taken by Pipelle without anesthesia or dilatation; diagnostic curettage was then performed with a sharp curette under general anesthesia. Sampling duration was recorded and both samples were sent to the same pathologist. The diagnostic values of the two methods in the diagnosis of normal endometrium, endometrial hyperplasia, and carcinoma were compared, as were the costs of the two methods. Data analysis was performed with SPSS (version 16.0) software; chi-square, Fisher, and Pearson tests were used, with P values less than 0.05 considered statistically significant. Results: The two methods agreed on 88% of samples for sampling adequacy and 94% for pathological results. A specificity of 100% and a sensitivity of 90% were recorded for the detection of proliferative endometrium, secretory endometrium, and simple hyperplasia without atypia, and a sensitivity of 100% for cancer. The diagnostic accuracy of Pipelle compared with curettage was over 97%, and the failure rate in this study was below 5%. The sensitivity of Pipelle for the detection of atrophic endometrium was below 50%. Duration and cost were lower for Pipelle than for curettage. Conclusions: Given the high agreement coefficient between curettage and Pipelle for sampling adequacy and histopathology findings (except atrophic endometrium), together with the low failure rate, short sampling duration, and low cost, Pipelle can be introduced as a suitable alternative to diagnostic curettage.
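The diagnostic values reported above (sensitivity, specificity, percent agreement, and an agreement coefficient) come from cross-tabulating the paired Pipelle and curettage results. The sketch below shows how such a 2x2 table is summarized; the labels and counts are made up for illustration, not the study's data.

```python
import numpy as np

def diagnostic_summary(pipelle, curettage, positive="hyperplasia"):
    """Sensitivity, specificity, raw agreement, and Cohen's kappa of Pipelle results
    against curettage taken as the reference standard (illustrative labels only)."""
    p = np.asarray(pipelle) == positive
    c = np.asarray(curettage) == positive
    tp, tn = np.sum(p & c), np.sum(~p & ~c)
    fp, fn = np.sum(p & ~c), np.sum(~p & c)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    po = (tp + tn) / len(p)                                               # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / len(p) ** 2    # chance agreement
    kappa = (po - pe) / (1 - pe)
    return sens, spec, po, kappa

# toy paired readings for 100 patients: Pipelle misses 2 of 20 hyperplasias
curettage = ["hyperplasia"] * 20 + ["normal"] * 80
pipelle   = ["hyperplasia"] * 18 + ["normal"] * 2 + ["normal"] * 80
print(diagnostic_summary(pipelle, curettage))
```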

Recognition of Superimposed Patterns with Selective Attention based on SVM (SVM기반의 선택적 주의집중을 이용한 중첩 패턴 인식)

  • Bae, Kyu-Chan; Park, Hyung-Min; Oh, Sang-Hoon; Choi, Youg-Sun; Lee, Soo-Young
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.5 s.305 / pp.123-136 / 2005
  • We propose a recognition system for superimposed patterns based on a selective attention model and an SVM, which produces better performance than an artificial neural network. The proposed selective attention model places an attention layer in front of the SVM; it modifies the SVM's inputs and behaves as a selective filter. The philosophy behind the selective attention model is to find a stopping criterion for attention training and to define a confidence measure for the selective attention's outcome. Since support vectors represent the surrounding sample vectors, the support vector closest to the initial input vector under consideration is chosen, and the minimal Euclidean distance between the attention-modified input vector and the chosen support vector defines the stopping criterion. It is difficult to define a confidence measure for selective attention with the common selective attention model. A new way of defining the confidence measure can be set under the constraint that each modified input pixel does not cross the boundary of the original input pixel, so that the range of applicable information is increased. This method uses the following information: the Euclidean distance between the input pattern and the modified pattern, the output of the SVM, and the output of the support vector (hidden neuron) closest to the initial input pattern. For the recognition experiments, 45 different combinations of USPS digit data are used. Better recognition performance is obtained when selective attention is applied together with the SVM than with the SVM alone, and the proposed selective attention also outperforms the common selective attention model.
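The abstract above describes modifying the SVM's input with an attention step that uses the nearest support vector, a per-pixel bound on how far the input may move, and a distance-based stopping rule. The sketch below is a loose re-creation of that idea on scikit-learn's bundled digits data (not the USPS set used in the paper); the attend function, the box/step parameters, and the simple "move toward the nearest support vector" update are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np
from sklearn import datasets, svm

def attend(clf, x, box=0.15, step=0.05, max_iter=100):
    """Nudge the input toward its nearest support vector while keeping every modified
    feature inside a small box around its original value (the per-pixel constraint);
    stop when the distance to the support vector stops shrinking."""
    sv = clf.support_vectors_
    target = sv[np.argmin(np.linalg.norm(sv - x, axis=1))]   # closest support vector
    x_mod, prev = x.copy(), np.inf
    for _ in range(max_iter):
        x_mod = x_mod + step * (target - x_mod)              # move toward the support vector
        x_mod = np.clip(x_mod, x - box, x + box)             # stay near the original pixels
        d = np.linalg.norm(x_mod - target)
        if d >= prev - 1e-9:                                 # stopping criterion: no progress
            break
        prev = d
    confidence = np.linalg.norm(x_mod - x)                   # how far attention had to move
    return x_mod, confidence

digits = datasets.load_digits()
X = digits.data / 16.0                                       # scale pixel values to [0, 1]
clf = svm.SVC(kernel="rbf", gamma=0.02).fit(X[:1000], digits.target[:1000])
x_mod, conf = attend(clf, X[1500])
print("prediction before:", clf.predict([X[1500]])[0],
      "after attention:", clf.predict([x_mod])[0], "shift:", round(conf, 3))
```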

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. The statistical techniques traditionally used in bond rating include multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories. It is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound on the generalization error. In addition, the solution of SVM may be a global optimum, so overfitting is unlikely to occur. SVM also does not require many data samples for training, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. First, SVM was originally proposed for binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce the computation time of multi-class classification, but they may deteriorate classification performance. Third, a difficulty in multi-class prediction problems is the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of the skewed boundary, and thus reduce classification accuracy. SVM ensemble learning is one machine learning approach for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms, and AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict the examples for which the current ensemble's performance is poor; in this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, it can perform the learning process considering the geometric mean-based accuracy and errors over the classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine sets; that is, the cross-validated folds are tested independently for each algorithm. Through these steps, results are obtained for each classifier on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of the classifiers over the 30 folds differs significantly; the results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
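MGM-Boost, as described above, folds a geometric mean-based accuracy into AdaBoost so that a class with near-zero recall cannot hide behind a high arithmetic accuracy. The snippet below sketches only that evaluation metric (not the boosting algorithm itself); the function name and the toy labels are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def geometric_mean_accuracy(y_true, y_pred, labels=None):
    """Geometric mean of per-class recalls: a class with near-zero recall drags the
    whole score toward zero, unlike the arithmetic mean over all instances."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    recalls = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

# toy usage: a classifier that ignores the minority class looks fine on arithmetic
# accuracy but scores zero on the geometric mean
y_true = np.array([0] * 90 + [1] * 90 + [2] * 20)
y_pred = np.array([0] * 90 + [1] * 90 + [0] * 20)   # every minority-class instance missed
print("arithmetic accuracy:", (y_true == y_pred).mean())
print("geometric mean accuracy:", geometric_mean_accuracy(y_true, y_pred))
```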