• Title/Summary/Keyword: Traditional Statistical

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions. These include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. Such assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and support vector machines (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Moreover, SVM does not require many data samples for training, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well in multi-class classification as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, multi-class prediction problems suffer from the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in the other classes. Such data sets often cause a default classifier to be built due to a skewed boundary, which reduces the classification accuracy of the classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on the misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds are tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
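
The gap between the arithmetic and geometric mean-based accuracies reported above can be illustrated with a minimal Python sketch. It assumes that geometric mean-based accuracy means the geometric mean of per-class recalls, which the abstract does not spell out; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's instances predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def arithmetic_accuracy(y_true, y_pred):
    """Overall fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1 / len(recalls))


# Hypothetical imbalanced 3-class data: 8 x "A", 1 x "B", 1 x "C".
y_true = ["A"] * 8 + ["B", "C"]
majority_only = ["A"] * 10                        # always predicts the majority class
minority_aware = ["A"] * 6 + ["B", "C", "B", "C"]  # misses two "A", finds "B" and "C"

for name, y_pred in [("majority_only", majority_only),
                     ("minority_aware", minority_aware)]:
    print(name,
          round(arithmetic_accuracy(y_true, y_pred), 4),
          round(geometric_mean_accuracy(y_true, y_pred), 4))
```

In this toy data both prediction sets reach the same arithmetic accuracy, but the majority-only predictor scores zero on the geometric mean because it never recognizes the minority classes. This is the kind of imbalance effect a geometric mean-based criterion is meant to expose.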

An Empirical Analysis on the Persistent Usage Intention of Chinese Personal Cloud Service (개인용 클라우드 서비스에 대한 중국 사용자의 지속적 사용의도에 관한 실증 연구)

  • Yu, Hexin;Sura, Suaini;Ahn, Jong-chang
    • Journal of Internet Computing and Services
    • /
    • v.16 no.3
    • /
    • pp.79-93
    • /
    • 2015
  • With the rapid development of information technology, the ways in which it is used have changed drastically. The methods and efficiency of traditional service applications for data processing can no longer satisfy the requirements of modern users. Users now understand the importance of data. Therefore, the processing and storage of big data have become a main research focus of Internet service companies. In China, the rise and rapid growth of 115 Cloud has led other technology companies to join the battle for the cloud services market. Although Chinese cloud services are currently still dominated by cloud storage, the range of services built on cloud storage has been well received by users, and users are willing to try these new kinds of services. Thus, how to keep users using cloud services has become a topic worth exploring. Academic studies often use the TAM model with statistical analysis to analyze users' attitudes toward using a system. However, the basic TAM model can no longer accommodate the increasing scale of systems. Therefore, appropriate expansion and adjustment of the TAM model (i.e., TAM2 or TAM3) is necessary. Drawing on the situation of Chinese Internet users and related research in other areas, this study expands and improves the TAM model by adding brand influence, hardware environment, and external environments. Based on the research model, questionnaires were developed and an online survey was conducted targeting cloud service users in four major Chinese cities. Data obtained from 210 respondents were used to validate the research model. The analysis results show that the external factors of service contents and brand influence have a positive influence on perceived usefulness and perceived ease of use. 
However, the external factor of hardware environment has a positive influence only on perceived ease of use. Furthermore, perceived security, which is influenced by brand influence, has a positive influence on persistent intention to use. Persistent intention to use was also influenced by perceived usefulness and by perceived ease of use. Finally, this research analyzed the attributes of the external variables from another perspective and tried to explain them. The results suggest that Chinese cloud service users are more interested in fundamental cloud services than in extended services. In personal cloud services, both an increased user base and cooperation among companies are important. This study offers useful suggestions for strengthening users' attitudes so that they continue to use personal cloud services. Overall, considering all three external factors could make Chinese users keep using personal cloud services. In addition, the results of this study can serve as a strong reference for technology companies, including cloud service providers, Internet service providers, and smartphone service providers, whose main clients are Chinese users.

A Review of Multivariate Analysis Studies Applied for Plant Morphology in Korea (국내 식물 형태 연구에 사용된 다변량분석 논문에 대한 재고)

  • Chang, Kae Sun;Oh, Hana;Kim, Hui;Lee, Heung Soo;Chang, Chin-Sung
    • Journal of Korean Society of Forest Science
    • /
    • v.98 no.3
    • /
    • pp.215-224
    • /
    • 2009
  • This paper reviews the role of traditional morphometrics in plant morphological studies, using 54 studies published from 1997 to 2008 in three major journals and others in Korea, such as Journal of Korean Forestry Society, Korean Journal of Plant Taxonomy, Korean Journal of Breeding, Korean Journal of Apiculture, Journal of Life Science, and Korean Journal of Plant Resources. The two most commonly used techniques of data analysis, cluster analysis (CA) and principal components analysis (PCA), are discussed along with other statistical tests. A common problem with PCA lies in the underlying assumptions of the method, such as random sampling and a multivariate normal distribution of the data. The procedure was intended mainly for continuous data and was not efficient for data that were not well summarized by variances or covariances. Likewise, CA was most appropriate for categorical rather than continuous data. Also, CA produced clusters whether or not natural groupings existed, and the results depended on both the similarity measure chosen and the algorithm used for clustering. Additional problems with PCA and CA arose with mixed qualitative and quantitative data, with a limited number of variables, and/or with too few samples. Some of these problems may be avoided if a sufficient number of variables (more than 20) and sufficient samples (at least 40-50) are considered for morphometric analyses, but we do not think that the methods are almighty tools for data analysts. Instead, we believe that reasonable applications, combined with a focus on the objectives and limitations of each procedure, would be a step forward.
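
The remark that CA returns clusters whether or not natural groupings exist can be shown with a small Python sketch. It uses plain 2-means on one-dimensional data as a stand-in for whichever clustering algorithms the reviewed studies used; the function name and the data are hypothetical.

```python
def two_means_1d(values, iters=20):
    """Plain 2-means on 1-D data; returns (low_cluster, high_cluster).

    Assumes at least two distinct values so that neither cluster is empty.
    """
    c_low, c_high = min(values), max(values)  # start from the two extremes
    for _ in range(iters):
        # Assign each value to the nearer center (ties go to the low cluster).
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        # Move each center to the mean of its cluster.
        c_low, c_high = sum(low) / len(low), sum(high) / len(high)
    return low, high


# Evenly spaced measurements: there is no gap, hence no natural grouping.
evenly_spaced = [float(i) for i in range(10)]
low, high = two_means_1d(evenly_spaced)
print("low cluster: ", low)
print("high cluster:", high)
```

The algorithm still splits the evenly spaced values into two tidy halves, so the existence of a partition is not evidence that the groups are real. Checking the result against a different similarity measure or algorithm, as the review suggests, is one way to guard against reading structure into such output.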

Comparison of Brain Activation Images Associated with Sexual Arousal Induced by Visual Stimulation and SP6 Acupuncture : fMRI at 3 Tesla (시각자극과 삼음교 자침으로 유발된 성적 흥분의 대뇌 활성화 영상의 비교 : 3 테슬라 기능적 자기공명영상법)

  • Choi, Nam-Gil;Han, Jae-Bok;Jang, Seong-Joo
    • Journal of radiological science and technology
    • /
    • v.32 no.2
    • /
    • pp.183-194
    • /
    • 2009
  • Purpose : This study was performed to compare the brain activation regions associated with sexual arousal induced by visual stimulation and by SP6 acupuncture, and to evaluate their differential neuro-anatomical mechanisms in healthy women using functional magnetic resonance imaging (fMRI) at 3 Tesla (T). Subjects and methods : A total of 21 healthy right-handed female volunteers (mean age 22 years, range 19 to 32) underwent fMRI on a 3T MR scanner. The stimulation paradigm for sexual arousal consisted of two alternating periods of rest and activation. It began with a 1-minute rest period, followed by 3 minutes of stimulation with either an erotic video film or SP6 acupuncture, and then a 1-minute rest. In addition, a comparative study of the brain activation patterns between an acupoint and a sham point near GB37 was performed. The fMRI data were obtained from 20 slices parallel to the AC-PC line on an axial plane, giving a total of 2,000 images. The mean activation maps were constructed and analyzed using the statistical parametric mapping (SPM99) software. Results : Compared with the sham point, the acupoint showed 5 times and 2 times higher activity in the neocortex and limbic system, respectively. Notably, brain activation in response to stimulation at the sham point was not observed in regions including the HTHL in the diencephalon, the GLO and AMYG in the basal ganglia, and the SMG in the parietal lobe. In the comparative study of visual stimulation vs. SP6 acupuncture, the mean activation ratios of the two stimuli were not significantly different from each other in either the neocortex or the limbic system (p < 0.05). The mean activities induced by the two stimuli were not significantly different in the neocortex, whereas acupuncture stimulation showed higher activity in the limbic system (p < 0.05). 
Conclusions : This study used 3T fMRI to compare the differential brain activation patterns and neural mechanisms of sexual arousal induced by visual stimulation and by SP6 acupuncture. These findings will be useful for understanding the theory of traditional acupuncture and acupoint channels from a scientific point of view.

Establishment of Miniaturized Cultivation Method for Large and Rapid Screening of High-yielding Monascus Mutants, and Enhanced Production of Monacolin-K through Statistical Optimization of Production Medium (Monascus 균사체의 소규모 배양을 통한 고생산성 균주의 대규모 선별방법 확립과 통계적 생산배지 최적화를 통한 Monacolin-K 생산성 향상)

  • Lee, Mi-Jin;Jeong, Yong-Seob;Kim, Pyeung-Hyeun;Chun, Gie-Taek
    • KSBB Journal
    • /
    • v.22 no.5
    • /
    • pp.305-312
    • /
    • 2007
  • It is crucial to develop a miniaturized cultivation method for large-scale, rapid screening of high-yielding mutants producing monacolin-K, a powerful anti-hypercholesterolemic secondary metabolite biosynthesized by the fungal cells of Monascus ruber. In order to investigate as many strains as possible in a short time, a miniaturized fermentation method especially suitable for the cultivation of filamentous Monascus mutants was developed using 50 mL culture tubes (7 mL working volume) instead of traditional 250 mL flasks (50 mL working volume). Generally, in filamentous fungal cell fermentations, morphologies in growth and production cultures should be maintained as thick filamentous and compact-pelleted (usually less than 1 mm in diameter) forms, respectively, for enhanced production of secondary metabolites in the final production cultures. In this study, we intended to induce the respective optimal morphologies in the miniaturized culture system for the purpose of rapid screening of overproducers. A miniaturized growth culture system was successfully developed thanks to the mass production of spores on the statistically optimized solid medium. When large amounts of spores were inoculated into the growth cultures and brown rice flour (20 g/L) was added to the growth medium, dense filamentous morphologies were successfully induced in growth cultures performed in the 50 mL culture tubes. This implies that the amount of spores inoculated into the growth tube cultures and the growth medium components are the key factors for inducing filamentous forms in the growth fermentations. Furthermore, in order to statistically optimize the production medium, multiple experiments based on the Plackett-Burman design and response surface method (RSM) were carried out, resulting in a more than 2-fold enhancement of monacolin-K production in the final production cultures with the optimized production medium. 
Notably, under the production culture conditions with the statistically optimized medium, optimal pellet sizes below 1 mm in diameter were reproducibly induced, in contrast to the thick and viscous filamentous morphologies observed in the previous production cultures.
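
As an illustration of the screening designs mentioned above, the following Python sketch builds the standard 12-run Plackett-Burman design (up to 11 two-level factors) by cyclically shifting a generator row and appending a row of low levels. The abstract does not state which design size or factors the authors used, so the 12-run size and the function names are assumptions.

```python
# Standard generator row of the 12-run Plackett-Burman design
# (+1 = high level, -1 = low level of a factor).
GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12():
    """Return the 12-run design as a list of 12 rows with 11 factor levels each."""
    n = len(GENERATOR)
    # Rows 1-11: the generator row shifted cyclically by 0..10 positions.
    rows = [GENERATOR[n - s:] + GENERATOR[:n - s] for s in range(n)]
    # Row 12: every factor at its low level.
    rows.append([-1] * n)
    return rows


def column_dot(design, i, j):
    """Dot product of factor columns i and j; zero means they are orthogonal."""
    return sum(row[i] * row[j] for row in design)


design = plackett_burman_12()
for run in design:
    print(" ".join("+" if level > 0 else "-" for level in run))
```

Each column is one candidate medium component and each row is one experimental run. Because the columns are balanced and mutually orthogonal, the main effect of a component can be estimated as the difference between the mean response at its high level and at its low level, which is how a few influential components are typically picked out before a response surface study.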

Mass Screening of Lovastatin High-yielding Mutants through Statistical Optimization of Sporulation Medium and Application of Miniaturized Fungal Cell Cultures (Lovastatin 고생산성 변이주의 신속 선별을 위해 통계적 방법을 적용한 Sporulation 배지 개발 및 Miniature 배양 방법 개발)

  • Ahn, Hyun-Jung;Jeong, Yong-Seob;Kim, Pyeung-Hyeun;Chun, Gie-Taek
    • KSBB Journal
    • /
    • v.22 no.5
    • /
    • pp.297-304
    • /
    • 2007
  • For large-scale, rapid screening of high-yielding mutants producing lovastatin, which is produced by filamentous fungal cells of Aspergillus terreus, one of the most important steps is to test as many mutated strains as possible. For this purpose, we intended to develop a miniaturized cultivation method using 7 mL culture tubes instead of traditional 250 mL flasks (working volume 50 mL). To obtain large amounts of conidiospores to be used as inocula for the miniaturized cultures, four components (glucose, sucrose, yeast extract, and $KH_2PO_4$), which had been observed in Plackett-Burman design experiments to have a positive effect on spore production, were intensively investigated. When the optimum concentrations of these components, determined by applying the response surface method (RSM) based on a central composite design (CCD), were used, a maximum of $1.9\times10^{10}$ spores/plate was obtained, an approximately 190-fold increase compared to the commonly used PDA sporulation medium. Using the miniaturized cultures, intensive strain development programs were carried out to screen for lovastatin high-yielding and highly reproducible mutants. It was observed that, for maximum production of lovastatin, the producers should be activated through the 'PaB' adaptation process during the early solid culture stage. In addition, they should be proliferated in condensed filamentous forms in the miniaturized growth cultures, so that optimum amounts of highly active cells can be transferred to the production culture tubes as reproducible inocula. Under these highly controlled fermentation conditions, compact-pelleted morphology of optimum size (less than 1 mm in diameter) was successfully induced in the miniaturized production cultures, which proved essential for maximal utilization of the producers' physiology, leading to significantly enhanced production of lovastatin. 
As a result of continuous screening in the miniaturized cultures, the lovastatin production levels of 81% of the daughter cells derived from the high-yielding producers fell within 80% to 120% of the lovastatin production level of the parallel flask cultures. These results demonstrate that the miniaturized cultivation method developed in this study is an efficient high-throughput system for large-scale, rapid screening of highly stable and productive strains.

Quality Characteristics of Pettitoes(Jokbal) added with Coffee Meal (커피박 첨가 돈족(豚足)의 품질특성)

  • Choi, Seok-Bong;An, Sang-Ran;Lee, Myung-Ho
    • Culinary science and hospitality research
    • /
    • v.22 no.2
    • /
    • pp.115-124
    • /
    • 2016
  • The purpose of this paper is to verify the improvement in basic quality of a food resource and to develop pork legs into a more advanced food product by processing them with a mixture of traditional medicinal herbs and hot-water-extracted coffee meal. The pH of the pettitoes (Jokbal) was highest in the control group, but there was no statistical difference in moisture content between the control and the pettitoes (Jokbal) processed with coffee waste extract as an additional component. In addition, the levels of crude fat and crude ash increased slightly as the amount added increased. Protein content, in contrast, tended to decrease slightly as the amount added increased, but the difference was not significant. The sodium content of the pettitoes (Jokbal) was higher in the groups with added extract than in the control group. Texture analysis showed a marked decrease in hardness and chewiness depending on the amount of coffee waste extract added. On the other hand, cohesiveness and springiness showed no significant difference from the control group. As for lightness, as the amount of added coffee waste extract increased from 10% to 20% and 30%, the 'L' value decreased significantly compared to that of the control. The 'a' value was not significantly different from that of the control. The 'b' value, however, increased significantly as the amount added increased, with the control at the lowest level. This result may come from the influence of the coffee waste extract, which affects the color of the pettitoes (Jokbal). According to the sensory evaluation, the pork with 10% coffee waste extract received the highest scores for appearance, chewiness, smell, and preference, indicating an improvement in the quality of the pettitoes (Jokbal).

Comparative study on effects of volume-controlled ventilation and pressure-limited ventilation for neonatal respiratory distress syndrome (신생아 호흡곤란 증후군에서 volume-controlled ventilation과 pressure-limited ventilation의 효과에 관한 비교연구)

  • Kim, Jae Jin;Hwang, Mun Jung;Lee, Sang Geel
    • Clinical and Experimental Pediatrics
    • /
    • v.53 no.1
    • /
    • pp.21-27
    • /
    • 2010
  • Purpose : In contrast with traditional time-cycled, pressure-limited ventilation, volume-controlled ventilation delivers a nearly constant tidal volume, which reduces volutrauma and episodes of hypoxemia. The aim of this study was to compare the efficacy of pressure-regulated volume-controlled ventilation (PRVC) with that of synchronized intermittent mandatory ventilation (SIMV) in very low birth weight (VLBW) infants with respiratory distress syndrome (RDS). Methods : Thirty-four VLBW infants who had RDS were randomized to receive either PRVC or SIMV with surfactant administration : PRVC group (n=14) and SIMV group (n=20). We compared peak inspiratory pressure (PIP), duration of mechanical ventilation, and complications associated with ventilation using medical records. Results : There were no statistical differences in clinical characteristics between the groups. After surfactant administration, PIP was significantly lower during PRVC ventilation for 48 hours, and the cumulative decrease in PIP was greater during PRVC ventilation for 24 hours (P<0.05). There was no significant difference in the duration of ventilation or the incidence of complications. Conclusion : PRVC is a mode that delivers the preset tidal volume with the lowest PIP required in VLBW infants with RDS, adapting to changes in lung compliance after surfactant replacement.

The Relationship Between Son Preference and Fertility (남아 선호와 출산력간의 관계)

  • 이성용
    • Korea journal of population studies
    • /
    • v.26 no.1
    • /
    • pp.31-57
    • /
    • 2003
  • This study is intended to examine (1) whether the value of sons that causes son preference in traditional society, for example old-age security and succession of the family lineage, can be explained at the individual level, and (2) whether women without a son in a country with son preference continue childbearing until they have at least one son or give up the desire for a son at a certain point. To accomplish these purposes, the 1974 Korean National Fertility Survey data are analyzed with quadratic hazard models controlling for unobserved heterogeneity. Unlike in ordinary regression models, in a hazard model even omitted variables that affect hazard rates and are uncorrelated with the included independent variables can distort the parameter estimates. Therefore, the nonparametric maximum likelihood estimator (NPMLE) of a mixing distribution developed by Heckman and Singer is used to control for unobserved heterogeneity. Based on the statistical results of this study, the value of sons that causes son preference is determined at the societal level, not at the individual level. Korean women without a son did not continue childbearing endlessly throughout their childbearing years until they had a son. In general, they gave up the desire for a son after bearing six daughters in a row. Thus, 30-40 years ago, the number of daughters at which women without a son gave up the desire for a son was six, which is about the level of the total fertility rate during the 1960s. These days, we often see women who have only two or three daughters and no son. This means that the point at which women give up the desire for a son, which is one indicator of the strength of son preference, has become lower. If son preference had not become much weaker, fertility rates in Korea could not have fallen below the replacement level.

The Research about the Classification System Improvement and Cord Development of Korean Classification of Disease on Oriental Internal Medicine (한국표준질병사인분류중 한방내과영역의 분류체계 개선 및 진단명 구성에 관한 연구)

  • Lee, Won-Chul
    • The Journal of Internal Korean Medicine
    • /
    • v.31 no.1
    • /
    • pp.1-10
    • /
    • 2010
  • Objectives : The International Classification of Diseases (ICD) needs to be examined in order to prepare the third revision of the Korean Classification of Disease on Oriental Medicine (KCD-OM) and the disease classification in the field of oriental internal medicine. It is essential that the selection, classification, and definition of disease names and pattern names for oriental concepts in internal medicine be clear. Since 2008, the fifth revision of the Korean Classification of Disease (KCD-5) has been used in Korea. The Oriental medicine field was required to use a reference classification based on the ICD-10. Methods : In this review, the necessity, meaning, and content of the third revision are briefly described. The ICD system was reviewed and KCD-OM was reconstructed. How diagnosis in the field of oriental internal medicine has changed is discussed. Review and Results : In 1973, the disease classification of oriental medicine was established on the basis of the contents of Dongeuibogam. It was independent of the ICD. In the classification system of the oriental internal medicine field, systemic disease comprised wind, cold, warm, wet, dryness, heat, spirit, ki, blood, phlegm and retained fluid, consumptive disease, etc. Diseases of internal medicine were organized according to the five viscera and the six internal organs, following the classification system of Dongeuibogam. The first and second revisions, in 1979 and 1995, used a classification system based on the curriculum. In the first revision in 1979, geriatric disease and idiopathic types of disease were deleted, and skin disease was included among surgical diseases. This classification was expanded to 792 small classification items and 1,535 detailed classification items across the dozen disease classes. In the second revision in 1995, it was adjusted to 644 small classes and 1,784 detailed classification items in the dozen disease classes. KCD-OM3 took the KCD as its basis. 
It added and organized the disease names used by oriental medical doctors, considering the special conditions in Korea. KCD-OM3 examined the second revised edition of KCD-OM (1994) and corrected duplicate classifications, improper classifications, etc. It is difficult to separate disease names from pattern names in oriental medicine. We therefore added both to the U codes and made a single classification system. Considering the special conditions in Korea, 169 of the 306 U codes of KCD-OM3 (83 disease name codes, 86 pattern name codes) were linked to the pre-existing classification, and 137 codes were newly added in the third revision. The U codes cover 3 domains: disease names (U20-U33, 97 codes), disease pattern names (U50-U79, 191 codes), and constitution pattern names of each disease (U95-U98, 18 codes). Conclusion : The introduction of KCD-OM3 aligns the diagnostic system used by oriental medical doctors with the basic structure of the WHO reference classification, promotes clinical study and academic activity in Korean oriental medicine, and makes possible the production of all kinds of national statistical indices. It also promotes the diagnostic system used by doctors of Oriental medicine through its association with KCD-5. It will improve the smoothness and efficiency of payments for oriental medical treatment under health insurance, automobile insurance, industrial accident compensation insurance, etc. In addition, internationally, work on the eleventh revision of the ICD has been initiated. Incorporating some of each country's traditional medicine into the International Classification of Diseases needs to be considered.