• Title/Summary/Keyword: Combining Ability


Simultaneous Optimization of a KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.139-157
    • /
    • 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of KNN base classifiers and the feature subsets selected for base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms, of which 900 filed for bankruptcy and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy against the latter portion was used to determine the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated. 
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
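
The random subspace KNN ensemble and the Q-statistic described in this abstract can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the majority-vote aggregation, and the random choice of `k` and feature subsets are assumptions. In the paper a genetic algorithm searches over these `(k, feature subset)` pairs; that search is not reproduced here.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Majority vote among the k nearest training rows, using squared
    Euclidean distance restricted to the given feature indices."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def build_ensemble(n_features, n_members, subset_size, k_choices, seed=0):
    """Draw one (k, feature subset) pair per ensemble member at random.
    Assumption: a GA would optimize these pairs instead of sampling them."""
    rng = random.Random(seed)
    return [
        (rng.choice(k_choices), sorted(rng.sample(range(n_features), subset_size)))
        for _ in range(n_members)
    ]


def ensemble_predict(members, train_X, train_y, x):
    """Aggregate the members' predictions by simple majority vote."""
    votes = Counter(
        knn_predict(train_X, train_y, x, k, feats) for k, feats in members
    )
    return votes.most_common(1)[0][0]


def q_statistic(correct_a, correct_b):
    """Yule's Q for two classifiers, from per-sample correctness flags.
    Q near 1 means the two classifiers err on the same samples;
    lower values indicate more diverse base classifiers."""
    pairs = [(bool(a), bool(b)) for a, b in zip(correct_a, correct_b)]
    n11 = sum(a and b for a, b in pairs)
    n00 = sum((not a) and (not b) for a, b in pairs)
    n10 = sum(a and (not b) for a, b in pairs)
    n01 = sum((not a) and b for a, b in pairs)
    denom = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denom if denom else 0.0
```

Each ensemble member is fully described by its `k` and its feature indices, which is why the two can be optimized together as one candidate solution.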

VKOSPI Forecasting and Option Trading Application Using SVM (SVM을 이용한 VKOSPI 일 중 변화 예측과 실제 옵션 매매에의 적용)

  • Ra, Yun Seon;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.177-192
    • /
    • 2016
  • Machine learning is a field of artificial intelligence. It refers to an area of computer science related to providing machines the ability to perform their own data analysis, decision making and forecasting. For example, one of the representative machine learning models is the artificial neural network, a statistical learning algorithm inspired by the structure of biological neural networks. In addition, there are other machine learning models such as the decision tree model, the naive Bayes model and the SVM (support vector machine) model. Among the machine learning models, we use the SVM model in this study because it is mainly used for classification and regression analysis, which fits our study well. The core principle of SVM is to find a reasonable hyperplane that distinguishes different groups in the data space. Given information about the data in any two groups, the SVM model judges to which group new data belongs based on the hyperplane obtained from the given data set. Thus, the greater the amount of meaningful data, the better the machine learning ability. In recent years, many financial experts have focused on machine learning, seeing the possibility of combining machine learning with the financial field, where vast amounts of financial data exist. Machine learning techniques have proved powerful in describing non-stationary and chaotic stock price dynamics. Much research has been successfully conducted on forecasting stock prices using machine learning algorithms. Recently, financial companies have begun to provide Robo-Advisor services (a compound of Robot and Advisor), which can perform various financial tasks through advanced algorithms using huge amounts of rapidly changing data. A Robo-Advisor's main task is to advise investors based on their personal investment propensity and to manage their portfolios automatically. 
In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model, which is one of the machine learning methods, and applying it to real option trading to increase trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices. VKOSPI is similar to the VIX index, which is based on S&P 500 option prices in the United States. The Korea Exchange (KRX) calculates and announces the real-time VKOSPI index. VKOSPI behaves like ordinary volatility and affects option prices. The direction of VKOSPI and option prices show a positive relation regardless of the option type (call and put options with various strike prices). If volatility increases, both call and put option premiums increase because the probability of exercise increases. The investor can know, in real time, how much the option price rises for a given rise in volatility through vega, the Black-Scholes measure of an option's sensitivity to changes in volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading. In this study, we verified through real option data that accurate forecasts of VKOSPI can produce large profits in real option trading. To the best of our knowledge, there have been no studies on predicting the direction of VKOSPI based on machine learning and applying the prediction to actual option trading. In this study, we predicted daily VKOSPI changes through the SVM model and then took an intraday option strangle position, which profits as option prices fall, only when VKOSPI was expected to decline during the daytime. We analyzed the results and tested whether the approach is applicable to real option trading based on SVM's prediction. 
The results showed that the prediction accuracy of VKOSPI was 57.83% on average, and the number of position entries was 43.2, which is less than half of the benchmark (100). A small number of trades is an indicator of trading efficiency. In addition, the experiment showed that the trading performance was significantly higher than the benchmark.
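
To make the role of vega concrete, the following sketch computes Black-Scholes prices and vega for a European option without dividends. Under that model, vega is the same for a call and a put, so a short strangle (one out-of-the-money call and one out-of-the-money put, both sold) has negative vega and gains when volatility falls. The function names and all numeric inputs are hypothetical, and the sketch is not the pricing or trading procedure used in the paper.

```python
import math


def _norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1(spot, strike, rate, sigma, t_years):
    return (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * t_years) / (
        sigma * math.sqrt(t_years)
    )


def bs_price(spot, strike, rate, sigma, t_years, kind="call"):
    """Black-Scholes price of a European call or put (no dividends)."""
    d1 = _d1(spot, strike, rate, sigma, t_years)
    d2 = d1 - sigma * math.sqrt(t_years)
    discounted_strike = strike * math.exp(-rate * t_years)
    if kind == "call":
        return spot * _norm_cdf(d1) - discounted_strike * _norm_cdf(d2)
    return discounted_strike * _norm_cdf(-d2) - spot * _norm_cdf(-d1)


def bs_vega(spot, strike, rate, sigma, t_years):
    """Sensitivity of the option price to volatility; identical for
    calls and puts with the same strike and expiry."""
    d1 = _d1(spot, strike, rate, sigma, t_years)
    pdf = math.exp(-0.5 * d1 ** 2) / math.sqrt(2.0 * math.pi)
    return spot * pdf * math.sqrt(t_years)


def short_strangle_vega(spot, put_strike, call_strike, rate, sigma, t_years):
    """Vega of a position that is short one put and short one call."""
    return -(
        bs_vega(spot, put_strike, rate, sigma, t_years)
        + bs_vega(spot, call_strike, rate, sigma, t_years)
    )


# Hypothetical inputs: index at 100, strikes 95/105, 3 months to expiry.
position_vega = short_strangle_vega(100.0, 95.0, 105.0, 0.02, 0.20, 0.25)
```

Because `position_vega` is negative, entering the strangle only when volatility is forecast to decline matches the trading rule described in the abstract.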

Studies on the Inheritance of Agronomic Characteristics in Upland Cotton Varieties (Gossypium hirsutum L.) in Korea (육지면품종의 유용형질의 유전에 관한 연구)

  • Bang-Myung Kae
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.21 no.2
    • /
    • pp.281-313
    • /
    • 1976
  • To obtain fundamental information on cotton breeding efficiency for Korea, individual genetic relationships and interrelationships between the agronomic characteristics of Upland cotton were investigated. These experiments were conducted at the Mokpo Branch Station ($34^{\circ}48'N$, $126^{\circ}23'E$, altitude of 10 m above sea level) from 1969 through 1972. Heterosis, combining ability, dominant and recessive gene action, genetic variance, and phenotypic and genotypic correlation were investigated using $F_1$'s from an 11-parent partial diallel cross and the segregating $F_2$ and $F_3$ populations of the cross Paymaster × Heujueusseo Trice. The following points resulted from this study. 1. Heteroses for number of bolls per plant and lint yield were significant at 27.84% and 37.26%, respectively. No other character had significant heteroses. 2. The GCA estimates for all studied characteristics were higher than the SCA estimates. Varieties with high GCA effects were Suwon 1 for earliness, Paymaster and Arijona for high lint percent, and Arijona for long fiber, etc. 3. SCA estimates for lint yield varied widely in crosses with Mokpo 4, Mokpo 6 and Heujueusseo Trice. The crosses with the highest SCA effects were combinations of parents differing widely in their characteristics. Examples of these crosses are Mokpo 4 × Acala 1517W, Mokpo 4 × D. P. L., and Heujueusseo Trice × Paymaster. 4. Early-maturing varieties were completely dominant to late-maturing varieties in some combinations while other crosses gave intermediate phenotypes. These results suggest additive genetic action by multiple genes. Heujueusseo Trice, Mokpo 6, and Suwon 1 showed the highest degree of dominance for earliness. 5. There were no significant trends for inheritance of boll weight and 100-seed weight. 6. Long staple was partially to completely dominant to short staple. 
Although single-gene ratios were observed, the degree of dominance decreased in the $F_2$ and $F_3$ populations in the cross between the long staple variety Paymaster and the short staple variety Heujueusseo Trice. Diallel cross $F_1$ hybrids showed complicated allelic gene action for staple length. Various degrees of dominance were shown by the varieties. 7. Number of bolls per plant indicated strong over-dominance and small non-allelic additive gene action. 8. Lint yield was characterized by over-dominance and by multiple non-allelic gene action. High-yielding varieties were dominant to low-yielding ones. However, the low-yielding variety Heujueusseo Trice showed over-dominance, indicating different reactions according to the varieties and combinations. 9. Broad sense heritability for days to flowering was 34-39% while narrow sense heritability was 11%. Large variation among individual plants caused by Korean climatic conditions accounts for this. Heritability estimates for boll weight were 30% for broad sense and 22% for narrow sense. 10. Heritability estimates for staple length and lint percent were very high, suggesting strong selection effects. 11. Narrow sense heritability estimates for number of bolls per plant were 30% in the diallel cross $F_1$ hybrids and 36% in the $F_2$ population of the special cross. Broad sense heritability was estimated at 67% suggesting that. 12. Heritability estimates for lint yield were low due to high over-dominance in the diallel cross $F_1$ hybrids. Heritability estimates for yield were low in the $F_1$ hybrids but high in the $F_2$ and $F_3$ populations. 13. Phenotypic and genotypic correlations between lint percent and days to flowering and between staple length and days to flowering were high in the $F_1$, $F_2$ and $F_3$ populations. Late-maturing varieties and individuals had long staple and high lint percent in general. 
As the correlation between days to flowering and lint yield was extremely low, the two traits were considered independent of each other. Days to flowering and number of bolls per plant were negatively correlated in the $F_3$ population, indicating that early-maturing individual plants with many bolls may be readily selected. 14. Phenotypic and genotypic correlations between lint percent and staple length were high in the $F_1$, $F_2$ and $F_3$ populations. Accordingly, long staple varieties were high in lint percent. It was recognized that lint yield and lint percent were positively correlated in the diallel cross $F_1$ hybrids, and lint percent and staple length were positively correlated in the $F_2$ population, indicating that lint percent and staple length affect lint yield. 15. Lint yield was significantly and positively phenotypically correlated with number of bolls per plant in the $F_1$, $F_2$ and $F_3$ populations. A high genotypic correlation was also noted, indicating a close genetic relationship. The selection efficiency for a high-yielding variety can be increased when individual plants with many bolls are selected in later generations. The selection efficiency for good fiber quality can be enhanced when individuals with long staple and high lint percent are selected in early generations.
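
The broad and narrow sense heritabilities quoted in this abstract can be read against the standard variance-ratio definitions. The relations below are a sketch under a simplifying assumption that is not stated in the abstract: genetic variance splits into additive and dominance parts only, with no epistasis and no genotype-by-environment interaction. The abstract does not say which estimator the author used.

$$H^2 = \frac{\sigma_G^2}{\sigma_P^2}, \qquad h^2 = \frac{\sigma_A^2}{\sigma_P^2}, \qquad \sigma_P^2 = \sigma_G^2 + \sigma_E^2, \qquad \sigma_G^2 = \sigma_A^2 + \sigma_D^2$$

Under that assumption, $H^2 - h^2 = \sigma_D^2 / \sigma_P^2$. For days to flowering, a broad sense value of 34-39% against a narrow sense value of 11% would leave roughly 23-28% of the phenotypic variance as non-additive genetic variance.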


Studies on Combining Ability and Inheritance of Major Agronomic Characters in Naked Barley (과맥의 주요형질에 대한 조합능력 및 유전에 관한 연구)

  • Kyung-Soo Min
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.23 no.2
    • /
    • pp.1-24
    • /
    • 1978
  • To obtain basic information on the breeding of early maturing, short culm naked-barley varieties, the following 10 varieties, Ehime #1, Shikoku #42, Yamate hadaka, Eijo hadaka, Kagawa #1, Jangjubaeggwa, Baegdong, Cheongmaeg, Seto hadaka and Mokpo #42, were used in diallel crosses in 1974. Heading date, culm length and grain yield per plant for the parents, $F_1$'s and $F_2$'s of the 10 × 10 partial diallel crosses were measured in 1976 for analysis of their combining ability, heritability and inheritance. The results obtained are summarized below. 1. Heritabilities in the broad sense for heading date, culm length and grain yield per plant were 0.7831, 0.7599 and 0.6161, respectively. Narrow sense heritabilities for heading date were 0.3972 in $F_1$ and 0.7789 in $F_2$, and for culm length 0.6567 in $F_1$ and 0.6414 in $F_2$. These values suggest that earliness and culm length could be successfully selected for in the early generations. Narrow sense heritability for grain yield was 0.3775 in $F_1$ and 0.4170 in $F_2$. 2. GCA effects of the $F_1$ and $F_2$ generations for days to heading were high in the early direction for early-heading varieties, while for late-heading varieties the GCA effects were high in the late direction. Absolute values for GCA effects in $F_1$ were higher than in $F_2$. SCA effects of the $F_1$ and $F_2$ generations were high in the early-heading direction for Shikoku #42 x Mokpo #42, Ehime #1 x Yamate hadaka, Shikoku #42 x Yamate hadaka and Shikoku #42 x Eijo hadaka. 3. The GCA effects for culm length in the $F_1$ and $F_2$ generations for tall varieties were high in the tall direction while those for short varieties were high in the short direction. Absolute values for the GCA effects in $F_1$ were higher than in $F_2$. SCA effects were high in the short direction for the combinations of Mokpo #42 with Ehime #1, Yamate hadaka and Eijo hadaka. 4. 
The GCA effects for grain yield per plant in the $F_1$ and $F_2$ generations for varieties with high yields per plant were high in the high yielding direction, while those for varieties with low yields per plant were high in the low yielding direction. Absolute values of the $F_1$ GCA effects were higher than the $F_2$ effects. The combinations with high SCA effects were Mokpo #42 x Shikoku #42, Mokpo #42 x Seto hadaka and Mokpo #42 x Cheongmaeg. 5. Mean heading dates of the $F_1$ and $F_2$ generations were earlier than the mid-parent means. Mean heading date of the $F_1$ generation was earlier than that of the $F_2$ generation. Crosses involving early-heading varieties showed a greater difference between $F_1$ and mid-parent than crosses involving late-heading varieties. 6. Heading date was controlled by a partial dominance effect. Nine varieties, excluding Mokpo #42, showed allelic gene action. Ehime #1, Shikoku #42, Kagawa #1 and Mokpo #42 were recessive to the other tested varieties. 7. The $F_2$ segregations of the 45 crosses for days to heading showed that 33 crosses were of such complexity that they could not be explained by simple genetic inheritance. One cross showed a 3 : 1 ratio where earliness was dominant. Another cross showed a 3 : 1 ratio where lateness was dominant. Four other crosses showed a 9 : 7 ratio for earliness while six crosses showed a 9 : 7 ratio for lateness. 8. Many transgressive segregants for earliness were found in the following crosses: Eijo hadaka x Baegdong, Ehime #1 x Seto hadaka, Yamate hadaka x Kagawa #1, Kagawa #1 x Seto hadaka, Shikoku #42 x Kagawa #1, Ehime #1 x Kagawa #1, Ehime #1 x Shikoku #42, Ehime #1 x Eijo hadaka. 9. Mean culm lengths of the $F_1$ and $F_2$ generations were usually taller than the mid-parent where tall parents were used. These trends were strong in the short varieties but weak in the tall varieties. 10. Culm length was controlled by partial dominance, which was governed by allelic gene(s). 
Culm length showed a high degree of control by additive genes. Mokpo #42 was recessive while Baegdong was dominant. 11. The $F_2$ frequency for culm length was in large part normally distributed around the mid-parent value. However, some combinations showed transgressive segregation for either tall or short culm length. From combinations between the medium tall varieties Ehime #1, Shikoku #42, Eijo hadaka and Seto hadaka, many short segregants could be found. 12. Mean grain yields per plant of the $F_1$ and $F_2$ generations were 6% and 5% higher than those of the mid-parents, respectively. The varieties with high yields per plant showed a low rate of yield increase in their $F_1$'s and $F_2$'s while the varieties with low yields per plant showed a high rate of yield increase in their $F_1$'s and $F_2$'s. 13. Grain yield per plant showed over-dominance effects, governed by non-allelic genes. Mokpo #42 showed recessive genetic control of grain yield per plant. It remains difficult to clarify the inheritance of grain yield per plant from these data.
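
General and specific combining ability (GCA, SCA) effects like those reported above are commonly estimated from a table of cross means. The sketch below uses Griffing's Method 4 (one set of $F_1$ crosses, no parents or reciprocals) purely as an illustration; the abstract does not state which diallel method the author used, and the function name and the numbers in the example are hypothetical.

```python
def griffing_method4(cross_means, n):
    """Estimate GCA and SCA effects from the n(n-1)/2 cross means of a
    half diallel without parents (Griffing's Method 4).

    cross_means: dict mapping (i, j) with i < j to the mean of cross i x j.
    n: number of parents (must be at least 3).
    """
    totals = [0.0] * n          # X_i. : sum of all crosses involving parent i
    grand = 0.0                 # X..  : sum of all crosses, each counted once
    for (i, j), x in cross_means.items():
        totals[i] += x
        totals[j] += x
        grand += x

    # g_i = (n * X_i. - 2 * X..) / (n * (n - 2))
    gca = [(n * totals[i] - 2.0 * grand) / (n * (n - 2)) for i in range(n)]

    # s_ij = x_ij - (X_i. + X_j.) / (n - 2) + 2 * X.. / ((n - 1) * (n - 2))
    sca = {
        (i, j): x
        - (totals[i] + totals[j]) / (n - 2)
        + 2.0 * grand / ((n - 1) * (n - 2))
        for (i, j), x in cross_means.items()
    }
    return gca, sca


# Hypothetical example: 4 parents, cross means built as 10 + g_i + g_j.
true_g = [1.0, -1.0, 2.0, -2.0]
crosses = {
    (i, j): 10.0 + true_g[i] + true_g[j]
    for i in range(4)
    for j in range(i + 1, 4)
}
gca, sca = griffing_method4(crosses, 4)
```

In this purely additive example the estimated `gca` values reproduce `true_g` and every `sca` value is zero; non-zero SCA would flag specific pairs, such as the Mokpo #42 combinations named above, that deviate from what their parents' GCA predicts.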


A Study on improvement of curriculum in Nursing (간호학 교과과정 개선을 위한 조사 연구)

  • 김애실
    • Journal of Korean Academy of Nursing
    • /
    • v.4 no.2
    • /
    • pp.1-16
    • /
    • 1974
  • This study involved the development of a survey form and the collection of data in an effort to provide information which can be used in the improvement of nursing curricula. The data examined were the kinds of courses currently being taught in the curricula of nursing education institutions throughout Korea, the credits required for course completion, and the year in which courses are taken. For the purposes of this study, curricula were classified into college, nursing school and vocational school categories. Courses were divided into the 3 major categories of general education courses, supporting science courses and professional education courses, and further subdivided as follows: 1) General education (following the classification of Philip H. Phenix): a) Symbolics, b) Empirics, c) Aesthetics, d) Synnoetics, e) Ethics, f) Synoptics. 2) Supporting science: a) physical science, b) biological science, c) social science, d) behavioral science, e) health science, f) education. 3) Professional education: a) basic courses, b) courses in each of the respective fields of nursing. Ⅰ. General education, aimed at developing the individual as a person and as a member of society, is relatively strong in college curricula compared with the other two. a) Courses in the category of symbolics included Korean language, English, German, Chinese, Mathematics, Statistics, Economics and Computer. Most college curricula included 20 credits of courses in this sub-category, while nursing schools required 12 credits and vocational schools 10 units. English ordinarily receives particularly heavy emphasis. b) Research methodology, Domestic affair and women & courtney were included under the category of empirics in the college curricula; nursing and vocational schools do not offer these at all. c) Courses classified under aesthetics were physical education, drill, music, recreation and fine arts. 
Most college curricula had 4 credits in these areas, nursing schools provided for 2 credits, and most vocational schools offered 10 units. d) Synnoetics included leadership, interpersonal relationships, and communications. Most schools did not offer courses of this nature. e) The category of ethics included citizenship. 2 credits are provided in college curricula, while vocational schools require 4 units. Nursing schools do not offer these courses. f) Courses included under synoptics were Korean history, cultural history, philosophy, logic, and religion. Most college curricula provided 5 credits in these areas, nursing schools 4 credits, and vocational schools 2 units. g) Only physical education was given every year in college curricula, and only English was given in every year of the curriculum in nursing schools and vocational schools. Most of the other courses were given during the first year of the curriculum. Ⅱ. Supporting science courses are fundamental to the practice and application of nursing theory. a) Physical science courses include physics, chemistry and natural science. Most colleges and nursing schools provided for 2 credits of physical science courses in their curricula, while most vocational schools did not offer them. b) Courses included under biological science were anatomy, physiology, biology and biochemistry. Most college curricula provided for 15 credits of biological science, nursing schools for the most part provided for 11 credits, and most vocational schools provided for 8 units. c) Courses included under social science were sociology and anthropology. Most colleges provided for 1 credit in courses of this category, while most nursing schools provided for 2 credits. Most vocational schools did not provide courses of this type. d) Courses included under behavioral science were general and clinical psychology, developmental psychology, mental hygiene and guidance. Most schools did not provide for these courses. 
e) Courses included under health science were pharmacy and pharmacology, microbiology, pathology, nutrition and dietetics, parasitology, and Chinese medicine. Most college curricula provided for 11 credits, while most nursing schools provided for 12 credits, and vocational schools for the most part provided 20 units of medical courses. f) Courses included under education were educational psychology, principles of education, philosophy of education, history of education, social education, educational evaluation, educational curricula, class management, guidance techniques and school & community. Most colleges offer 3 credits in courses in this category, while nursing schools provide 8 credits and vocational schools provide for 6 units. 50% of the colleges prepare their students to qualify as regular teachers of the second level, while 91% of the nursing schools and 60% of the vocational schools prepare their students to qualify as school nurses. g) The majority of colleges start supporting science courses in the first year and complete them by the second year. Nursing schools and vocational schools usually complete them in the first year. Ⅲ. Professional education courses are designed to develop professional nursing knowledge, attitudes and skills in the students. a) Basic courses include social nursing, nursing ethics, history of nursing professional control, nursing administration, social medicine, social welfare, introductory nursing, advanced nursing, medical regulations, efficient nursing, nursing English and basic nursing. College curricula devoted 13 credits to these subjects, nursing schools 14 credits, and vocational schools 26 units, indicating a marked difference in the scope of education provided. b) There was a noticeable tendency for the colleges to take a unified approach to the branches of nursing. 
60% of the schools had courses in public health nursing, 80% in pediatric nursing, 60% in obstetric nursing, 90% in psychiatric nursing and 80% in medical-surgical nursing. The greatest number of schools provided 48 credits in all of these fields combined. In most of the nursing schools, 52 credits were provided for courses divided according to disease. In the vocational schools, unified courses are provided in public health nursing, child nursing, maternal nursing, psychiatric nursing and adult nursing. In addition, one unit is provided for one hour a week of practice. The total number of units provided in the greatest number of vocational schools is thus Ⅲ units, double the number provided in nursing schools and colleges. c) In the colleges, the second year is devoted mainly to basic nursing courses, while the third and fourth years are used for advanced nursing courses. In nursing schools and vocational schools, the first year deals primarily with basic nursing and the second and third years are used to cover advanced nursing courses. The study yielded the following conclusions. 1. Instructional goals should be established for each course in line with the idea of nursing, and curriculum improvements should be made accordingly. 2. Courses that fall under the synnoetics category should be strengthened, and ways should be sought to develop the ability to cooperate with those who work for human welfare and health. 3. The ability to solve problems on the basis of scientific principles, and knowledge and understanding of man and society, should be fostered through a strengthening of courses dealing with physical sciences, social sciences and behavioral sciences and a redistribution of courses emphasizing biological and health sciences. 4. 
There should be more balanced curricula with less emphasis on courses in the major. There is a need to establish courses necessary for the individual nurse by doing away with courses centered around specific diseases and combining them in unified courses. In addition, it is possible to develop skill in dealing with people by using the social setting in comprehensive training. The most efficient ratio of the study experience should be studied to provide more effective, interesting education. Elective courses should be initiated to ensure a more flexible, responsive educational program. 5. The curriculum stipulated in the education law should be examined.


Bankruptcy Forecasting Model using AdaBoost: A Focus on Construction Companies (적응형 부스팅을 이용한 파산 예측 모형: 건설업을 중심으로)

  • Heo, Junyoung;Yang, Jin Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.35-48
    • /
    • 2014
  • According to the 2013 construction market outlook report, the liquidation of construction companies is expected to continue due to the ongoing residential construction recession. Bankruptcies of construction companies have a greater social impact compared to other industries. However, due to the different nature of the capital structure and debt-to-equity ratio, it is more difficult to forecast construction companies' bankruptcies than those of companies in other industries. The construction industry operates on greater leverage, with high debt-to-equity ratios, and project cash flow focused on the second half. The economic cycle greatly influences construction companies. Therefore, downturns tend to rapidly increase the bankruptcy rates of construction companies. High leverage, coupled with increased bankruptcy rates, could lead to greater burdens on banks providing loans to construction companies. Nevertheless, bankruptcy prediction models have concentrated mainly on financial institutions, and construction-specific studies are rare. The bankruptcy prediction model based on corporate finance data has been studied for some time in various ways. However, the model is intended for all companies in general, and it may not be appropriate for forecasting bankruptcies of construction companies, which typically have high liquidity risks. The construction industry is capital-intensive, operates on long timelines with large-scale investment projects, and has comparatively longer payback periods than other industries. With its unique capital structure, it can be difficult to apply a model used to judge the financial risk of companies in general to those in the construction industry. Diverse studies of bankruptcy forecasting models based on a company's financial statements have been conducted for many years. 
The subjects of the model, however, were general firms, and the models may not be proper for accurately forecasting companies with disproportionately large liquidity risks, such as construction companies. The construction industry is capital-intensive, requiring significant investments in long-term projects and therefore long periods to realize returns on the investment. The unique capital structure means that the same criteria used for other industries cannot be applied to effectively evaluate financial risk for construction firms. The Altman Z-score was first published in 1968 and is commonly used as a bankruptcy forecasting model. It forecasts the likelihood of a company going bankrupt by using a simple formula, classifying the results into three categories, and evaluating the corporate status as dangerous, moderate, or safe. When a company falls into the "dangerous" category, it has a high likelihood of bankruptcy within two years, while those in the "safe" category have a low likelihood of bankruptcy. For companies in the "moderate" category, it is difficult to forecast the risk. Many of the construction firm cases in this study fell in the "moderate" category, which made it difficult to forecast their risk. Along with the development of machine learning using computers, recent studies of corporate bankruptcy forecasting have used this technology. Pattern recognition, a representative application area in machine learning, is applied to forecasting corporate bankruptcy, with patterns analyzed based on a company's financial information, and then judged as to whether the pattern belongs to the bankruptcy risk group or the safe group. The representative machine learning models previously used in bankruptcy forecasting are Artificial Neural Networks, Adaptive Boosting (AdaBoost), and the Support Vector Machine (SVM). There are also many hybrid studies combining these models. 
Existing studies using the traditional Z-Score technique or bankruptcy prediction using machine learning focus on companies in non-specific industries. Therefore, the industry-specific characteristics of companies are not considered. In this paper, we confirm that adaptive boosting (AdaBoost) is the most appropriate forecasting model for construction companies, based on company size. We classified construction companies into three groups - large, medium, and small - based on the company's capital. We analyzed the predictive ability of AdaBoost for each group of companies. The experimental results showed that AdaBoost has more predictive ability than the other models, especially for the group of large companies with capital of more than 50 billion won.
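
The three-way Z-score classification described above can be sketched as follows. The abstract gives neither the coefficients nor the cutoffs, so the values here are the commonly quoted ones for the 1968 model for publicly listed manufacturing firms and should be read as an assumption, as should the function and parameter names. The zone labels follow the abstract's wording.

```python
def altman_z(working_capital_ta, retained_earnings_ta, ebit_ta,
             market_equity_tl, sales_ta):
    """Altman Z-score from five ratios expressed as decimals:
    working capital / total assets, retained earnings / total assets,
    EBIT / total assets, market value of equity / total liabilities,
    sales / total assets. Coefficients are the commonly quoted 1968 values."""
    return (
        1.2 * working_capital_ta
        + 1.4 * retained_earnings_ta
        + 3.3 * ebit_ta
        + 0.6 * market_equity_tl
        + 1.0 * sales_ta
    )


def z_zone(z, lower=1.81, upper=2.99):
    """Map a Z-score to the three categories named in the abstract.
    The 1.81 and 2.99 cutoffs are the commonly quoted ones (assumption)."""
    if z < lower:
        return "dangerous"
    if z > upper:
        return "safe"
    return "moderate"


# Hypothetical firm with high leverage and thin margins.
example_z = altman_z(0.05, 0.10, 0.04, 0.30, 0.90)
example_zone = z_zone(example_z)
```

A fixed linear rule like this has no way to resolve the "moderate" band, which is the gap that the abstract's machine learning models are meant to address.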

Diallel Analysis of Anatomical Components of the Fruit in Red Pepper (이면교잡(二面交雜)에 의(依)한 고추과중(果重)의 구성요소(構成要素)에 대(對)한 유전분석(遺傳分析))

  • Kim, Yang Choon
    • Current Research on Agriculture and Life Sciences
    • /
    • v.1
    • /
    • pp.11-18
    • /
    • 1983
  • This study was performed to obtain basic information for breeding red dry pepper fruit with greater pericarp weight (or percentage), using a complete diallel cross (excluding reciprocals) of eight cultivars. Heterosis, combining ability, and inheritance of the dry red fruit weight and its components (stem, placenta, seed, and pericarp) were evaluated. The results are summarized as follows: Dry weight/fruit and its four anatomical components were heavier in the earlier-harvested fruit than in the later fruit. Differences among parents and $F_1$s were significant at the 1% level, and the $F_1$ values were significantly heavier than those of the parents. All characters in the earlier fruit of the parents, however, were higher than in the later fruit of the $F_1$. The dry weight percentage of pericarp relative to dry weight/fruit was the highest, followed by seed. The percentage of pericarp increased in the later fruit while that of seed decreased, and the percentages of stem and placenta did not differ between the earlier and later fruit. $F_1$ hybrids exceeding the higher parent were observed in all characters. Mean heterosis (%) was positive in all characters, while mean heterobeltiosis (%) was negative except for seed and dry weight/fruit. GCA and SCA variances were highly significant, and GCA variances were greater than SCA variances in all characters. The directions of dominance were positive. Partial dominance was shown in stem, complete dominance in placenta, pericarp, and dry weight/fruit, and overdominance in seed. The number of effective genes was estimated as one for stem and placenta, and two for seed, pericarp, and dry weight/fruit. Heritabilities in both the narrow and broad sense were high.
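
To make the heterosis terms above concrete, the following minimal sketch uses the standard textbook definitions: heterosis relative to the mid-parent mean and heterobeltiosis relative to the better parent. The abstract does not spell out its formulas, so these definitions, the function names, and the sample weights are assumptions for illustration only.

```python
from itertools import combinations


def half_diallel_crosses(parents):
    """All F1 combinations of a diallel without reciprocals or selfs."""
    return list(combinations(parents, 2))


def mid_parent_heterosis(f1, p1, p2):
    """Percent deviation of the F1 from the mean of its two parents."""
    mid_parent = (p1 + p2) / 2
    return (f1 - mid_parent) / mid_parent * 100


def heterobeltiosis(f1, p1, p2):
    """Percent deviation of the F1 from the higher (better) parent."""
    better_parent = max(p1, p2)
    return (f1 - better_parent) / better_parent * 100


# Eight cultivars, as in the abstract; the labels are placeholders.
crosses = half_diallel_crosses([f"P{i}" for i in range(1, 9)])
print(len(crosses))

# Made-up dry weights (g/fruit) for one cross, purely illustrative.
print(mid_parent_heterosis(f1=1.4, p1=1.0, p2=1.5))
print(heterobeltiosis(f1=1.4, p1=1.0, p2=1.5))
```

With these made-up weights the F1 exceeds the mid-parent value but not the better parent, which is the pattern behind a positive mean heterosis alongside a negative mean heterobeltiosis.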

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have addressed bankruptcy prediction over the past three decades. Current research attempts to use ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers in order to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the different bootstrap samples. Instance selection selects critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) for improving the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select the optimal instance subset, which is used as the input data of the bagging model. The chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. We set the crossover rate and mutation rate to 0.7 and 0.1, respectively. We used the prediction accuracy of the model as the fitness function of GA. The SVM model is trained on the training data set using the selected instance subset, and its prediction accuracy on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, we used the optimal instance subset selected in the first phase as the input data of the bagging model. We used the SVM model as the base classifier for the bagging ensemble. The majority voting scheme was used as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through literature review and basic statistical methods, and we selected 8 financial ratios as the final input variables. We separated the whole data into three subsets: training, test, and validation data sets. We compared the proposed model with several comparative models, including the simple individual SVM model, the simple bagging model, and the instance selection based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
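
The two phases described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a nearest-centroid classifier stands in for SVM so the example needs only the standard library, and the tournament selection, one-point crossover, per-gene mutation, and single-elite survival are assumptions, since the abstract gives only the rates and population settings. All function names are hypothetical.

```python
import random


def centroid_fit(X, y):
    """Stand-in base learner: one mean vector per class (the paper uses SVM)."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def centroid_predict(model, x):
    # Predict the class whose centroid is closest in squared Euclidean distance.
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def subset(X, y, mask):
    """Keep the training instances whose chromosome bit is 1."""
    keep = [i for i, bit in enumerate(mask) if bit]
    return [X[i] for i in keep], [y[i] for i in keep]


def fitness(mask, Xtr, ytr, Xte, yte):
    """Train on the selected instances, score on the separate test set."""
    Xs, ys = subset(Xtr, ytr, mask)
    if len(set(ys)) < 2:  # a usable subset must contain both classes
        return 0.0
    model = centroid_fit(Xs, ys)
    return accuracy(lambda x: centroid_predict(model, x), Xte, yte)


def ga_select(Xtr, ytr, Xte, yte, rng, pop_size=100, generations=150,
              crossover_rate=0.7, mutation_rate=0.1):
    """Phase 1: evolve a binary mask over the training instances."""
    n = len(Xtr)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def rank(population):
        scored = [(fitness(m, Xtr, ytr, Xte, yte), m) for m in population]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)

    for _ in range(generations):
        scored = rank(pop)
        nxt = [scored[0][1]]  # keep the best mask unchanged (assumed elitism)
        while len(nxt) < pop_size:
            # Tournament selection of two parents (assumed operator).
            a = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            b = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            child = a[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n)  # one-point crossover
                child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate (assumed per-gene).
            nxt.append([1 - g if rng.random() < mutation_rate else g for g in child])
        pop = nxt
    return rank(pop)[0][1]


def bagging_fit(X, y, rng, n_models=11):
    """Phase 2: train each base model on a bootstrap sample of the subset."""
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(centroid_fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the base predictions by majority vote."""
    votes = [centroid_predict(m, x) for m in models]
    return max(set(votes), key=votes.count)
```

Scoring the chromosomes on the test split and reporting final accuracy on a third validation split mirrors the three-way split in the abstract: the GA never sees the data used for the final comparison.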

Establishing a Nomogram for Stage IA-IIB Cervical Cancer Patients after Complete Resection

  • Zhou, Hang;Li, Xiong;Zhang, Yuan;Jia, Yao;Hu, Ting;Yang, Ru;Huang, Ke-Cheng;Chen, Zhi-Lan;Wang, Shao-Shuai;Tang, Fang-Xu;Zhou, Jin;Chen, Yi-Le;Wu, Li;Han, Xiao-Bing;Lin, Zhong-Qiu;Lu, Xiao-Mei;Xing, Hui;Qu, Peng-Peng;Cai, Hong-Bing;Song, Xiao-Jie;Tian, Xiao-Yu;Zhang, Qing-Hua;Shen, Jian;Liu, Dan;Wang, Ze-Hua;Xu, Hong-Bing;Wang, Chang-Yu;Xi, Ling;Deng, Dong-Rui;Wang, Hui;Lv, Wei-Guo;Shen, Keng;Wang, Shi-Xuan;Xie, Xing;Cheng, Xiao-Dong;Ma, Ding;Li, Shuang
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.16 no.9
    • /
    • pp.3773-3777
    • /
    • 2015
  • Background: This study aimed to establish a nomogram that combines clinicopathologic factors to predict overall survival of stage IA-IIB cervical cancer patients after complete resection with pelvic lymphadenectomy. Materials and Methods: This nomogram was based on a retrospective study of 1,563 stage IA-IIB cervical cancer patients who underwent complete resection and lymphadenectomy from 2002 to 2008. The nomogram was constructed based on multivariate analysis using Cox proportional hazard regression. The accuracy and discriminative ability of the nomogram were measured by concordance index (C-index) and calibration curve. Results: Multivariate analysis identified lymph node metastasis (LNM), lymph-vascular space invasion (LVSI), stromal invasion, parametrial invasion, tumor diameter, and histology as independent prognostic factors associated with cervical cancer survival. These factors were selected for construction of the nomogram. The C-index of the nomogram was 0.71 (95% CI, 0.65 to 0.77), and calibration of the nomogram showed good agreement between the 5-year predicted survival and the actual observation. Conclusions: We developed a nomogram predicting 5-year overall survival of surgically treated stage IA-IIB cervical cancer patients. The more comprehensive information provided by this nomogram could offer further insight into personalized therapy selection.
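
For readers unfamiliar with the C-index reported above, here is a minimal sketch of one common estimator, Harrell's concordance index for right-censored data. The abstract does not state which estimator was used, so this choice is an assumption, and the function name and toy data are illustrative only. Pairs with tied survival times are simply skipped in this simplified version.

```python
def c_index(times, events, risk):
    """Harrell's C: share of comparable pairs ordered correctly by the risk score.

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risk   : model score, higher meaning worse predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event
            # strictly before patient j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored at time 6.
print(c_index(times=[2, 4, 6, 8], events=[1, 1, 0, 1], risk=[0.9, 0.4, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.71 indicates moderate discrimination.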

Utilization of a Ubiquitous Environmental Sculptures Analysis (유비쿼터스 환경 조형물의 이용의식 실태 분석)

  • Kim, Dong-Chan;Cho, Hwee-In
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.38 no.3
    • /
    • pp.15-22
    • /
    • 2010
  • Today's rapid shift toward a new paradigm is combining city spaces with reality and technology, in what is known as a ubiquitous environment. A ubiquitous environment means being connected 'whenever' and 'wherever', and it is highly likely to change our future lifestyle. Korea has a major advantage in implementing this new environment, notably its excellent network infrastructure. Using these attributes of a ubiquitous environment, changes are being made toward ubiquitous cities in the developing fields of construction, landscaping, streets, art, and the environment. Against this background, this research examines the media poles installed in public city space, addressing the reality of digital technology, convergence, and the awareness of 'ubitizens', with the Kang-Nam U-street, where ubiquitous technology has been applied, as the subject. As a reflection of an environment that can be utilized in a modern digital society, a media pole equipped with ubiquitous technology can serve as a space for the two-way communication of the current paradigm. It would also be meaningful to create a new cultural space through the media pole. By evaluating how citizens of the ubiquitous age interact with the media pole and how satisfied they are, the study aims to give direction to its development in city space and to prevent trial and error regarding serviceability, identity, and publicness. Finally, the media pole can be used as a fundamental element for suggesting directions of change in future development.