• Title/Summary/Keyword: information application

Search Result 20,463, Processing Time 0.05 seconds

An Intelligence Support System Research on KTX Rolling Stock Failure Using Case-based Reasoning and Text Mining (사례기반추론과 텍스트마이닝 기법을 활용한 KTX 차량고장 지능형 조치지원시스템 연구)

  • Lee, Hyung Il;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.47-73
    • /
    • 2020
  • KTX rolling stocks are a system consisting of several machines, electrical devices, and components. The maintenance of the rolling stocks requires considerable expertise and experience of maintenance workers. In the event of a rolling stock failure, the knowledge and experience of the maintainer will result in a difference in the quality of the time and work to solve the problem. So, the resulting availability of the vehicle will vary. Although problem solving is generally based on fault manuals, experienced and skilled professionals can quickly diagnose and take actions by applying personal know-how. Since this knowledge exists in a tacit form, it is difficult to pass it on completely to a successor, and there have been studies that have developed a case-based rolling stock expert system to turn it into a data-driven one. Nonetheless, research on the most commonly used KTX rolling stock on the main-line or the development of a system that extracts text meanings and searches for similar cases is still lacking. Therefore, this study proposes an intelligence supporting system that provides an action guide for emerging failures by using the know-how of these rolling stocks maintenance experts as an example of problem solving. For this purpose, the case base was constructed by collecting the rolling stocks failure data generated from 2015 to 2017, and the integrated dictionary was constructed separately through the case base to include the essential terminology and failure codes in consideration of the specialty of the railway rolling stock sector. Based on a deployed case base, a new failure was retrieved from past cases and the top three most similar failure cases were extracted to propose the actual actions of these cases as a diagnostic guide. In this study, various dimensionality reduction measures were applied to calculate similarity by taking into account the meaningful relationship of failure details in order to compensate for the limitations of the method of searching cases by keyword matching in rolling stock failure expert system studies using case-based reasoning in the precedent case-based expert system studies, and their usefulness was verified through experiments. Among the various dimensionality reduction techniques, similar cases were retrieved by applying three algorithms: Non-negative Matrix Factorization(NMF), Latent Semantic Analysis(LSA), and Doc2Vec to extract the characteristics of the failure and measure the cosine distance between the vectors. The precision, recall, and F-measure methods were used to assess the performance of the proposed actions. To compare the performance of dimensionality reduction techniques, the analysis of variance confirmed that the performance differences of the five algorithms were statistically significant, with a comparison between the algorithm that randomly extracts failure cases with identical failure codes and the algorithm that applies cosine similarity directly based on words. In addition, optimal techniques were derived for practical application by verifying differences in performance depending on the number of dimensions for dimensionality reduction. The analysis showed that the performance of the cosine similarity was higher than that of the dimension using Non-negative Matrix Factorization(NMF) and Latent Semantic Analysis(LSA) and the performance of algorithm using Doc2Vec was the highest. Furthermore, in terms of dimensionality reduction techniques, the larger the number of dimensions at the appropriate level, the better the performance was found. Through this study, we confirmed the usefulness of effective methods of extracting characteristics of data and converting unstructured data when applying case-based reasoning based on which most of the attributes are texted in the special field of KTX rolling stock. Text mining is a trend where studies are being conducted for use in many areas, but studies using such text data are still lacking in an environment where there are a number of specialized terms and limited access to data, such as the one we want to use in this study. In this regard, it is significant that the study first presented an intelligent diagnostic system that suggested action by searching for a case by applying text mining techniques to extract the characteristics of the failure to complement keyword-based case searches. It is expected that this will provide implications as basic study for developing diagnostic systems that can be used immediately on the site.

A Study on Chinese Traditional Auspicious Fish Pattern Application in Corperate Identity Design (중국 전통 길상 어(魚)문양을 응용한 중국 기업의 아이덴티티 디자인 동향)

  • ZHANG, JINGQIU
    • Cartoon and Animation Studies
    • /
    • s.50
    • /
    • pp.349-382
    • /
    • 2018
  • China is a great civilization which is a combination of various ethnic groups with long history change. As one of these important components of traditional culture, the lucky shape has been going through the ideological upheaval of the history change of China. Up to now, it has become the important parts which can stimulate the emotion of Chinese nation. The lucky shape becomes the basis of the rich traditional culture by long history of the Chinese nation. Even say it is the centre of this traditional culture resource. The lucky shape is a way of expressing the Chinese history and national emotions. It is the important part of people's living habits, emotion, as well as the cultural background. What's more, it has the value of beliefs of Surname totem. Meanwhile, it also has the function of passing on information. The symbol of information finally was created by the being of lucky shape to indicate its conceptual content. There are various kinds of lucky shapes. It will have its limitations when researching all kinds of them professionally. So, here the lucky shape of FISH will be researched. The shape of fish is the first good shape created by the Chinese nation. It is about 6000 years. Its special shape and lucky meaning embody the peculiar inherent culture and intension of the Chinese nation. It's the important component of the Chinese traditional culture. The traditional shape of fish was focused on the continuation of history and the patterns recognition, etc. It seldom indicated the meaning of the shape into the using of the modern design. So by searching the lucky meaning & the way of fish shape, the purpose of the search is to explore the real analysis of value of the fish shape in the modern enterprise identity design. The way of search is through the development of the history, the evolvement and the meaning of lucky of the traditional fish shape to analyse the symbolic meaning and the cultural meaning from all levels in nation, culture, art and life, etc. And by using the huge living example of the enterprise identity design of the traditional shape of the fish to analyse that how it works in positive way by those enterprise which is based on the trust with good image. In the modern Chinese enterprise identity design, the lucky image will be reinterpreted in the modern way. It will be proofed by the national perceptual knowledge of the consumer and the way of enlarge the goodwill of corporate image. It will be the conclusion. The traditional fish shape is the important core of modern design.So this search is taken through the instance of the design of enterprise image of the traditional fish shape to analysis the idea of the majority Chinese people of the traditional luck and the influence of corporation which based on trust and credibility. In modern image design of Chinese corporation, the auspicious sign reappear. The question survey is taken by people through the perceptual knowledge of the consumer and the cognition the enterprise image. According the result, people can speculate the improvement of consumer's recognition and the possibility of development of traditional concept.

Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm (SVM과 meta-learning algorithm을 이용한 고지혈증 유병 예측모형 개발과 활용)

  • Lee, Seulki;Shin, Taeksoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.111-124
    • /
    • 2018
  • This study aims to develop a classification model for predicting the occurrence of hyperlipidemia, one of the chronic diseases. Prior studies applying data mining techniques for predicting disease can be classified into a model design study for predicting cardiovascular disease and a study comparing disease prediction research results. In the case of foreign literatures, studies predicting cardiovascular disease were predominant in predicting disease using data mining techniques. Although domestic studies were not much different from those of foreign countries, studies focusing on hypertension and diabetes were mainly conducted. Since hypertension and diabetes as well as chronic diseases, hyperlipidemia, are also of high importance, this study selected hyperlipidemia as the disease to be analyzed. We also developed a model for predicting hyperlipidemia using SVM and meta learning algorithms, which are already known to have excellent predictive power. In order to achieve the purpose of this study, we used data set from Korea Health Panel 2012. The Korean Health Panel produces basic data on the level of health expenditure, health level and health behavior, and has conducted an annual survey since 2008. In this study, 1,088 patients with hyperlipidemia were randomly selected from the hospitalized, outpatient, emergency, and chronic disease data of the Korean Health Panel in 2012, and 1,088 nonpatients were also randomly extracted. A total of 2,176 people were selected for the study. Three methods were used to select input variables for predicting hyperlipidemia. First, stepwise method was performed using logistic regression. Among the 17 variables, the categorical variables(except for length of smoking) are expressed as dummy variables, which are assumed to be separate variables on the basis of the reference group, and these variables were analyzed. Six variables (age, BMI, education level, marital status, smoking status, gender) excluding income level and smoking period were selected based on significance level 0.1. Second, C4.5 as a decision tree algorithm is used. The significant input variables were age, smoking status, and education level. Finally, C4.5 as a decision tree algorithm is used. In SVM, the input variables selected by genetic algorithms consisted of 6 variables such as age, marital status, education level, economic activity, smoking period, and physical activity status, and the input variables selected by genetic algorithms in artificial neural network consist of 3 variables such as age, marital status, and education level. Based on the selected parameters, we compared SVM, meta learning algorithm and other prediction models for hyperlipidemia patients, and compared the classification performances using TP rate and precision. The main results of the analysis are as follows. First, the accuracy of the SVM was 88.4% and the accuracy of the artificial neural network was 86.7%. Second, the accuracy of classification models using the selected input variables through stepwise method was slightly higher than that of classification models using the whole variables. Third, the precision of artificial neural network was higher than that of SVM when only three variables as input variables were selected by decision trees. As a result of classification models based on the input variables selected through the genetic algorithm, classification accuracy of SVM was 88.5% and that of artificial neural network was 87.9%. Finally, this study indicated that stacking as the meta learning algorithm proposed in this study, has the best performance when it uses the predicted outputs of SVM and MLP as input variables of SVM, which is a meta classifier. The purpose of this study was to predict hyperlipidemia, one of the representative chronic diseases. To do this, we used SVM and meta-learning algorithms, which is known to have high accuracy. As a result, the accuracy of classification of hyperlipidemia in the stacking as a meta learner was higher than other meta-learning algorithms. However, the predictive performance of the meta-learning algorithm proposed in this study is the same as that of SVM with the best performance (88.6%) among the single models. The limitations of this study are as follows. First, various variable selection methods were tried, but most variables used in the study were categorical dummy variables. In the case with a large number of categorical variables, the results may be different if continuous variables are used because the model can be better suited to categorical variables such as decision trees than general models such as neural networks. Despite these limitations, this study has significance in predicting hyperlipidemia with hybrid models such as met learning algorithms which have not been studied previously. It can be said that the result of improving the model accuracy by applying various variable selection techniques is meaningful. In addition, it is expected that our proposed model will be effective for the prevention and management of hyperlipidemia.

Actual Conditions and Perception of Safety Accidents by School Foodservice Employees in Chungbuk (충북지역 학교급식 조리종사원의 안전사고 실태 및 인식)

  • Cho, Hyun A;Lee, Young Eun;Park, Eun Hye
    • Journal of the Korean Society of Food Science and Nutrition
    • /
    • v.43 no.10
    • /
    • pp.1594-1606
    • /
    • 2014
  • The purpose of this study was to examine safety accidents related to school foodservice, working and operating environments of school foodservice, status and awareness of safety education, educational needs, and information on qualitative improvement of school foodservice. The subjects in this study were 234 cooks in charge of cooking at elementary and secondary schools in Chungbuk. A survey was conducted from July 30 to August 8, 2012, and among 202 questionnaires gathered, 194 completed questionnaires were analyzed. Statistical analyses were performed on data utilizing the SPSS version 19.0. The main results of this study were as follows: 44.3% of workers experienced safety accidents. The most frequent safety accident was 'once' (60.5%), and most safety accidents took place between June and August (31.4%). The time at which most safety accidents happened was between 8 and 11 am. Most safety accidents happened during cooking (52.3%) and while using a soup pot or frying pot (52.4%). The most common accidents were 'burns', 'wrist and arm pain', and 'slips and falls'. Respondents who experienced safety accidents replied that 57.6% of employees dealt with injuries at their own expense, and only 35.3% utilized industrial accident insurance. In terms of the operating environment, the score for 'offering information and application' was highest (3.76 points), whereas that for 'security of budget' was lowest (1.77 points). As for accident education, employees received safety education approximately 3.45 times and 5.10 hours per year. Improving the working environment of school foodservice cooks requires administrative and financial support. Furthermore, educational materials and guidelines based on the working environment and safety accident status of school foodservice cooks are required in order to minimize potential risk factors and control safety accidents in school foodservice.

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and now there are also various forecasting models using various techniques. Recently, many attempts have been made to predict the stock index using various machine learning methods including Deep Learning. Although the fundamental analysis and the technical analysis method are used for the analysis of the traditional stock investment transaction, the technical analysis method is more useful for the application of the short-term transaction prediction or statistical and mathematical techniques. Most of the studies that have been conducted using these technical indicators have studied the model of predicting stock prices by binary classification - rising or falling - of stock market fluctuations in the future market (usually next trading day). However, it is also true that this binary classification has many unfavorable aspects in predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by expanding the stock index trend (upward trend, boxed, downward trend) to the multiple classification system in the existing binary index method. In order to solve this multi-classification problem, a technique such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA) or Artificial Neural Networks (ANN) we propose an optimization model using Genetic Algorithm as a wrapper for improving the performance of this model using Multi-classification Support Vector Machines (MSVM), which has proved to be superior in prediction performance. In particular, the proposed model named GA-MSVM is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM, but also the optimal selection of input variables (feature selection) as well as instance selection. In order to verify the performance of the proposed model, we applied the proposed method to the real data. The results show that the proposed method is more effective than the conventional multivariate SVM, which has been known to show the best prediction performance up to now, as well as existing artificial intelligence / data mining techniques such as MDA, MLOGIT, CBR, and it is confirmed that the prediction performance is better than this. Especially, it has been confirmed that the 'instance selection' plays a very important role in predicting the stock index trend, and it is confirmed that the improvement effect of the model is more important than other factors. To verify the usefulness of GA-MSVM, we applied it to Korea's real KOSPI200 stock index trend forecast. Our research is primarily aimed at predicting trend segments to capture signal acquisition or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) of KOSPI200 stock index in Korea. Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, was classified into three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). 70% of the total data for each class was used for training and the remaining 30% was used for verifying. To verify the performance of the proposed model, several comparative model experiments such as MDA, MLOGIT, CBR, ANN and MSVM were conducted. MSVM has adopted the One-Against-One (OAO) approach, which is known as the most accurate approach among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

A Study on Legal and Institutional Improvement Measures for the Effective Implementation of SMS -Focusing on Aircraft Accident Investigation-

  • Yoo, Kyung-In
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.32 no.2
    • /
    • pp.101-127
    • /
    • 2017
  • Even with the most advanced aviation technology benefits, aircraft accidents are constantly occurring while air passenger transportation volume is expected to double in the next 15 years. Since it is not possible to secure aviation safety only by the post aircraft accident safety action of accident investigations, it has been recognized and consensus has been formed that proactive and predictive prevention measures are necessary. In this sense, the aviation safety management system (SMS) was introduced in 2008 and has been carried out in earnest since 2011. SMS is a proactive and predictive aircraft accident preventive measure, which is a mechanism to eliminate the fundamental risk factors by approaching organizational factors beyond technological factors and human factors related to aviation safety. The methodology is to collect hazards in all the sites required for aircraft operations, to build a database, to analyze the risks, and through managing risks, to keep the risks acceptable or below. Therefore, the improper implementation of SMS indicates that the aircraft accident prevention is insufficient and it is to be directly connected with the aircraft accident. Reports of duty performance related hazards including their own errors are essential and most important in SMS. Under the policy of just culture for voluntary reporting, the guarantee of information providers' anonymity, non-punishment and non-blame should be basically secured, but to this end, under-reporting is stagnant due to lack of trust in their own organizations. It is necessary for the accountable executive(CEO) and senior management to take a leading role to foster the safety culture initiating from just culture with the safety consciousness, balancing between safety and profit for the organization. Though a Ministry of Land, Infrastructure and Transport's order, "Guidance on SMS Implementation" states the training required for the accountable executive(CEO) and senior management, it is not legally binding. Thus it is suggested that the SMS training completion certificates of accountable executive(CEO) and senior management be included in SMS approval application form that is legally required by "Korea Aviation Safety Program" in addition to other required documents such as a copy of SMS manual. Also, SMS related items are missing in the aircraft accident investigation, so that organizational factors in association with safety culture and risk management are not being investigated. This hinders from preventing future accidents, as the root cause cannot be identified. The Aircraft Accident Investigation Manuals issued by ICAO contain the SMS investigation wheres it is not included in the final report form of Annex 13 to the Convention on International Civil Aviation. In addition, the US National Transportation Safety Board(NTSB) that has been a substantial example of the aircraft accident investigation for the other accident investigation agencies worldwide does not appear to expand the scope of investigation activities further to SMS. For these reasons, it is believed that investigation agencies conducting their investigations under Annex 13 do not include SMS in the investigation items, and the aircraft accident investigators are hardly exposed to SMS investigation methods or techniques. In this respect, it is necessary to include the SMS investigation in the organization and management information of the final report format of Annex 13. In Korea as well, in the same manner, SMS item should be added to the final report format of the Operating Regulation of the Aircraft and Railway Accident Investigation Board. If such legal and institutional improvement methods are complemented, SMS will serve the purpose of aircraft accident prevention effectively and contribute to the improvement of aviation safety in the future.

  • PDF

Clinical Applications and Efficacy of Korean Ginseng (고려인삼의 주요 효능과 그 임상적 응용)

  • Nam, Ki-Yeul
    • Journal of Ginseng Research
    • /
    • v.26 no.3
    • /
    • pp.111-131
    • /
    • 2002
  • Korean ginseng (Panax ginseng C.A. Meyer) received a great deal of attention from the Orient and West as a tonic agent, health food and/or alternative herbal therapeutic agent. However, controversy with respect to scientific evidence on pharmacological effects especially, evaluation of clinical efficacy and the methodological approach still remains to be solved. Author reviewed those articles published since 1980 when pharmacodynamic studies on ginseng have intensively started. Special concern was paid on metabolic disorders including diabetes mellitus, circulatory disorders, malignant tumor, sexual dysfunction, and physical and mental performance to give clear information to those who are interested in pharmacological study of ginseng and to promote its clinical use. With respect to chronic diseases such as diabetes mellitus, atherosclerosis, high blood pressure, malignant disorders, and sexual disorders, it seems that ginseng plays preventive and restorative role rather than therapeutics. Particularly, ginseng plays a significant role in ameliorating subjective symptoms and preventing quality of life from deteriorating by long term exposure of chemical therapeutic agents. Also it seems that the potency of ginseng is mild, therefore it could be more effective when used concomitantly with conventional therapy. Clinical studies on the tonic effect of ginseng on work performance demonstrated that physical and mental dysfunction induced by various stresses are improved by increasing adaptability of physical condition. However, the results obtained from clinical studies cannot be mentioned in the indication, which are variable upon the scientist who performed those studies. In this respect, standardized ginseng product and providing planning of the systematic clinical research in double-blind randomized controlled trials are needed to assess the real efficacy for proposing ginseng indication. Pharmacological mode of action of ginseng has not yet been fully elucidated. Pharmacodynamic and pharmacokinetic researches reveal that the role of ginseng not seem to be confined to a given single organ. It has been known that ginseng plays a beneficial role in such general organs as central nervous, endocrine, metabolic, immune systems, which means ginseng improves general physical and mental conditons. Such multivalent effect of ginseng can be attributed to the main active component of ginseng,ginsenosides or non-saponin compounds which are also recently suggested to be another active ingredients. As is generally the similar case with other herbal medicines, effects of ginseng cannot be attributed as a given single compound or group of components. Diversified ingredients play synergistic or antagonistic role each other and act in harmonized manner. A few cases of adverse effect in clinical uses are reported, however, it is not observed when standardized ginseng products are used and recommended dose was administered. Unfavorable interaction with other drugs has also been suggested, which the information on the products and administered dosage are not available. However, efficacy, safety, interaction or contraindication with other medicines has to be more intensively investigated in order to promote clinical application of ginseng. For example, daily recommended doses per day are not agreement as 1-2g in the West and 3-6 g in the Orient. Duration of administration also seems variable according to the purpose. Two to three months are generally recommended to feel the benefit but time- and dose-dependent effects of ginseng still need to be solved from now on. Furthermore, the effect of ginsenosides transformed by the intestinal microflora, and differential effect associated with ginsenosides content and its composition also should be clinically evaluated in the future. In conclusion, the more wide-spread use of ginseng as a herbal medicine or nutraceutical supplement warrants the more rigorous investigations to assess its effacy and safety. In addition, a careful quality control of ginseng preparations should be done to ensure an acceptable standardization of commercial products.

Export Control System based on Case Based Reasoning: Design and Evaluation (사례 기반 지능형 수출통제 시스템 : 설계와 평가)

  • Hong, Woneui;Kim, Uihyun;Cho, Sinhee;Kim, Sansung;Yi, Mun Yong;Shin, Donghoon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.109-131
    • /
    • 2014
  • As the demand of nuclear power plant equipment is continuously growing worldwide, the importance of handling nuclear strategic materials is also increasing. While the number of cases submitted for the exports of nuclear-power commodity and technology is dramatically increasing, preadjudication (or prescreening to be simple) of strategic materials has been done so far by experts of a long-time experience and extensive field knowledge. However, there is severe shortage of experts in this domain, not to mention that it takes a long time to develop an expert. Because human experts must manually evaluate all the documents submitted for export permission, the current practice of nuclear material export is neither time-efficient nor cost-effective. Toward alleviating the problem of relying on costly human experts only, our research proposes a new system designed to help field experts make their decisions more effectively and efficiently. The proposed system is built upon case-based reasoning, which in essence extracts key features from the existing cases, compares the features with the features of a new case, and derives a solution for the new case by referencing similar cases and their solutions. Our research proposes a framework of case-based reasoning system, designs a case-based reasoning system for the control of nuclear material exports, and evaluates the performance of alternative keyword extraction methods (full automatic, full manual, and semi-automatic). A keyword extraction method is an essential component of the case-based reasoning system as it is used to extract key features of the cases. The full automatic method was conducted using TF-IDF, which is a widely used de facto standard method for representative keyword extraction in text mining. TF (Term Frequency) is based on the frequency count of the term within a document, showing how important the term is within a document while IDF (Inverted Document Frequency) is based on the infrequency of the term within a document set, showing how uniquely the term represents the document. The results show that the semi-automatic approach, which is based on the collaboration of machine and human, is the most effective solution regardless of whether the human is a field expert or a student who majors in nuclear engineering. Moreover, we propose a new approach of computing nuclear document similarity along with a new framework of document analysis. The proposed algorithm of nuclear document similarity considers both document-to-document similarity (${\alpha}$) and document-to-nuclear system similarity (${\beta}$), in order to derive the final score (${\gamma}$) for the decision of whether the presented case is of strategic material or not. The final score (${\gamma}$) represents a document similarity between the past cases and the new case. The score is induced by not only exploiting conventional TF-IDF, but utilizing a nuclear system similarity score, which takes the context of nuclear system domain into account. Finally, the system retrieves top-3 documents stored in the case base that are considered as the most similar cases with regard to the new case, and provides them with the degree of credibility. With this final score and the credibility score, it becomes easier for a user to see which documents in the case base are more worthy of looking up so that the user can make a proper decision with relatively lower cost. The evaluation of the system has been conducted by developing a prototype and testing with field data. The system workflows and outcomes have been verified by the field experts. This research is expected to contribute the growth of knowledge service industry by proposing a new system that can effectively reduce the burden of relying on costly human experts for the export control of nuclear materials and that can be considered as a meaningful example of knowledge service application.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that it should be based on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variablesand the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). Especially, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematical, and leads to high performance in practical applications. SVM implements the structuralrisk minimization principle and searches to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. In addition, SVM does not require too many data sample for training since it builds prediction models by only using some representative sample near the boundaries called support vectors. A number of experimental researches have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform learning process considering the geometric mean-based accuracy and errors of multiclass. This study applies MGM-Boost to the real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross validations for threetimes with different random seeds are performed in order to ensure that the comparison among three different classifiers does not happen by chance. For each of 10-fold cross validation, the entire data set is first partitioned into tenequal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows the higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%)in terms of geometric mean-based prediction accuracy. T-test is used to examine whether the performance of each classifiers for 30 folds is significantly different. The results indicate that performance of MGM-Boost is significantly different from AdaBoost and SVM classifiers at 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-classproblems such as bond rating.

Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok;Kim, Sunwoong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.63-83
    • /
    • 2019
  • Investors prefer to look for trading points based on the graph shown in the chart rather than complex analysis, such as corporate intrinsic value analysis and technical auxiliary index analysis. However, the pattern analysis technique is difficult and computerized less than the needs of users. In recent years, there have been many cases of studying stock price patterns using various machine learning techniques including neural networks in the field of artificial intelligence(AI). In particular, the development of IT technology has made it easier to analyze a huge number of chart data to find patterns that can predict stock prices. Although short-term forecasting power of prices has increased in terms of performance so far, long-term forecasting power is limited and is used in short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that were not recognized by past technology, but it can be vulnerable in practical areas because it is a separate matter whether the patterns found are suitable for trading. When they find a meaningful pattern, they find a point that matches the pattern. They then measure their performance after n days, assuming that they have bought at that point in time. Since this approach is to calculate virtual revenues, there can be many disparities with reality. The existing research method tries to find a pattern with stock price prediction power, but this study proposes to define the patterns first and to trade when the pattern with high success probability appears. The M & W wave pattern published by Merrill(1980) is simple because we can distinguish it by five turning points. Despite the report that some patterns have price predictability, there were no performance reports used in the actual market. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of increasing pattern recognition accuracy. In this study, 16 patterns of up conversion and 16 patterns of down conversion are reclassified into ten groups so that they can be easily implemented by the system. Only one pattern with high success rate per group is selected for trading. Patterns that had a high probability of success in the past are likely to succeed in the future. So we trade when such a pattern occurs. It is a real situation because it is measured assuming that both the buy and sell have been executed. We tested three ways to calculate the turning point. The first method, the minimum change rate zig-zag method, removes price movements below a certain percentage and calculates the vertex. In the second method, high-low line zig-zag, the high price that meets the n-day high price line is calculated at the peak price, and the low price that meets the n-day low price line is calculated at the valley price. In the third method, the swing wave method, the high price in the center higher than n high prices on the left and right is calculated as the peak price. If the central low price is lower than the n low price on the left and right, it is calculated as valley price. The swing wave method was superior to the other methods in the test results. It is interpreted that the transaction after checking the completion of the pattern is more effective than the transaction in the unfinished state of the pattern. Genetic algorithms(GA) were the most suitable solution, although it was virtually impossible to find patterns with high success rates because the number of cases was too large in this simulation. We also performed the simulation using the Walk-forward Analysis(WFA) method, which tests the test section and the application section separately. So we were able to respond appropriately to market changes. In this study, we optimize the stock portfolio because there is a risk of over-optimized if we implement the variable optimality for each individual stock. Therefore, we selected the number of constituent stocks as 20 to increase the effect of diversified investment while avoiding optimization. We tested the KOSPI market by dividing it into six categories. In the results, the portfolio of small cap stock was the most successful and the high vol stock portfolio was the second best. This shows that patterns need to have some price volatility in order for patterns to be shaped, but volatility is not the best.