Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment is a complicated process in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive because domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest a genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results.
In particular, the mutation operator helps prevent GA from falling into local optima, so that the globally optimal or a near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. Other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models.
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
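
The abstract above describes a GA that searches kernel parameters and a feature subset at the same time. The Python sketch below is one way such a combined chromosome could be encoded and evolved with selection, crossover, and mutation. It is not the authors' GAMSVM implementation, which used LIBSVM and Evolver 5.5. The bit widths, parameter ranges, GA settings, and in particular `toy_fitness` are assumptions made for illustration; in a real setting the fitness would be the cross-validated accuracy of an MSVM trained with the decoded features and parameters.

```python
import math
import random

N_FEATURES = 14      # candidate financial ratios mentioned in the abstract
BITS_PER_PARAM = 8   # assumed resolution for each encoded kernel parameter
LENGTH = N_FEATURES + 2 * BITS_PER_PARAM


def decode(chrom):
    """Split a bit list into (feature mask, C, gamma)."""
    mask = chrom[:N_FEATURES]
    c_bits = chrom[N_FEATURES:N_FEATURES + BITS_PER_PARAM]
    g_bits = chrom[N_FEATURES + BITS_PER_PARAM:]

    def scale(bits, lo, hi):
        # Map the bits onto an assumed log2 range [lo, hi].
        value = int("".join(map(str, bits)), 2)
        return 2 ** (lo + (hi - lo) * value / (2 ** BITS_PER_PARAM - 1))

    return mask, scale(c_bits, -5, 15), scale(g_bits, -15, 3)


def toy_fitness(chrom):
    """Hypothetical stand-in for cross-validated MSVM accuracy."""
    mask, c, gamma = decode(chrom)
    useful = (6, 8, 12)  # made-up "informative" feature positions
    hits = sum(mask[i] for i in useful)
    extras = sum(mask) - hits
    return (hits - 0.1 * extras
            - 0.01 * abs(math.log2(c) - 3)
            - 0.01 * abs(math.log2(gamma) + 5))


def tournament(pop, fitness, rng):
    """Selection: return the fitter of two randomly drawn individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(fitness, pop_size=30, generations=40,
           cx_rate=0.7, mut_rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(LENGTH)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best individual over unchanged
        while len(nxt) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < cx_rate:       # one-point crossover
                cut = rng.randrange(1, LENGTH)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation keeps diversity in the population.
            child = [1 - g if rng.random() < mut_rate else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best


if __name__ == "__main__":
    mask, c, gamma = decode(run_ga(toy_fitness))
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
    print("C =", c, "gamma =", gamma)
```

Putting the feature mask and both kernel parameters on one chromosome means a single fitness evaluation scores a complete model configuration, which is what lets the search trade off feature choice against parameter choice rather than tuning them in separate passes.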

Stock-Index Invest Model Using News Big Data Opinion Mining (뉴스와 주가 : 빅데이터 감성분석을 통한 지능형 투자의사결정모형)

  • Kim, Yoo-Sin;Kim, Nam-Gyu;Jeong, Seung-Ryul
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.143-156 / 2012
  • People readily believe that news and stock indices are closely related. They think that securing news before anyone else can help them forecast stock prices and enjoy great profit, or at least capture an investment opportunity. However, it is no easy feat to determine to what extent the two are related, to reach an investment decision based on news, or to find out whether such investment information is valid. If the significance of news and its impact on the stock market are analyzed, it will be possible to extract information that can assist investment decisions. The reality, however, is that the world is inundated with a massive wave of news in real time, and news is unstructured text. This study suggests a stock-index investment model based on "News Big Data" opinion mining that systematically collects, categorizes, and analyzes news and creates investment information. To verify the validity of the model, the relationship between the results of news opinion mining and the stock index was empirically analyzed using statistical methods. The mining steps that convert news into information for investment decision making are as follows. First, news is received from a news provider that collects it in real time, and its information is indexed. Not only the content of the news but also various information such as media outlet, time, and news type is collected and classified, and then reworked into variables from which investment decisions can be inferred. Second, the text of the news content is separated into morphemes to derive words whose polarity can be judged, and the positive/negative polarity of each word is tagged by comparing it with a sentiment dictionary. Third, the positive/negative polarity of the news is judged using the indexed classification information and a scoring rule, and the final investment decision information is derived according to daily scoring criteria.
For this study, the KOSPI index and its fluctuation range were collected for the 63 days on which the Korea Exchange stock market was open during the three months from July to September 2011, and news data were collected by parsing 766 articles from economic news outlet M, taken from the articles carried under Stock Information > News > Main News on the portal site Naver.com. Over the three months, the stock price index rose on 33 days and fell on 30 days, and the news content comprised 197 articles published before the market opened, 385 during the session, and 184 after the market closed. Mining the collected news content and comparing it with stock prices showed that the positive/negative opinion of the news had a significant relationship with stock prices, and that changes in the stock price index were better explained when news opinion was expressed as a positive/negative ratio rather than as a simple binary judgment of positive or negative. To check whether news affected, or at least preceded, stock price fluctuations, the change in stock price was also compared only with news published before the market opened, and this relationship was verified to be statistically significant as well. In addition, because news covers various types and information, such as social, economic, and overseas news, corporate earnings, industry conditions, market outlook, and market conditions, the influence on the stock market or the significance of the relationship was expected to differ by news type. Each type of news was therefore compared with stock price fluctuations, and the results showed that market condition, outlook, and overseas news were the most useful in explaining stock price fluctuations.
In contrast, news about individual companies was not statistically significant, and its opinion mining value tended to move opposite to the stock price; a possible reason is the appearance of promotional and planned news intended to prevent stock prices from falling. Finally, multiple regression analysis and logistic regression analysis were carried out to derive an investment decision function based on the relationship between the positive/negative opinion of news and stock prices. The results showed that the regression equation using the variables of market conditions, outlook, and overseas news before the market opened was statistically significant, and the classification accuracy of the logistic regression was 70.0% for rises in stock price, 78.8% for falls, and 74.6% on average. This study first analyzed the relationship between news and stock prices by analyzing and quantifying the sentiment of unstructured news content using opinion mining, one of the big data analysis techniques, and then proposed and verified an intelligent investment decision-making model that can systematically carry out opinion mining and derive and support investment information. This shows that news can be used as a variable to predict the stock price index for investment, and the model is expected to be usable as a real investment support system if it is implemented as a system and verified in the future.
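
The mining steps in the abstract above (tagging word polarity against a sentiment dictionary, then scoring a day's news as a positive/negative ratio) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the study's actual system: the tiny English `SENTIMENT` dictionary, the regular-expression tokenizer (standing in for Korean morpheme analysis), the 0.5 threshold, and the `rise`/`fall`/`hold` labels are all hypothetical.

```python
import re

# Hypothetical mini sentiment dictionary; the study's lexicon is not given.
SENTIMENT = {
    "surge": 1, "gain": 1, "rally": 1, "record": 1,
    "fall": -1, "loss": -1, "crisis": -1, "fear": -1,
}


def polarity_counts(text):
    """Count positive and negative dictionary words in one article."""
    # Assumed stand-in for morpheme separation: lowercase word tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(1 for t in tokens if SENTIMENT.get(t, 0) > 0)
    neg = sum(1 for t in tokens if SENTIMENT.get(t, 0) < 0)
    return pos, neg


def positive_ratio(articles):
    """Share of positive words among all polar words in a day's articles."""
    pos = neg = 0
    for article in articles:
        p, n = polarity_counts(article)
        pos += p
        neg += n
    total = pos + neg
    return None if total == 0 else pos / total


def daily_signal(articles, threshold=0.5):
    """Turn the daily ratio into a coarse signal (assumed scoring rule)."""
    ratio = positive_ratio(articles)
    if ratio is None:
        return "hold"  # no polar words found, so no opinion for the day
    return "rise" if ratio > threshold else "fall"


if __name__ == "__main__":
    pre_open_news = ["Stocks rally on record gain despite crisis fear"]
    print(positive_ratio(pre_open_news), daily_signal(pre_open_news))
```

Keeping the ratio as a continuous value, rather than collapsing each article to positive or negative first, mirrors the abstract's finding that the ratio explained index changes better than a binary judgment.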

Various Life Conditions of Actors of Joseon Periods in Unofficial Historical Stories (야담 문학에 나타난 조선 배우의 삶)

  • Choi, Nakyong
    • (The) Research of the performance art and culture / no.23 / pp.281-312 / 2011
  • The aim of this study is to examine the various life conditions of actors of the Joseon period as depicted in unofficial historical stories. Yadam literature (Korean unofficial historical stories) consists of stories handed down orally among the people that were collected and written down by the Sadaebu (the former Korean nobility and Confucian intelligentsia). Yadam literature was therefore a hybrid of folk culture and the culture of the ruling class. It mixed and adapted legends and folktales, adding literary imagination. Owing much to its prose inspiration, it played a decisive role in cultivating the novel during the 18th and 19th centuries, and it is also highly valued as excellent fiction in its own right. Yadam literature has verisimilitude because it described contemporary reality as it was, drawing freely on prose inspiration. In those days, performers called Suchok and Seunggwangdae performed Uhee (a comic theatrical performance). The Suchok belonged to the lowest class of people, and the Seunggwangdae were performing Buddhist monks. Uhee comprised three kinds of comedy: one satirized and made insinuations about kings, another satirized corrupt officials, and the third mimicked all kinds of things. Uhee was so well known at the time that the king knew its repertoire. Confucian scholars of the period were very fond of Uhee because they favored the criticism in its satire and thought that it gave people a good lesson or instruction. Henri Bergson said that the comic and humor contain a lesson. At that time, such thinking was universal, whether in the East or the West. In this study, I classify Uhee in Yadam literature into six types. First, satirizing and accusing corrupt officials. Second, an actor who uses satire to appeal to the king to secure a government position for his lord. Third, shamans and actors who use satire to appeal to the king about their own sufferings. Fourth, actors and performing Buddhist monks who skillfully mimic anything. Fifth, descriptions of actors' extremely miserable lives.
Sixth, the wit and humor of actors. The contents of Uhee were varied. Korean traditional actors adeptly handled comic elements such as wit, satire, and humor, and sometimes shifted freely among them. Through this, a great number of people fully enjoyed a sense of freedom. Korean traditional actors belonged to the lowest class of people and lived extremely miserable lives. Yet they existed through actions, interactions, and relationships within the society of those days. They were not only open to the people but may also have fostered community among them.

The Effect of Users' Personality on Emotional and Cognitive Evaluation in UCC Web Site Usage (UCC(user-created-contents) 웹 사이트에서 사용자의 인성이 감정적, 인지적 평가와 UCC 활용에 미치는 영향)

  • Moon, Yun-Ji;Kang, So-Ra;Kim, Woo-Gon
    • Asia pacific journal of information systems / v.20 no.3 / pp.167-190 / 2010
  • This research focuses on factors that affect the behavior of UCC (User Created Content) website users other than the user's rational recognition of how useful a UCC website can be. Most discussions in the existing literature on information systems have focused on users' evaluation of how a UCC website can help them attain their own goals. However, there are other factors, and this research pays attention to an individual's 'personality,' which is stable and biological in nature. Specifically, I note that 'extroversion' and 'neuroticism,' the two personality factors common to the most representative models, Eysenck's 'EPQ Model' and the 'Big Five Model,' affect a site's 'usefulness,' by which I mean how useful the user considers the website and its content. Perceived usefulness has been regarded in existing MIS (Management Information System) research as an antecedent factor that influences the adoption of information systems. Secondly, as using or creating a UCC website does not guarantee extrinsic motivation for the user or the creator, unlike using an information system within an organization, there is a greater likelihood that the increase in users' activities in relation to a UCC website is motivated by emotional factors rather than rational factors. Thus, I have included the relationship between an individual's personality and what they find pleasurable in the research model. Thirdly, according to the S-O-R paradigm of Mehrabian and Russell, the two kinds of factors, cognitive and emotional, are affected by stimuli, and these factors in turn affect an individual's response behavior. Therefore, this research hypothesizes that the recognition of how useful the site and its content are, and the emotional pleasure they provide, will ultimately affect the behavior of UCC website users.
Finally, the relationship between the recognition of how useful a site is, how pleasurable it is to use, and UCC usage may differ depending on certain situational conditions. In other words, the relationship among the three factors may vary according to how much users are involved in the creation of the website content. Creation thus emerges as the keyword of UCC. I analyzed the above relationships using the user's involvement in the creation of the site as a moderating variable. The results show the following. Regarding the relationship between an individual's personality and what they find pleasurable, extroverted users are more likely to feel pleasure when using a UCC website, as expected. This in turn leads to more active usage of the UCC website, because an extrovert likes to spend time on activities with other people, is sensitive to new experiences and stimuli, and thus actively responds to them. An extroverted person accepts new UCC activities as part of his/her social life rather than withdrawing from the new UCC environment. This is represented by the term 'Foxonomy,' where users meet a variety of users from all over the world and encounter new types of content created by them. However, neuroticism creates the opposite situation. The representative symptoms of neuroticism are instability, stress, and tension. These dispositions are more closely related to stress caused by a new environment than to curiosity or pleasure. Thus, neurotic persons feel uneasy and will eventually avoid situations where their own or others' daily lives are frequently exposed to the open web environment, which eventually gives them a negative attitude towards the web environment.
Regarding an individual's personality and how useful a site is, the two personality factors of extroversion and neuroticism both have a positive relationship with the recognition of how useful the site and its content are. The positive, curious, and social dispositions of extroverted persons tend to make them consider the future usefulness and possibilities of a new type of information system or website, which has a significant influence on their recognition of how useful UCC sites are. Neuroticism also favorably affects the perceived usefulness of a UCC website, through a different mechanism from that of extroversion. As neurotic persons tend to feel uneasy and have considerable doubt about a new type of information system, they actively explore its usefulness in order to relieve their uncomfortable feelings. In other words, neurotic persons seek out how useful a site can be in order to secure their own emotional stability. Meanwhile, extroverted persons explore how useful a site can be because of their positive attitude and curiosity. As much MIS research has revealed, the recognition of how useful a site can be and how pleasurable it is to use has a significant effect on UCC activity. However, the relationship between these factors differs according to the user's involvement in creation. This factor gauges the interest of users in the creation of UCC content. Involvement is a variable that shows the level of an individual's mental effort in creating UCC content. When a user is highly involved in the creation process and makes an enormous effort to create UCC content (the high-involvement group), their own pleasure and recognition of how useful the site is have a significantly greater effect on future usage of UCC content than they do for users who sit back and just retrieve UCC content created by others.
The cognitive and emotional response of those in the low-involvement group is unlikely to last long, even if they recognize that the content of a UCC website is pleasurable and useful to them. However, the high-involvement group tends to participate in the creation and usage of UCC more favorably, connecting the experience with their own goals. In this respect, this research presents an answer to the question of why so many people participate in the usage of UCC, the representative form of Web 2.0 that has drawn more and more people into the creation of UCC, even though they cannot gain any monetary or social compensation. Neither an information system nor a website can succeed unless it secures a certain user base. Moreover, it cannot be further developed if the reasons for, or problems with, people's participation are not suitably explored, even if it has a certain user base. Thus, what is significant in this research is that it studied users' response behavior based on an individual's innate personality, emotion, and cognitive interaction, unlike existing research that has focused on 'compensation' to explain users' participation in UCC websites. There are also limitations in this research. Firstly, I divided an individual's personality into extroversion and neuroticism; however, there are other personality factors, such as neuro-psychiatricism, whose influence on UCC activities also needs to be analyzed. Secondly, as UCC websites come in many types such as multimedia, wikis, and podcasting, these types need to be included as sub-categories of UCC websites, and their relationship with personality, emotion, cognition, and behavior also needs to be analyzed.