
Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machines (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and achieves high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Furthermore, SVM does not require many data samples for training, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance.
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well on multi-class problems as SVM does on binary-class problems. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, multi-class prediction problems are prone to data imbalance, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often lead to a default classifier with a skewed boundary, which reduces classification accuracy. SVM ensemble learning is one machine learning method for coping with these drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem.
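The reweighting step that AdaBoost performs between rounds can be sketched as follows. This is a minimal illustration of the standard AdaBoost.M1 update, not the MGM-Boost variant, whose exact formulas are not given in the abstract. The function name, the rating labels, and the toy predictions are hypothetical.

```python
import math

def adaboost_m1_update(weights, y_true, y_pred):
    """One AdaBoost.M1 reweighting round (standard algorithm, not MGM-Boost).

    Returns (new_weights, alpha), where alpha is the voting weight of the
    classifier that produced y_pred.
    """
    # Weighted error: total weight sitting on misclassified observations.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0.0 < error < 0.5:
        raise ValueError("AdaBoost.M1 needs a weighted error in (0, 0.5)")
    beta = error / (1.0 - error)
    # Shrink correctly classified observations; misclassified ones keep
    # their weight, so their share grows after normalization.
    scaled = [w * beta if t == p else w
              for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    new_weights = [w / total for w in scaled]
    alpha = math.log(1.0 / beta)
    return new_weights, alpha

# Hypothetical toy data: five bonds, the last one misclassified.
y_true = ["AAA", "AA", "A", "BBB", "BBB"]
y_pred = ["AAA", "AA", "A", "BBB", "A"]
weights = [0.2] * 5
new_weights, alpha = adaboost_m1_update(weights, y_true, y_pred)
```

After the update, the single misclassified observation carries half of the total weight, so the next classifier is pushed to get it right.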
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. Ten-fold cross-validation is performed three times with different random seeds to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
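The gap between the arithmetic and geometric mean-based accuracies reported above can be illustrated with a small sketch. The abstract does not define the geometric mean-based accuracy, so this assumes the common definition: the geometric mean of per-class recall. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def arithmetic_accuracy(y_true, y_pred):
    """Plain accuracy: share of all observations predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))

# Hypothetical imbalanced sample: 8 bonds rated "A", 2 rated "B".
y_true = ["A"] * 8 + ["B"] * 2
default_pred = ["A"] * 10  # a "default classifier" that always says "A"

acc = arithmetic_accuracy(y_true, default_pred)
gmean = geometric_mean_accuracy(y_true, default_pred)
```

Here the default classifier reaches an arithmetic accuracy of 0.8 but a geometric mean of 0, because it never recovers the minority class. Under this assumed definition, a measure of this kind penalizes the skewed boundaries that data imbalance produces.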

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.149-169 / 2020
  • The "Urban Regeneration New Deal Project", one of the government's major national projects, aims to develop underdeveloped areas by investing 50 trillion won in 100 locations in the first year and 500 over the next four years. This project is drawing keen attention from the media and local governments. However, the project model fails to reflect the original characteristics of each area, because it divides project areas into five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type." The keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation" and "public-private cooperation," suggest that when local governments propose urban regeneration projects to the government, it is most important to accurately understand the characteristics of the city and to pursue the projects in a way that suits those characteristics, with the help of local residents and private companies. In addition, considering the gentrification problem, which is one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suitable for the characteristics of the area. To supplement the limitations of the "Urban Regeneration New Deal Project" methodology, this study proposes a system that uses various machine learning algorithms to recommend urban regeneration types suitable for urban regeneration sites, referring to the urban regeneration types of the "2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan", which is based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017).
To identify regional characteristics, approximately 100,000 pieces of text data were collected for 22 regions where projects were carried out across the four types of urban regeneration. Using the collected data, we extracted keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that a number of topics related to real estate and the economy appeared in old residential areas, and that in declining and underdeveloped areas, topics reflecting the characteristics of areas where industrial activities were active in the past appeared. In the historical and cultural resource areas, which contain traces of the past, many keywords related to the government appeared, so political topics and cultural topics resulting from various events could be confirmed. Finally, in low-use and under-developed areas, many topics on real estate and accessibility emerged, indicating regions with good accessibility where development is planned or likely. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning was used to implement the model, and training data and test data were randomly split at an 8:2 ratio. To compare performance across models, the input variables were set in two ways, Count Vector and TF-IDF Vector, and five classifiers were applied: SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting. Performance was thus compared across a total of 10 models. The model with the highest performance was Gradient Boosting with TF-IDF Vector input data, with an accuracy of 97%.
Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new project sites in the process of carrying out urban regeneration projects.
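The two input representations compared in the study, Count Vector and TF-IDF Vector, can be sketched as follows. This is a minimal illustration only: it assumes whitespace tokenization and the classic weighting tf × log(N / df), which may differ from the tokenizer and TF-IDF variant the authors used. The function names and the three toy documents are hypothetical.

```python
import math
from collections import Counter

def count_vectors(docs):
    """Raw term counts per document over a shared, sorted vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[tok] for tok in vocab])
    return vocab, vectors

def tfidf_vectors(docs):
    """Term counts scaled by log(N / document frequency) (assumed variant)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]

# Hypothetical toy corpus standing in for the regional text data.
docs = [
    "housing redevelopment housing price",
    "factory industry decline",
    "palace heritage culture housing",
]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
housing, price = vocab.index("housing"), vocab.index("price")
```

In the first toy document, "housing" has the higher raw count, but "price" receives the higher TF-IDF weight because it appears in only one document. Down-weighting terms shared across regions is the usual motivation for comparing the two representations.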

A Study on the Changes in Gwi-po from Tang to Jin Dynasty in China - Focusing on the connection type of Jwau-dae(左右隊) - (중국 당대~금대 목조 건축의 귀포 변천에 관한 연구 - 좌우대의 결구 유형을 중심으로 -)

  • Lee, Byung-Chun;Lee, Ho-Yeol
    • Korean Journal of Heritage: History & Science / v.48 no.3 / pp.96-119 / 2015
  • This study examines the changes of Gwi-po(轉角包), taking China's medieval wooden buildings as cases. Its purpose is to examine the period-by-period transition of Gwi-po through 71 wooden buildings that were built from the Tang(唐) dynasty(AD 618~690 & 705~907) to the Jin(金) dynasty(AD 1115~1234) and are designated as 'Major Historical and Cultural Sites Protected at the National Level'. The study focuses on the various frame types of Jwau-dae(左右隊), which are architectural components of Gwi-po, to trace the changes and development of Gwi-po. The results are as follows. An important factor in the transformation of the Gwi-po format was the changing perception of Jwau-dae among the craftsmen in charge of the building process. In the early periods, the principles of Yidou sansheng dougong(一斗三升) in constructing the ancons of Gwi-po were well maintained, while many different types of Gwi-po appeared in later periods, due to the use of Jwau-dae and Shuǎ tóu(耍頭) in each Chulmok of Gwi-po. Transitional types of Gwi-po, which evolved from the earlier ones, are divided into three categories by the form of Jwau-dae placed on odd-numbered stages. The first is the 'non-fāng tóu(無枋頭) type' of Song(AD 960~1127, 1127~1279) and Liao dynasty(AD 907~1125) buildings, which has no fāng tóu(枋頭) because Jwau-dae(左右隊) is in direct contact with Gwihan-dae(耳限大). The second is the 'Shuǎ tóu fāng tóu(耍頭枋頭) type' of the Song(AD 960~1127, 1127~1279) and Jin dynasty(AD 1115~1234), in which the fāng tóu(枋頭) of Jwau-dae(左右隊) is identical to Shuǎ tóu(耍頭) in form.
The last is the 'Xiǎo gǒng tóu(小栱頭) type' of the Jin(AD 1115~1234) and Yuan dynasty(AD 1271~1368), in which the fāng tóu(枋頭) of Jwau-dae is identical to Xiǎo gǒng tóu(小栱頭) in form. The earlier forms of Gwi-po, which appeared between the Tang dynasty(AD 618~690 & 705~907) and the Five Dynasties period(907~960), went through the transitional forms of the 'non-fāng tóu(無枋頭) type', the 'Shuǎ tóu fāng tóu(耍頭枋頭) type' and the 'Xiǎo gǒng tóu(小栱頭) type', and the form was finally settled between the Yuan(元, AD 1271~1368) and Ming(明, AD 1368~1644) dynasty periods. In the Liao(遼) dynasty period(AD 907~1125), as buildings grew larger and eaves were extended further, there arose a need to structurally reinforce Gwi-po, on which the load of the whole roof is concentrated. In particular, the transition from the Tōuxīn zào(偸心造) style to the Jì xīn zào(計心造) style in this period had a great influence on the standardization of Gwi-po, along with the None-Áng(無仰) style. Furthermore, the Wing-type Gong(翼型栱), which developed in the Liao dynasty(AD 907~1125), is also thought to have had a great influence on the transition from the Tōuxīn zào(偸心造) style to the Jì xīn zào(計心造) style by changing the forms of Gongs(栱), such as Gwi-po. However, unlike in the None-Áng(無仰) style, a gradual change occurred from the 'Shuǎ tóu fāng tóu(耍頭枋頭) type' to the 'Xiǎo gǒng tóu(小栱頭) type' of Gwi-po in the Xià áng style.