• Title/Summary/Keyword: Classification accuracy


LSTM-based Deep Learning for Time Series Forecasting: The Case of Corporate Credit Score Prediction (시계열 예측을 위한 LSTM 기반 딥러닝: 기업 신용평점 예측 사례)

  • Lee, Hyun-Sang;Oh, Sehwan
    • The Journal of Information Systems / v.29 no.1 / pp.241-265 / 2020
  • Purpose: Various machine learning techniques are used to predict corporate credit. However, previous research does not use time-series input features and is limited in its prediction timing. Furthermore, in corporate bond credit rating forecasts the sample is limited, because only large companies receive corporate bond credit ratings. To address these limitations, this study implements a predictive model that covers more sample companies and can adjust the forecasting point relative to the present by using credit score information and corporate information as time series. Design/methodology/approach: To implement this forecasting model, the study uses a sample of 2,191 companies with KIS credit scores over 18 years, from 2000 to 2017. To improve the performance of the predictive model, various financial and non-financial features are applied as time-series input variables through a sliding window technique. To increase the validity of the results, the study also tests several traditionally used machine learning techniques alongside the deep learning technique that has recently been actively researched. Findings: An RNN-based stateful LSTM model performs well in credit rating prediction. By extending the forecasting time point, we show how the performance of the predictive model changes over time and evaluate the feature groups in the short and long terms. Compared with other studies, five-class prediction through label reclassification performs relatively well. In addition, accuracy of about 90% is obtained for bad-credit forecasts.
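
The sliding window technique mentioned in the abstract above can be illustrated with a minimal sketch. The function name `sliding_windows`, the parameters `window` and `horizon`, and the sample scores are all hypothetical; the paper's actual window length, features, and preprocessing are not given in the abstract.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], window: int, horizon: int = 1
) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive values with the value
    that lies `horizon` steps after the end of that run."""
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = list(series[start:start + window])
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical yearly credit scores for one company (not data from the paper).
scores = [3, 3, 4, 4, 5, 6]
pairs = sliding_windows(scores, window=3, horizon=1)
```

Raising `horizon` moves the prediction target further into the future, which is one simple way to picture "extending the forecasting time point"; it also reduces the number of training pairs a fixed-length series can yield.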

Comparison of field- and satellite-based vegetation cover estimation methods

  • Ko, Dongwook W.;Kim, Dasom;Narantsetseg, Amartuvshin;Kang, Sinkyu
    • Journal of Ecology and Environment / v.41 no.2 / pp.34-44 / 2017
  • Background: Monitoring terrestrial vegetation cover is important for evaluating its current condition and identifying potential vulnerabilities. Because of its simplicity and low cost, the point intercept method has been widely used to evaluate grassland surfaces and quantify cover conditions. The field-based digital photography method is gaining popularity for cover estimation, as it can reduce field time and enable additional analysis in the future. However, the caveats and uncertainties of field-based vegetation cover estimation methods are not well known, especially across a wide range of cover conditions. We compared cover estimates from the point intercept and digital photography methods with varying sampling intensities (25, 49, and 100 points within an image), across 61 transects in typical steppe, forest steppe, and desert steppe in central Mongolia. We classified three groups of cover important to grassland ecosystem functioning: photosynthetic vegetation, non-photosynthetic vegetation, and bare soil. We also acquired the normalized difference vegetation index from satellite images for comparison with the field-based cover. Results: Photosynthetic vegetation estimates from the point intercept method were correlated with the normalized difference vegetation index, and the correlation improved when non-photosynthetic vegetation was combined. For the digital photography method, photosynthetic and non-photosynthetic vegetation estimates individually showed no correlation with the normalized difference vegetation index, but their combination showed a moderate and significant correlation, which increased slightly with greater sampling intensity. Conclusions: The results imply that varying greenness plays an important role in classification accuracy and confusion. We suggest adopting measures to reduce observer bias and to better distinguish greenness levels, in combination with multispectral indices, to improve estimates of dry matter.

Feature Selection Method by Information Theory and Particle Swarm Optimization (상호정보량과 Binary Particle Swarm Optimization을 이용한 속성선택 기법)

  • Cho, Jae-Hoon;Lee, Dae-Jong;Song, Chang-Kyu;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.2 / pp.191-196 / 2009
  • In this paper, we propose a feature selection method using Binary Particle Swarm Optimization (BPSO) and mutual information. The proposed method consists of a candidate feature selection part, which selects a candidate feature subset by mutual information, and an optimal feature selection part, which uses BPSO to choose the optimal feature subset from the candidates. In the candidate feature selection part, we compute the mutual information of each feature and select a candidate feature subset according to the mutual information ranking. In the optimal feature selection part, BPSO searches the candidate feature subset for the optimal feature subset. In the BPSO process, we use a multi-objective function to optimize both classifier accuracy and the size of the selected feature subset. DNA expression data are used to estimate the performance of the proposed method. Experimental results show that this method can achieve better performance on pattern recognition problems than conventional ones.
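
The candidate-selection stage described above (ranking features by mutual information with the class label) can be sketched as follows. This is an illustration only, assuming discrete feature values. The names `mutual_information` and `rank_features` and the toy `gene_*` data are hypothetical, and the second stage (the BPSO search) is not shown.

```python
from collections import Counter
from math import log2
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence, ys: Sequence) -> float:
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(
    features: Dict[str, Sequence], labels: Sequence, top_k: int
) -> List[str]:
    """Return the names of the top_k features by mutual information with labels."""
    ranked = sorted(
        features,
        key=lambda name: mutual_information(features[name], labels),
        reverse=True,
    )
    return ranked[:top_k]


# Toy, made-up data: four samples, three binary features.
labels = [0, 0, 1, 1]
features = {
    "gene_a": [0, 0, 1, 1],  # identical to the label: 1 bit of information
    "gene_b": [0, 1, 0, 1],  # independent of the label: 0 bits
    "gene_c": [0, 0, 0, 1],  # partially informative
}
candidates = rank_features(features, labels, top_k=2)
```

Continuous measurements such as expression levels would need to be discretized, or handled with a different estimator, before a count-based computation like this applies.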

User Recognition Method using Human Body Impulse Response Signals (인체의 임펄스 응답 신호를 이용한 사용자 인식 방법)

  • Park, Beom-Su;Kang, Eun-Jung;Kang, Taewook;Lee, Jae-Jin;Kim, Seong-Eun
    • Journal of IKEEE / v.24 no.1 / pp.120-126 / 2020
  • We present a user recognition method using human body impulse response signals. Body composition varies from person to person depending on the proportions of water, muscle, and fat. In body communication studies, the body has been modeled as a circuit of capacitances and resistances whose characteristics are determined by body composition. Therefore, each individual's body channel is unique and can be used for user recognition. In this paper, we applied pseudo impulse signals to the left hand and recorded the received signals from the right hand. The empirical mode decomposition (EMD) method was used to remove noise from the received signals, and 10 peak values were extracted. We used the differences between peak amplitudes as the key feature for identifying individuals. We collected data from 6 subjects and achieved an accuracy of 97.71% in the user recognition application.

Combined Application of Data Imbalance Reduction Techniques Using Genetic Algorithm (유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용)

  • Jang, Young-Sik;Kim, Jong-Woo;Hur, Joon
    • Journal of Intelligence and Information Systems / v.14 no.3 / pp.133-154 / 2008
  • The data imbalance problem, which can be encountered in data mining classification problems, typically means that one class has many more or far fewer instances than the other classes. To solve the data imbalance problem, a number of techniques have been proposed based on re-sampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining the previously proposed techniques and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy for the minority class, we determine the combination ratio using the F-value of the minority class as the fitness function of the genetic algorithm. To compare its performance with that of the single techniques and of a matrix-style combination with random percentages, we performed experiments on four public datasets that have generally been used to compare methods for the data imbalance problem. The experimental results demonstrate the usefulness of the proposed method.
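
A fitness function based on the minority-class F-value, as described above, might look like the following minimal sketch. It assumes the balanced F-measure (the harmonic mean of precision and recall); the abstract does not state which weighting the authors used. The function name and the example labels are hypothetical.

```python
from typing import Sequence


def minority_f_value(
    y_true: Sequence[int], y_pred: Sequence[int], minority_label: int = 1
) -> float:
    """Balanced F-measure computed for the minority class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority_label and p == minority_label)
    fp = sum(1 for t, p in pairs if t != minority_label and p == minority_label)
    fn = sum(1 for t, p in pairs if t == minority_label and p != minority_label)
    if tp == 0:
        return 0.0  # no minority instance was correctly identified
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example: 8 majority instances (0) and 2 minority instances (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
fitness = minority_f_value(y_true, y_pred)

# A classifier that always predicts the majority class is 80% accurate here,
# yet its minority-class F-value is 0.
always_majority = minority_f_value(y_true, [0] * 10)
```

The second call shows why overall accuracy is a poor fitness signal on imbalanced data, which is the motivation for scoring candidate combination ratios by the minority class instead.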


A study on the identity theft detection model in MMORPGs (MMORPG 게임 내 계정도용 탐지 모델에 관한 연구)

  • Kim, Hana;Kwak, Byung Il;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.3 / pp.627-637 / 2015
  • As game item trading has become more popular with the rapid growth of the online game market, the market for trading game items for cash has grown to KRW 1.6 trillion. Because of this active market, it has become easy to turn items and game money into real money. As a result, some malicious users attempt to steal other players' rare and valuable game items by using their accounts. This study therefore proposes a detection model based on an analysis of these account thieves' behavior in Massively Multiplayer Online Role-Playing Games (MMORPGs). In online game identity theft, the thieves engage in economic activities solely to steal game items and game money. This pattern contains particular sequences such as item production, item sales, and acquisition of game money. Based on this pattern, the study proposes a detection model. Classification based on this detection model achieved 86 percent accuracy. In addition, the study analyzed trading patterns at the time online game identities were stolen.

Korean Goral Potential Habitat Suitability Model at Soraksan National Park Using Fuzzy Set and Multi-Criteria Evaluation (설악산국립공원내 산양(Nemorhaedus Caudatus Raddeanus)의 잠재 서식지 적합성 모형; 다기준평가기법(MCE)과 퍼지집합(Fuzzy Set)의 도입을 통하여)

  • Choi Tae-Young;Park Chong-Hwa
    • Journal of the Korean Institute of Landscape Architecture / v.32 no.4 / pp.28-38 / 2004
  • The Korean goral (Nemorhaedus caudatus raddeanus) is an endangered species in Korea, and the rugged terrain of Soraksan National Park (373 km²) is a critical habitat for the species. However, the goral population is threatened by habitat fragmentation caused by roads and hiking trails. The objective of this study was to develop a potential habitat suitability model for the Korean goral in the park, based on the concepts of fuzzy set theory and multi-criteria evaluation. The suitability modeling process was divided into three steps. First, data for the modeling were collected through field work and a literature survey. The collected data included 204 GPS points obtained through a goral trace survey and the number of daily visitors to each hiking trail during the park's peak season. Second, fuzzy set theory was used to build a GIS database of the environmental factors affecting the suitability of goral habitat. Finally, a multi-criteria evaluation was performed to produce the goral habitat suitability model. The results of the study were as follows. First, suitable habitats were characterized by proximity to rock cliffs, scattered pine (Pinus densiflora) patches, ridges, an elevation of 700 to 800 m, and south and southeast aspects. Second, the habitat suitability model had a high classification accuracy of 93.9% for the analysis site and 95.7% for the validation site at a cut-off value of 0.5. Finally, 11.7% of the habitat with a habitat suitability index above 0.5 was affected by roads and hiking trails in the park.
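
To make the combination of fuzzy membership and multi-criteria evaluation concrete, here is a minimal sketch for a single map cell. It assumes linear membership functions and a weighted linear combination, a common way to aggregate criteria that the abstract does not confirm as the authors' choice. The criterion names, distances, break points, and weights are all hypothetical; only the 0.5 cut-off comes from the abstract.

```python
from typing import Dict


def linear_membership(value: float, low: float, high: float) -> float:
    """Increasing linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def suitability_index(
    memberships: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Weighted linear combination of fuzzy memberships, normalized by total weight."""
    total_weight = sum(weights.values())
    return sum(memberships[name] * w for name, w in weights.items()) / total_weight


# Hypothetical criteria for one map cell (values and break points are made up).
cell = {
    # Closer to rock cliffs is better, so invert the increasing membership.
    "cliff_proximity": 1.0 - linear_membership(150.0, low=0.0, high=500.0),
    # Farther from hiking trails is better.
    "trail_distance": linear_membership(400.0, low=0.0, high=1000.0),
}
weights = {"cliff_proximity": 0.6, "trail_distance": 0.4}

index = suitability_index(cell, weights)
is_suitable = index > 0.5  # cut-off value of 0.5, as reported in the abstract
```

In a GIS workflow the same calculation would be applied to every raster cell, and the resulting index map thresholded at the cut-off to delineate potential habitat.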

Performance Improvement of a Real-time Traffic Identification System on a Multi-core CPU Environment (멀티 코어 환경에서 실시간 트래픽 분석 시스템 처리속도 향상)

  • Yoon, Sung-Ho;Park, Jun-Sang;Kim, Myung-Sup
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.5B / pp.348-356 / 2012
  • Application traffic analysis is becoming increasingly challenging because of the huge volume of traffic on high-speed network links and the variety of applications running on wired and wireless Internet devices. A multi-level combination of various analysis methods is desirable for achieving high completeness and accuracy in a real-time analysis system, but it imposes a heavy processing burden. This paper proposes a novel architecture for a real-time traffic analysis system that improves processing performance in a multi-core CPU environment. The main contribution of the proposed architecture is an efficient parallel processing mechanism that runs the various analysis methods in multiple threads. The feasibility of the proposed architecture was demonstrated by implementing and deploying it on our campus network.

Analysis of the Characteristics of the Older Adults with Depression Using Data Mining Decision Tree Analysis (의사결정나무 분석법을 활용한 우울 노인의 특성 분석)

  • Park, Myonghwa;Choi, Sora;Shin, A Mi;Koo, Chul Hoi
    • Journal of Korean Academy of Nursing / v.43 no.1 / pp.1-10 / 2013
  • Purpose: The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. Methods: A large dataset from the 2008 Korean Elderly Survey was used, and data from 14,970 elderly people were analyzed. The target variable was depression, and the 53 input variables covered general characteristics, family & social relationships, economic status, health status, health behavior, functional status, leisure & social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique, using the SPSS Window 19.0 and Clementine 12.0 programs. Results: The decision trees yielded five different rules defining the characteristics of older adults with depression. Among the data mining models, the Classification & Regression Tree (C&RT) gave the best prediction, with an accuracy of 80.81%. The factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation in basic or instrumental daily activities, number of chronic diseases, and daily activity difficulty due to disease. Conclusion: The rules identified by the decision tree model in this study should serve as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.

Classifying Finger Flexing Motions with Surface EMG Using Entropy and The Maximum Likelihood Method (엔트로피 및 최대우도추정법을 이용한 표면 근전도 기반 손가락 동작 인식)

  • You, Kyung-Jin;Shin, Hyun-Chool
    • Journal of the Institute of Electronics Engineers of Korea SC / v.46 no.6 / pp.38-43 / 2009
  • We provide a method for inferring finger flexing motions using a 4-channel surface electromyogram (sEMG). Surface EMGs are harmless to the human body and easily acquired. However, unlike invasive EMGs, they do not reflect the activity of specific nerves or muscles. As a result, the non-invasive type is difficult to use for discriminating various motions when only a small number of electrodes is available. The surface EMG data in this study were obtained from four electrodes placed around the forearm. The motions were flexion of the thumb, index, middle, ring, and little finger. One subject was trained in these motions and the other was untrained. Maximum likelihood estimation was used to infer the finger motion. Experimental results showed that this method could be useful for recognizing finger motions. The average accuracy was as high as 95%.