• Title/Summary/Keyword: Improved Support Vector Machine

A Novel Grasshopper Optimization-based Particle Swarm Algorithm for Effective Spectrum Sensing in Cognitive Radio Networks

  • Ashok, J;Sowmia, KR;Jayashree, K;Priya, Vijay
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.520-541 / 2023
  • In cognitive radio networks (CRNs), spectrum sensing (SS) is of utmost significance. During the training phase, every cognitive radio (CR) user generates a sensing report under various conditions and, depending on a collective decision, either transmits or remains silent. In the training stage, the fusion centre combines the local decisions made by the CR users by majority vote and returns a final decision to every CR user. During this stage, enough data about the environment is acquired, including the activity of the primary user (PU) and every CR user's response to that activity, and sensing classes are created. In the classification stage, every CR user compares its most recent sensing report with the previous sensing classes, and distance vectors are generated. The posterior probability of every sensing class is derived from quantitative data, and the sensing report is then classified as signifying either the presence or the absence of the PU. The ISVM technique is used to compute the quantitative variables needed for the posterior probability. Here, the iterations of the SVM are tuned by a novel GO-PSA that combines the Grasshopper Optimization Algorithm (GOA) and Particle Swarm Optimization (PSO). The novel GO-PSA is developed because it overcomes the problem of computational complexity, returns minimum error, and saves time compared with various state-of-the-art algorithms. The local decisions are then integrated at the fusion centre using an innovative decision combination technique that takes the dependability of every CR user into account. Depending on the collective decision, the CR users then transmit or remain silent.
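The majority-vote fusion step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fuse_majority`, the sample reports, and the tie-breaking rule (a tie counts as "PU absent") are assumptions made for illustration, and the reliability-weighted combination mentioned in the abstract is not modelled.

```python
def fuse_majority(local_decisions):
    """Fuse binary local decisions (1 = PU present, 0 = PU absent).

    Returns 1 only when strictly more than half of the CR users report 1.
    Treating a tie as "absent" is an assumption of this sketch.
    """
    votes_present = sum(local_decisions)
    return 1 if votes_present * 2 > len(local_decisions) else 0


# Hypothetical local decisions from five CR users.
reports = [1, 0, 1, 1, 0]
global_decision = fuse_majority(reports)

# CR users stay silent while the primary user is judged to be active.
action = "remain silent" if global_decision == 1 else "transmit"
print(global_decision, action)
```

A weighted variant would multiply each vote by a per-user reliability score before comparing against the threshold.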

Evaluation of Classification Models of Mild Left Ventricular Diastolic Dysfunction by Tei Index (Tei Index를 이용한 경도의 좌심실 이완 기능 장애 분류 모델 평가)

  • Su-Min Kim;Soo-Young Ye
    • Journal of the Korean Society of Radiology / v.17 no.5 / pp.761-766 / 2023
  • In this paper, the Tei index (TI) was measured to classify the presence or absence of mild left ventricular diastolic dysfunction. Of the 306 records in total, 206 were used as training data and 100 as test data, and SVM and KNN were used as the machine learning classifiers. The results confirmed that SVM achieved relatively higher accuracy than KNN and was more useful for diagnosing left ventricular diastolic dysfunction. In future research, classification performance is expected to improve further by adding indicators that evaluate cardiac function in addition to TI and by securing more data. Furthermore, the results are expected to serve as basic data for predicting and classifying other diseases and for addressing the shortage of medical personnel relative to the increasing number of tests.

Emotion Prediction System using Movie Script and Cinematography (영화 시나리오와 영화촬영기법을 이용한 감정 예측 시스템)

  • Kim, Jinsu
    • Journal of the Korea Convergence Society / v.9 no.12 / pp.33-38 / 2018
  • Recently, there have been efforts to predict emotion from various kinds of information and to convey to the audience the emotion that the director intends. Audiences, in turn, try to understand the flow of emotions through the non-dialogue parts of a film, such as cinematography, scene background, and background sound. In this paper, we propose extracting emotions by combining the context of the script with cinematography information such as color, background sound, composition, and arrangement. In other words, we propose an emotion prediction system that learns and distinguishes various emotional expression techniques in dialogue and non-dialogue regions, contributes to the completeness of the movie, and adapts quickly to new changes. The precision of the proposed system improves by about 5.1% and 0.4%, and the recall by about 4.3% and 1.6%, compared with the modified n-gram and morphological analysis, respectively.

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.139-153 / 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Bankruptcies have increased through successive economic crises, and bankruptcy prediction models have become more and more important. Corporate bankruptcy has therefore been regarded as one of the major research topics in business management, and many studies are also in progress in industry. Previous studies used statistical methodologies such as Multivariate Discriminant Analysis (MDA) and the Generalized Linear Model (GLM) to improve bankruptcy prediction accuracy and to resolve the overfitting problem. More recently, researchers have used machine learning methodologies such as Support Vector Machine (SVM) and Artificial Neural Network (ANN), as well as fuzzy theory and genetic algorithms. As a result, many bankruptcy models have been developed and performance has improved. In general, a company's financial and accounting information changes over time, and so does the market situation, which makes it difficult to predict bankruptcy using only information from a single point in time. However, although traditional research does not take this time effect into account, dynamic models have not been studied much. Ignoring the time effect yields biased results, so a static model may not be suitable for predicting bankruptcy, and a dynamic model may improve bankruptcy prediction. In this paper, we propose a Recurrent Neural Network (RNN), one of the deep learning methodologies. An RNN learns from time-series data and is known to perform well. Prior to the experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ, and KONEX markets from 2010 to 2016 for the estimation of the bankruptcy prediction model and the comparison of forecasting performance.
To avoid predicting bankruptcy from financial information that already reflects the deterioration of a company's financial condition, the financial information was collected with a lag of two years, and the default period was defined as January to December of the year. We then defined bankruptcy as delisting due to sluggish earnings, and confirmed delistings on KIND, a corporate stock information website. Next, we selected variables from previous papers. The first set consists of Z-score variables, which have become traditional variables in bankruptcy prediction. The second set is a dynamic variable set. Finally, we selected 240 normal companies and 226 bankrupt companies for the first variable set, and 229 normal companies and 226 bankrupt companies for the second variable set. We created a model that reflects dynamic changes in time-series financial data, and by comparing it with existing bankruptcy prediction models, we found that the suggested model could help to improve the accuracy of bankruptcy predictions. We used financial data from KIS Value (a financial database) and selected Multivariate Discriminant Analysis (MDA), the Generalized Linear Model in the form of logistic regression (GLM), Support Vector Machine (SVM), and Artificial Neural Network (ANN) models as benchmarks. The experiment showed that the RNN performed better than the comparison models. The accuracy of the RNN was high for both sets of variables, and the Area Under the Curve (AUC) value was also high. In the hit-ratio table, the rate at which the RNN predicted a financially poor company to be bankrupt was also higher than that of the comparison models. However, a limitation of this paper is that overfitting occurs during RNN training.
We nevertheless expect that the overfitting problem can be solved by selecting more training data and appropriate variables. Based on these results, this research is expected to contribute to the development of bankruptcy prediction by proposing a new dynamic model.

Improved Estimation of Hourly Surface Ozone Concentrations using Stacking Ensemble-based Spatial Interpolation (스태킹 앙상블 모델을 이용한 시간별 지상 오존 공간내삽 정확도 향상)

  • KIM, Ye-Jin;KANG, Eun-Jin;CHO, Dong-Jin;LEE, Si-Woo;IM, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.3 / pp.74-99 / 2022
  • Surface ozone is produced by photochemical reactions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) emitted from vehicles and industrial sites, and it adversely affects vegetation and the human body. In South Korea, ozone is monitored in real time at stations (i.e., point measurements), but it is difficult to monitor and analyze its continuous spatial distribution. In this study, surface ozone concentrations were interpolated hourly at a spatial resolution of 1.5 km using the stacking ensemble technique, followed by a 5-fold cross-validation. The base models for the stacking ensemble were cokriging, multi-linear regression (MLR), random forest (RF), and support vector regression (SVR), while MLR was used as the meta model, taking all base model results as additional input variables. The stacking ensemble model performed better than the individual base models, with an averaged R of 0.76 and RMSE of 0.0065 ppm during the 2020 study period. Compared with the distributions from the base models, the surface ozone concentration distribution generated by the stacking ensemble model had a wider range and a spatial pattern similar to the terrain and urbanization variables. The proposed model is expected not only to produce the hourly spatial distribution of ozone but also to be highly applicable to calculating daily maximum 8-hour ozone concentrations.
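The two metrics reported in this abstract, R and RMSE, can be computed as in the short Python sketch below. It assumes that R denotes the Pearson correlation coefficient between observed and estimated concentrations, which the abstract does not state explicitly. The function names and the sample values (in ppm) are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, estimated):
    """Pearson correlation between two equal-length sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated))
    return cov / (spread_o * spread_e)


def rmse(observed, estimated):
    """Root-mean-square error between observations and estimates."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up hourly ozone values in ppm, for illustration only.
obs = [0.030, 0.042, 0.025, 0.051]
est = [0.033, 0.040, 0.029, 0.047]
print(pearson_r(obs, est), rmse(obs, est))
```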

Reliability of mortar filling layer void length in in-service ballastless track-bridge system of HSR

  • Binbin He;Sheng Wen;Yulin Feng;Lizhong Jiang;Wangbao Zhou
    • Steel and Composite Structures / v.47 no.1 / pp.91-102 / 2023
  • To study the evaluation standard and control limit of the mortar filling layer void length, the train sub-model was developed in MATLAB and the track-bridge sub-model, which accounts for the mortar filling layer void, was established in ANSYS. The two sub-models were assembled into a train-track-bridge coupling dynamic model through the wheel-rail contact relationship, and its validity was corroborated by comparing the coupling dynamic model with a model from the literature. Considering the randomness of fastening stiffness, mortar elastic modulus, length of the mortar filling layer void, and pier settlement, the test points were designed by the Box-Behnken method in Design-Expert software. The coupled dynamic model was evaluated at these points, and a support vector regression (SVR) nonlinear mapping model of the wheel-rail system was established, followed by learning, prediction, and verification. Finally, the reliability probability of the amplification coefficient distribution of the train and structure response indices in different ranges was obtained from the SVR nonlinear mapping model and the Latin hypercube sampling method, which yielded the limit of the mortar filling layer void length. The results show that the SVR nonlinear mapping model developed in this paper has a high fitting accuracy of 0.993 and improves computational efficiency by 99.86%. It can be used to calculate the dynamic response of the wheel-rail system. The length of the mortar filling layer void significantly affects the wheel-rail vertical force, wheel weight load reduction ratio, rail vertical displacement, and track plate vertical displacement. The dynamic response of the track structure has a more significant effect on the limit value of the void length than the dynamic response of the vehicle, and the effect of the rail vertical displacement is the most pronounced.
At train running speeds of 250 km/h to 350 km/h, the limit values of grades I, II, and III for the length of the mortar filling layer void are 3.932 m, 4.337 m, and 4.766 m, respectively. The results can provide a reference for the long-term service performance reliability of the ballastless track-bridge system of HSR.
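Latin hypercube sampling, which this abstract pairs with the SVR surrogate to estimate reliability, can be sketched in a few lines of Python. This is a generic illustration rather than the authors' procedure: the function name `latin_hypercube`, the uniform marginals, and the parameter bounds are assumptions, and the surrogate model itself is not shown.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points with one point per equal-width stratum per dimension.

    bounds is a list of (low, high) pairs, one per random variable.
    Uniform marginals are an assumption of this sketch.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly placed point inside each of the n_samples strata.
        points = [
            low + (high - low) * (i + rng.random()) / n_samples
            for i in range(n_samples)
        ]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical bounds for two normalized input variables.
samples = latin_hypercube(10, [(0.0, 1.0), (0.0, 5.0)])
for point in samples:
    print(point)
```

Each sampled point would then be passed to the fitted surrogate in place of the full coupled dynamic model, which is where the reported efficiency gain would come from.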

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-known research issue for those seeking better performance in predicting customers' responses to marketing promotions. A response model can reduce marketing cost by identifying prospective customers in a very large customer database and predicting the purchasing intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models with data mining techniques such as case-based reasoning (CBR), neural networks, and support vector machines. CBR is one of the major tools in business because it is known to be simple and robust when applied to response models. It remains an attractive technique for business data mining applications even though it has not shown high performance compared with other machine learning techniques. Thus, many studies have tried to improve CBR and apply it to business data mining with enhanced algorithms or the support of other techniques such as genetic algorithms, decision trees, and the Analytic Hierarchy Process (AHP). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by the marketing department, and optimized the value of k for k-nearest neighbor with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that an approach integrating CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customers' responses to marketing promotions better than each individual data mining model such as logit, neural networks, and SVM. This paper presents an approach to predicting customers' responses to marketing promotions with case-based reasoning. The proposed model was developed by applying a different weight to each feature.
We fitted a logit model to a database containing the promotion and purchasing data for bath soap, and then used the resulting coefficients as the feature weights for CBR. We empirically compared the proposed weighted CBR model with neural networks and a pure CBR model and found that the proposed weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building data mining models that classify real data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared with the number of instances in the other classes. A classification model such as a response model has great difficulty learning patterns from such data because it tends to ignore the minority class while classifying the majority class correctly. Sampling is one of the most representative approaches to resolving the problem caused by an imbalanced data distribution, and it can be categorized into under-sampling and over-sampling. However, CBR is not sensitive to data distribution because, unlike machine learning algorithms, it does not learn from data. In this study, we investigated the robustness of our proposed model while changing the ratio of response customers to nonresponse customers for the promotion program, because in the real world the customers who respond to a promotion are always far fewer than those who do not. We simulated the proposed model 100 times to validate its robustness with different ratios of response customers to nonresponse customers under the imbalanced data distribution. Finally, we found that our proposed CBR-based model outperformed the compared models on the imbalanced data sets.
Our study is expected to improve the performance of response models for promotion programs with CBR under imbalanced data distributions in the real world.
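The idea of weighting CBR features by logit coefficients can be sketched as a weighted nearest-neighbour lookup. The Python below is an illustration under stated assumptions, not the authors' model: it uses the absolute value of each coefficient as the weight, a weighted Euclidean distance, and a majority vote over the k nearest cases. All names, coefficients, and cases are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def predict_response(case_base, query, weights, k=3):
    """Retrieve the k most similar past cases and take a majority vote.

    case_base is a list of (features, label) pairs with label 1 = responded.
    """
    ranked = sorted(
        case_base,
        key=lambda case: weighted_distance(case[0], query, weights),
    )
    votes = [label for _, label in ranked[:k]]
    return 1 if sum(votes) * 2 > k else 0


# Hypothetical logit coefficients for two normalized features.
coefficients = [1.8, -0.2]
# Assumption: the magnitude of a coefficient reflects feature importance.
weights = [abs(c) for c in coefficients]

# Hypothetical past customers: (features, responded?).
case_base = [
    ((0.90, 0.1), 1),
    ((0.80, 0.9), 1),
    ((0.10, 0.2), 0),
    ((0.20, 0.8), 0),
    ((0.15, 0.5), 0),
]

print(predict_response(case_base, (0.85, 0.5), weights))
```

Because the first feature carries most of the weight, cases that match on it rank as most similar even when the second feature differs substantially.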

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

  • Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.455-463 / 2010
  • In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents contain many word-spacing errors and frequently lack a punctuation mark indicating the end of a sentence. As sentence boundary candidates, the proposed method considers every Ending Eomi as well as punctuation marks. We optimize engine performance by selecting the best features, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents. We use the F-measure to compare results on a large variety of tasks. Detecting only periods as sentence boundaries, our baseline engine scored 96.5% on Set1 and 56.7% on Set2. We improved the baseline engine by adapting the features and the boundary search algorithm. For the final evaluation, we compared the adapted engine with the baseline engine on Set2. The adapted engine improved on the baseline engine by 39.6%. These results demonstrate the effectiveness of the proposed method in sentence boundary detection.
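The F-measure used for evaluation in this abstract can be computed from predicted and reference boundary positions as in the Python sketch below. It assumes the balanced F1 form (equal weight on precision and recall), which the abstract does not specify. The function name and the sample character offsets are hypothetical.

```python
def f_measure(predicted, gold):
    """Balanced F-measure (F1) over sets of sentence-boundary positions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical character offsets of detected and reference boundaries.
detected = [10, 25, 40]
reference = [10, 40, 55, 70]
print(f_measure(detected, reference))
```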

An Accurate Cryptocurrency Price Forecasting using Reverse Walk-Forward Validation (역순 워크 포워드 검증을 이용한 암호화폐 가격 예측)

  • Ahn, Hyun;Jang, Baekcheol
    • Journal of Internet Computing and Services / v.23 no.4 / pp.45-55 / 2022
  • The size of the cryptocurrency market is growing. For example, the market capitalization of bitcoin has exceeded 500 trillion won. Accordingly, many studies have been conducted to predict cryptocurrency prices, and most of them use methodologies similar to those for predicting stock prices. However, cryptocurrency differs from stocks: machine learning has become the best-performing model in cryptocurrency price prediction; conceptually, cryptocurrency yields no passive income from ownership; and statistically, cryptocurrency has at least three times higher liquidity than stocks. That is why we argue that a methodology different from stock price prediction should be applied to cryptocurrency price prediction studies. We propose Reverse Walk-forward Validation (RWFV), which modifies Walk-forward Validation (WFV). Unlike WFV, RWFV measures Validation accuracy by pinning the Validation dataset immediately before the Test dataset in the time series and gradually increasing the size of the Training dataset that precedes it. The Training data were then cut to the Training dataset size that gave the highest Validation accuracy and combined with the Validation data to measure accuracy on the Test data. Logistic regression and Support Vector Machine (SVM) were used as the analysis models, and various algorithms and parameters such as L1, L2, rbf, and poly were applied to check the reliability of the proposed RWFV. As a result, all analysis models showed improved accuracy compared with existing studies, and accuracy increased by 1.23 percentage points on average. This is a significant improvement, given that the accuracy of cryptocurrency price prediction in most previous studies remains between 50% and 60%.
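The split scheme described for RWFV can be sketched as an index generator. The Python below reflects one reading of the abstract, namely that the Validation window sits immediately before the Test window and the Training window grows backwards in time from the start of the Validation window. The function name, the `step` parameter, and the sizes in the example are assumptions made for illustration.

```python
def reverse_walk_forward_splits(n_obs, val_size, test_size, step):
    """Yield (train, val, test) index ranges over a time-ordered series.

    The test and validation windows stay fixed at the end of the series;
    the training window always ends where validation begins and grows
    backwards by `step` observations per split.
    """
    test_start = n_obs - test_size
    val_start = test_start - val_size
    val_idx = range(val_start, test_start)
    test_idx = range(test_start, n_obs)
    train_size = step
    while train_size <= val_start:
        yield range(val_start - train_size, val_start), val_idx, test_idx
        train_size += step


# Hypothetical series of 10 time-ordered observations.
splits = list(reverse_walk_forward_splits(10, val_size=2, test_size=2, step=2))
for train_idx, val_idx, test_idx in splits:
    print(list(train_idx), list(val_idx), list(test_idx))
```

After scoring each split on the validation window, the training size with the best validation accuracy would be kept, and the model refitted on that training window plus the validation window before scoring on the test window.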

QSPR analysis for predicting heat of sublimation of organic compounds (유기화합물의 승화열 예측을 위한 QSPR분석)

  • Park, Yu Sun;Lee, Jong Hyuk;Park, Han Woong;Lee, Sung Kwang
    • Analytical Science and Technology / v.28 no.3 / pp.187-195 / 2015
  • The heat of sublimation (HOS) is an essential parameter used to resolve environmental problems in the transfer of organic contaminants to the atmosphere and to assess the risk of toxic chemicals. The experimental measurement of the heat of sublimation is time-consuming, expensive, and complicated. In this study, quantitative structure-property relationships (QSPR) were used to develop a simple and predictive model for the heat of sublimation of organic compounds. The population-based forward selection method was applied to select an informative subset of descriptors for learning algorithms such as multiple linear regression (MLR) and the support vector machine (SVM) method. Each individual model and the consensus model were evaluated by internal validation using the bootstrap method and y-randomization. Prediction performance on the external test set was improved by considering the applicability domain. Based on the results of the MLR model, we showed that the heat of sublimation is related to dispersion, H-bonding, electrostatic forces, and dipole-dipole interactions between molecules.