KSII Transactions on Internet and Information Systems (TIIS)
/
v.17
no.2
/
pp.520-541
/
2023
In CRNs, SS is of utmost significance. Every CR user generates a sensing report during the training phase beneath various circumstances, and depending on a collective process, either communicates or remains silent. In the training stage, the fusion centre combines the local judgments made by CR users by a majority vote, and then returns a final conclusion to every CR user. Enough data regarding the environment, including the activity of PU and every CR's response to that activity, is acquired and sensing classes are created during the training stage. Every CR user compares their most recent sensing report to the previous sensing classes during the classification stage, and distance vectors are generated. The posterior probability of every sensing class is derived on the basis of quantitative data, and the sensing report is then classified as either signifying the presence or absence of PU. The ISVM technique is utilized to compute the quantitative variables necessary to compute the posterior probability. Here, the iterations of SVM are tuned by novel GO-PSA by combining GOA and PSO. Novel GO-PSA is developed since it overcomes the problem of computational complexity, returns minimum error, and also saves time when compared with various state-of-the-art algorithms. The dependability of every CR user is taken into consideration as these local choices are then integrated at the fusion centre utilizing an innovative decision combination technique. Depending on the collective choice, the CR users will then communicate or remain silent.
In this paper, TI was measured to classify the presence or absence of mild left ventricular diastolic dysfunction. Of the total 306 data, 206 were used as training data and 100 were used as test data, and the machine learning models used for classification used SVM and KNN. As a result, it was confirmed that SVM showed relatively higher accuracy than KNN and was more useful in diagnosing the presence of left ventricular diastolic dysfunction. In future research, it is expected that classification performance can be further improved by adding various indicators that evaluate not only TI but also cardiac function and securing more data. Furthermore, it is expected to be used as basic data to predict and classify other diseases and solve the problem of insufficient medical manpower compared to the increasing number of tests.
Recently, we are trying to predict the emotion from various information and to convey the emotion information that the supervisor wants to inform the audience. In addition, audiences intend to understand the flow of emotions through various information of non-dialogue parts, such as cinematography, scene background, background sound and so on. In this paper, we propose to extract emotions by mixing not only the context of scripts but also the cinematography information such as color, background sound, composition, arrangement and so on. In other words, we propose an emotional prediction system that learns and distinguishes various emotional expression techniques into dialogue and non-dialogue regions, contributes to the completeness of the movie, and quickly applies them to new changes. The precision of the proposed system is improved by about 5.1% and 0.4%, and the recall is improved by about 4.3% and 1.6%, respectively, when compared with the modified n-gram and morphological analysis.
Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Through the economic crises, bankruptcy have increased and bankruptcy prediction models have become more and more important. Therefore, corporate bankruptcy has been regarded as one of the major topics of research in business management. Also, many studies in the industry are in progress and important. Previous studies attempted to utilize various methodologies to improve the bankruptcy prediction accuracy and to resolve the overfitting problem, such as Multivariate Discriminant Analysis (MDA), Generalized Linear Model (GLM). These methods are based on statistics. Recently, researchers have used machine learning methodologies such as Support Vector Machine (SVM), Artificial Neural Network (ANN). Furthermore, fuzzy theory and genetic algorithms were used. Because of this change, many of bankruptcy models are developed. Also, performance has been improved. In general, the company's financial and accounting information will change over time. Likewise, the market situation also changes, so there are many difficulties in predicting bankruptcy only with information at a certain point in time. However, even though traditional research has problems that don't take into account the time effect, dynamic model has not been studied much. When we ignore the time effect, we get the biased results. So the static model may not be suitable for predicting bankruptcy. Thus, using the dynamic model, there is a possibility that bankruptcy prediction model is improved. In this paper, we propose RNN (Recurrent Neural Network) which is one of the deep learning methodologies. The RNN learns time series data and the performance is known to be good. Prior to experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ and KONEX markets from 2010 to 2016 for the estimation of the bankruptcy prediction model and the comparison of forecasting performance. In order to prevent a mistake of predicting bankruptcy by using the financial information already reflected in the deterioration of the financial condition of the company, the financial information was collected with a lag of two years, and the default period was defined from January to December of the year. Then we defined the bankruptcy. The bankruptcy we defined is the abolition of the listing due to sluggish earnings. We confirmed abolition of the list at KIND that is corporate stock information website. Then we selected variables at previous papers. The first set of variables are Z-score variables. These variables have become traditional variables in predicting bankruptcy. The second set of variables are dynamic variable set. Finally we selected 240 normal companies and 226 bankrupt companies at the first variable set. Likewise, we selected 229 normal companies and 226 bankrupt companies at the second variable set. We created a model that reflects dynamic changes in time-series financial data and by comparing the suggested model with the analysis of existing bankruptcy predictive models, we found that the suggested model could help to improve the accuracy of bankruptcy predictions. We used financial data in KIS Value (Financial database) and selected Multivariate Discriminant Analysis (MDA), Generalized Linear Model called logistic regression (GLM), Support Vector Machine (SVM), Artificial Neural Network (ANN) model as benchmark. The result of the experiment proved that RNN's performance was better than comparative model. The accuracy of RNN was high in both sets of variables and the Area Under the Curve (AUC) value was also high. Also when we saw the hit-ratio table, the ratio of RNNs that predicted a poor company to be bankrupt was higher than that of other comparative models. However the limitation of this paper is that an overfitting problem occurs during RNN learning. But we expect to be able to solve the overfitting problem by selecting more learning data and appropriate variables. From these result, it is expected that this research will contribute to the development of a bankruptcy prediction by proposing a new dynamic model.
KIM, Ye-Jin;KANG, Eun-Jin;CHO, Dong-Jin;LEE, Si-Woo;IM, Jung-Ho
Journal of the Korean Association of Geographic Information Studies
/
v.25
no.3
/
pp.74-99
/
2022
Surface ozone is produced by photochemical reactions of nitrogen oxides(NOx) and volatile organic compounds(VOCs) emitted from vehicles and industrial sites, adversely affecting vegetation and the human body. In South Korea, ozone is monitored in real-time at stations(i.e., point measurements), but it is difficult to monitor and analyze its continuous spatial distribution. In this study, surface ozone concentrations were interpolated to have a spatial resolution of 1.5km every hour using the stacking ensemble technique, followed by a 5-fold cross-validation. Base models for the stacking ensemble were cokriging, multi-linear regression(MLR), random forest(RF), and support vector regression(SVR), while MLR was used as the meta model, having all base model results as additional input variables. The results showed that the stacking ensemble model yielded the better performance than the individual base models, resulting in an averaged R of 0.76 and RMSE of 0.0065ppm during the study period of 2020. The surface ozone concentration distribution generated by the stacking ensemble model had a wider range with a spatial pattern similar with terrain and urbanization variables, compared to those by the base models. Not only should the proposed model be capable of producing the hourly spatial distribution of ozone, but it should also be highly applicable for calculating the daily maximum 8-hour ozone concentrations.
To study the evaluation standard and control limit of mortar filling layer void length, in this paper, the train sub-model was developed by MATLAB and the track-bridge sub-model considering the mortar filling layer void was established by ANSYS. The two sub-models were assembled into a train-track-bridge coupling dynamic model through the wheel-rail contact relationship, and the validity was corroborated by the coupling dynamic model with the literature model. Considering the randomness of fastening stiffness, mortar elastic modulus, length of mortar filling layer void, and pier settlement, the test points were designed by the Box-Behnken method based on Design-Expert software. The coupled dynamic model was calculated, and the support vector regression (SVR) nonlinear mapping model of the wheel-rail system was established. The learning, prediction, and verification were carried out. Finally, the reliable probability of the amplification coefficient distribution of the response index of the train and structure in different ranges was obtained based on the SVR nonlinear mapping model and Latin hypercube sampling method. The limit of the length of the mortar filling layer void was, thus, obtained. The results show that the SVR nonlinear mapping model developed in this paper has a high fitting accuracy of 0.993, and the computational efficiency is significantly improved by 99.86%. It can be used to calculate the dynamic response of the wheel-rail system. The length of the mortar filling layer void significantly affects the wheel-rail vertical force, wheel weight load reduction ratio, rail vertical displacement, and track plate vertical displacement. The dynamic response of the track structure has a more significant effect on the limit value of the length of the mortar filling layer void than the dynamic response of the vehicle, and the rail vertical displacement is the most obvious. At 250 km/h - 350 km/h train running speed, the limit values of grade I, II, and III of the lengths of the mortar filling layer void are 3.932 m, 4.337 m, and 4.766 m, respectively. The results can provide some reference for the long-term service performance reliability of the ballastless track-bridge system of HRS.
Response modeling is a well-known research issue for those who have tried to get more superior performance in the capability of predicting the customers' response for the marketing promotion. The response model for customers would reduce the marketing cost by identifying prospective customers from very large customer database and predicting the purchasing intention of the selected customers while the promotion which is derived from an undifferentiated marketing strategy results in unnecessary cost. In addition, the big data environment has accelerated developing the response model with data mining techniques such as CBR, neural networks and support vector machines. And CBR is one of the most major tools in business because it is known as simple and robust to apply to the response model. However, CBR is an attractive data mining technique for data mining applications in business even though it hasn't shown high performance compared to other machine learning techniques. Thus many studies have tried to improve CBR and utilized in business data mining with the enhanced algorithms or the support of other techniques such as genetic algorithm, decision tree and AHP (Analytic Process Hierarchy). Ahn and Kim(2008) utilized logit, neural networks, CBR to predict that which customers would purchase the items promoted by marketing department and tried to optimized the number of k for k-nearest neighbor with genetic algorithm for the purpose of improving the performance of the integrated model. Hong and Park(2009) noted that the integrated approach with CBR for logit, neural networks, and Support Vector Machine (SVM) showed more improved prediction ability for response of customers to marketing promotion than each data mining models such as logit, neural networks, and SVM. This paper presented an approach to predict customers' response of marketing promotion with Case Based Reasoning. The proposed model was developed by applying different weights to each feature. We deployed logit model with a database including the promotion and the purchasing data of bath soap. After that, the coefficients were used to give different weights of CBR. We analyzed the performance of proposed weighted CBR based model compared to neural networks and pure CBR based model empirically and found that the proposed weighted CBR based model showed more superior performance than pure CBR model. Imbalanced data is a common problem to build data mining model to classify a class with real data such as bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instance in one class is remarkably small or large compared to the number of instance in other classes. The classification model such as response modeling has a lot of trouble to recognize the pattern from data through learning because the model tends to ignore a small number of classes while classifying a large number of classes correctly. To resolve the problem caused from imbalanced data distribution, sampling method is one of the most representative approach. The sampling method could be categorized to under sampling and over sampling. However, CBR is not sensitive to data distribution because it doesn't learn from data unlike machine learning algorithm. In this study, we investigated the robustness of our proposed model while changing the ratio of response customers and nonresponse customers to the promotion program because the response customers for the suggested promotion is always a small part of nonresponse customers in the real world. We simulated the proposed model 100 times to validate the robustness with different ratio of response customers to response customers under the imbalanced data distribution. Finally, we found that our proposed CBR based model showed superior performance than compared models under the imbalanced data sets. Our study is expected to improve the performance of response model for the promotion program with CBR under imbalanced data distribution in the real world.
In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical-based methods and uses rule-based correction. The proposed system uses the classification model learned offline using a training set of human-labeled web documents. The web documents have many word-spacing errors and frequently no punctuation mark that indicates the end of sentence boundary. As sentence boundary candidates, the proposed method considers every Ending Eomis as well as punctuation marks. We optimize engine performance by selecting the best feature, the best training data, and the best classification algorithm. For evaluation, we made two test sets; Set1 consisting of articles and blog documents and Set2 of web community documents. We use F-measure to compare results on a large variety of tasks, Detecting only periods as sentence boundary, our basis engine showed 96.5% in Set1 and 56.7% in Set2. We improved our basis engine by adapting features and the boundary search algorithm. For the final evaluation, we compared our adaptation engine with our basis engine in Set2. As a result, the adaptation engine obtained improvements over the basis engine by 39.6%. We proved the effectiveness of the proposed method in sentence boundary detection.
The size of the cryptocurrency market is growing. For example, market capitalization of bitcoin exceeded 500 trillion won. Accordingly, many studies have been conducted to predict the price of cryptocurrency, and most of them have similar methodology of predicting stock prices. However, unlike stock price predictions, machine learning become best model in cryptocurrency price predictions, conceptually cryptocurrency has no passive income from ownership, and statistically, cryptocurrency has at least three times higher liquidity than stocks. Thats why we argue that a methodology different from stock price prediction should be applied to cryptocurrency price prediction studies. We propose Reverse Walk-forward Validation (RWFV), which modifies Walk-forward Validation (WFV). Unlike WFV, RWFV measures accuracy for Validation by pinning the Validation dataset directly in front of the Test dataset in time series, and gradually increasing the size of the Training dataset in front of it in time series. Train data were cut according to the size of the Train dataset with the highest accuracy among all measured Validation accuracy, and then combined with Validation data to measure the accuracy of the Test data. Logistic regression analysis and Support Vector Machine (SVM) were used as the analysis model, and various algorithms and parameters such as L1, L2, rbf, and poly were applied for the reliability of our proposed RWFV. As a result, it was confirmed that all analysis models showed improved accuracy compared to existing studies, and on average, the accuracy increased by 1.23%p. This is a significant improvement in accuracy, given that most of the accuracy of cryptocurrency price prediction remains between 50% and 60% through previous studies.
Park, Yu Sun;Lee, Jong Hyuk;Park, Han Woong;Lee, Sung Kwang
Analytical Science and Technology
/
v.28
no.3
/
pp.187-195
/
2015
The heat of sublimation (HOS) is an essential parameter used to resolve environmental problems in the transfer of organic contaminants to the atmosphere and to assess the risk of toxic chemicals. The experimental measurement of the heat of sublimation is time-consuming, expensive, and complicated. In this study, quantitative structural property relationships (QSPR) were used to develop a simple and predictive model for measuring the heat of sublimation of organic compounds. The population-based forward selection method was applied to select an informative subset of descriptors of learning algorithms, such as by using multiple linear regression (MLR) and the support vector machine (SVM) method. Each individual model and consensus model was evaluated by internal validation using the bootstrap method and y-randomization. The predictions of the performance of the external test set were improved by considering their applicability to the domain. Based on the results of the MLR model, we showed that the heat of sublimation was related to dispersion, H-bond, electrostatic forces, and the dipole-dipole interaction between inter-molecules.
본 웹사이트에 게시된 이메일 주소가 전자우편 수집 프로그램이나
그 밖의 기술적 장치를 이용하여 무단으로 수집되는 것을 거부하며,
이를 위반시 정보통신망법에 의해 형사 처벌됨을 유념하시기 바랍니다.
[게시일 2004년 10월 1일]
이용약관
제 1 장 총칙
제 1 조 (목적)
이 이용약관은 KoreaScience 홈페이지(이하 “당 사이트”)에서 제공하는 인터넷 서비스(이하 '서비스')의 가입조건 및 이용에 관한 제반 사항과 기타 필요한 사항을 구체적으로 규정함을 목적으로 합니다.
제 2 조 (용어의 정의)
① "이용자"라 함은 당 사이트에 접속하여 이 약관에 따라 당 사이트가 제공하는 서비스를 받는 회원 및 비회원을
말합니다.
② "회원"이라 함은 서비스를 이용하기 위하여 당 사이트에 개인정보를 제공하여 아이디(ID)와 비밀번호를 부여
받은 자를 말합니다.
③ "회원 아이디(ID)"라 함은 회원의 식별 및 서비스 이용을 위하여 자신이 선정한 문자 및 숫자의 조합을
말합니다.
④ "비밀번호(패스워드)"라 함은 회원이 자신의 비밀보호를 위하여 선정한 문자 및 숫자의 조합을 말합니다.
제 3 조 (이용약관의 효력 및 변경)
① 이 약관은 당 사이트에 게시하거나 기타의 방법으로 회원에게 공지함으로써 효력이 발생합니다.
② 당 사이트는 이 약관을 개정할 경우에 적용일자 및 개정사유를 명시하여 현행 약관과 함께 당 사이트의
초기화면에 그 적용일자 7일 이전부터 적용일자 전일까지 공지합니다. 다만, 회원에게 불리하게 약관내용을
변경하는 경우에는 최소한 30일 이상의 사전 유예기간을 두고 공지합니다. 이 경우 당 사이트는 개정 전
내용과 개정 후 내용을 명확하게 비교하여 이용자가 알기 쉽도록 표시합니다.
제 4 조(약관 외 준칙)
① 이 약관은 당 사이트가 제공하는 서비스에 관한 이용안내와 함께 적용됩니다.
② 이 약관에 명시되지 아니한 사항은 관계법령의 규정이 적용됩니다.
제 2 장 이용계약의 체결
제 5 조 (이용계약의 성립 등)
① 이용계약은 이용고객이 당 사이트가 정한 약관에 「동의합니다」를 선택하고, 당 사이트가 정한
온라인신청양식을 작성하여 서비스 이용을 신청한 후, 당 사이트가 이를 승낙함으로써 성립합니다.
② 제1항의 승낙은 당 사이트가 제공하는 과학기술정보검색, 맞춤정보, 서지정보 등 다른 서비스의 이용승낙을
포함합니다.
제 6 조 (회원가입)
서비스를 이용하고자 하는 고객은 당 사이트에서 정한 회원가입양식에 개인정보를 기재하여 가입을 하여야 합니다.
제 7 조 (개인정보의 보호 및 사용)
당 사이트는 관계법령이 정하는 바에 따라 회원 등록정보를 포함한 회원의 개인정보를 보호하기 위해 노력합니다. 회원 개인정보의 보호 및 사용에 대해서는 관련법령 및 당 사이트의 개인정보 보호정책이 적용됩니다.
제 8 조 (이용 신청의 승낙과 제한)
① 당 사이트는 제6조의 규정에 의한 이용신청고객에 대하여 서비스 이용을 승낙합니다.
② 당 사이트는 아래사항에 해당하는 경우에 대해서 승낙하지 아니 합니다.
- 이용계약 신청서의 내용을 허위로 기재한 경우
- 기타 규정한 제반사항을 위반하며 신청하는 경우
제 9 조 (회원 ID 부여 및 변경 등)
① 당 사이트는 이용고객에 대하여 약관에 정하는 바에 따라 자신이 선정한 회원 ID를 부여합니다.
② 회원 ID는 원칙적으로 변경이 불가하며 부득이한 사유로 인하여 변경 하고자 하는 경우에는 해당 ID를
해지하고 재가입해야 합니다.
③ 기타 회원 개인정보 관리 및 변경 등에 관한 사항은 서비스별 안내에 정하는 바에 의합니다.
제 3 장 계약 당사자의 의무
제 10 조 (KISTI의 의무)
① 당 사이트는 이용고객이 희망한 서비스 제공 개시일에 특별한 사정이 없는 한 서비스를 이용할 수 있도록
하여야 합니다.
② 당 사이트는 개인정보 보호를 위해 보안시스템을 구축하며 개인정보 보호정책을 공시하고 준수합니다.
③ 당 사이트는 회원으로부터 제기되는 의견이나 불만이 정당하다고 객관적으로 인정될 경우에는 적절한 절차를
거쳐 즉시 처리하여야 합니다. 다만, 즉시 처리가 곤란한 경우는 회원에게 그 사유와 처리일정을 통보하여야
합니다.
제 11 조 (회원의 의무)
① 이용자는 회원가입 신청 또는 회원정보 변경 시 실명으로 모든 사항을 사실에 근거하여 작성하여야 하며,
허위 또는 타인의 정보를 등록할 경우 일체의 권리를 주장할 수 없습니다.
② 당 사이트가 관계법령 및 개인정보 보호정책에 의거하여 그 책임을 지는 경우를 제외하고 회원에게 부여된
ID의 비밀번호 관리소홀, 부정사용에 의하여 발생하는 모든 결과에 대한 책임은 회원에게 있습니다.
③ 회원은 당 사이트 및 제 3자의 지적 재산권을 침해해서는 안 됩니다.
제 4 장 서비스의 이용
제 12 조 (서비스 이용 시간)
① 서비스 이용은 당 사이트의 업무상 또는 기술상 특별한 지장이 없는 한 연중무휴, 1일 24시간 운영을
원칙으로 합니다. 단, 당 사이트는 시스템 정기점검, 증설 및 교체를 위해 당 사이트가 정한 날이나 시간에
서비스를 일시 중단할 수 있으며, 예정되어 있는 작업으로 인한 서비스 일시중단은 당 사이트 홈페이지를
통해 사전에 공지합니다.
② 당 사이트는 서비스를 특정범위로 분할하여 각 범위별로 이용가능시간을 별도로 지정할 수 있습니다. 다만
이 경우 그 내용을 공지합니다.
제 13 조 (홈페이지 저작권)
① NDSL에서 제공하는 모든 저작물의 저작권은 원저작자에게 있으며, KISTI는 복제/배포/전송권을 확보하고
있습니다.
② NDSL에서 제공하는 콘텐츠를 상업적 및 기타 영리목적으로 복제/배포/전송할 경우 사전에 KISTI의 허락을
받아야 합니다.
③ NDSL에서 제공하는 콘텐츠를 보도, 비평, 교육, 연구 등을 위하여 정당한 범위 안에서 공정한 관행에
합치되게 인용할 수 있습니다.
④ NDSL에서 제공하는 콘텐츠를 무단 복제, 전송, 배포 기타 저작권법에 위반되는 방법으로 이용할 경우
저작권법 제136조에 따라 5년 이하의 징역 또는 5천만 원 이하의 벌금에 처해질 수 있습니다.
제 14 조 (유료서비스)
① 당 사이트 및 협력기관이 정한 유료서비스(원문복사 등)는 별도로 정해진 바에 따르며, 변경사항은 시행 전에
당 사이트 홈페이지를 통하여 회원에게 공지합니다.
② 유료서비스를 이용하려는 회원은 정해진 요금체계에 따라 요금을 납부해야 합니다.
제 5 장 계약 해지 및 이용 제한
제 15 조 (계약 해지)
회원이 이용계약을 해지하고자 하는 때에는 [가입해지] 메뉴를 이용해 직접 해지해야 합니다.
제 16 조 (서비스 이용제한)
① 당 사이트는 회원이 서비스 이용내용에 있어서 본 약관 제 11조 내용을 위반하거나, 다음 각 호에 해당하는
경우 서비스 이용을 제한할 수 있습니다.
- 2년 이상 서비스를 이용한 적이 없는 경우
- 기타 정상적인 서비스 운영에 방해가 될 경우
② 상기 이용제한 규정에 따라 서비스를 이용하는 회원에게 서비스 이용에 대하여 별도 공지 없이 서비스 이용의
일시정지, 이용계약 해지 할 수 있습니다.
제 17 조 (전자우편주소 수집 금지)
회원은 전자우편주소 추출기 등을 이용하여 전자우편주소를 수집 또는 제3자에게 제공할 수 없습니다.
제 6 장 손해배상 및 기타사항
제 18 조 (손해배상)
당 사이트는 무료로 제공되는 서비스와 관련하여 회원에게 어떠한 손해가 발생하더라도 당 사이트가 고의 또는 과실로 인한 손해발생을 제외하고는 이에 대하여 책임을 부담하지 아니합니다.
제 19 조 (관할 법원)
서비스 이용으로 발생한 분쟁에 대해 소송이 제기되는 경우 민사 소송법상의 관할 법원에 제기합니다.
[부 칙]
1. (시행일) 이 약관은 2016년 9월 5일부터 적용되며, 종전 약관은 본 약관으로 대체되며, 개정된 약관의 적용일 이전 가입자도 개정된 약관의 적용을 받습니다.