Search | Korea Science

A Study on the Performance Evaluation of Machine Learning for Predicting the Number of Movie Audiences (영화 관객 수 예측을 위한 기계학습 기법의 성능 평가 연구)

Jeong, Chan-Mi;Min, Daiki
- The Journal of Society for e-Business Studies / v.25 no.2 / pp.49-63 / 2020
Accurate early-stage prediction of box office performance is crucial for the film industry to make better managerial decisions. To improve prediction performance, this paper evaluates the use of machine learning methods. We tested both classification-based and regression-based methods, including k-NN, SVM, and Random Forest. We first evaluate the input variables; the results show that reputation-related information generated during the first two weeks after release is significant. Prediction tests show that regression-based methods yield lower prediction error, and that Random Forest in particular outperforms the other machine learning methods. Regression-based methods predict better when films have small box office earnings, whereas classification-based methods work better for predicting large box office earnings.
https://doi.org/10.7838/jsebs.2020.25.2.049

A Study on Statistical Forecasting Models of PM10 in Pohang Region by the Variable Transformation (변수변환을 통한 포항지역 미세먼지의 통계적 예보모형에 관한 연구)

Lee, Yung-Seop;Kim, Hyun-Goo;Park, Jong-Seok;Kim, Hee-Kyung
- Journal of Korean Society for Atmospheric Environment / v.22 no.5 / pp.614-626 / 2006
Using data from three environmental monitoring sites in the Pohang area (KME112, KME113, and KME114), statistical forecasting models of the daily maximum and mean values of PM10 were developed. Because the distributions of the daily maximum and mean PM10 values are skewed, resembling the Weibull distribution, these values were log-transformed so that they approximate the normal distribution and prediction accuracy increases. Three statistical forecasting models, namely regression, neural networks (NN), and support vector regression (SVR), were built using the log-transformed response variables, i.e., log(max(PM10)) or log(mean(PM10)). The forecasting models were validated with RMSE, CORR, and IOA for model comparison and accuracy. For the daily maximum PM10 prediction, the log transformation improved IOA by 12.7% for regression, 22.5% for NN, and, most notably, 42.7% for SVR. For the daily mean PM10 prediction, IOA improved by 5.1% for regression, 6.5% for NN, and 6.3% for SVR. In conclusion, SVR was found to perform better than the other methods in terms of model accuracy and fit.
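
The log-transform workflow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it fits a single-predictor least-squares line to log(PM10), back-transforms the predictions, and scores them with RMSE, CORR, and IOA. The abstract does not define IOA, so the sketch assumes it refers to Willmott's index of agreement; the function names and the single predictor `x` are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def corr(obs, pred):
    """Pearson correlation coefficient (CORR)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def ioa(obs, pred):
    """Index of agreement, assumed here to be Willmott's d."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1 - num / den

def fit_log_linear(x, y):
    """Least-squares fit of log(y) = a + b * x (y must be positive)."""
    log_y = [math.log(v) for v in y]
    mx, my = sum(x) / len(x), sum(log_y) / len(log_y)
    b = sum((xi - mx) * (li - my) for xi, li in zip(x, log_y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b

def predict(a, b, x):
    """Back-transform predictions from the log scale to the PM10 scale."""
    return [math.exp(a + b * xi) for xi in x]
```

Whether the paper scores its models on the log scale or on the back-transformed scale is not stated in the abstract; the sketch scores on the original scale as one plausible choice.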

Development of Data Mining Algorithm for Implementation of Fine Dust Numerical Prediction Model (미세먼지 수치 예측 모델 구현을 위한 데이터마이닝 알고리즘 개발)

Cha, Jinwook;Kim, Jangyoung
- Journal of the Korea Institute of Information and Communication Engineering / v.22 no.4 / pp.595-601 / 2018
Recently, as fine dust levels have risen rapidly, public interest in the issue has grown. Exposure to fine dust is associated with the development of respiratory and cardiovascular diseases and has been reported to increase the death rate. In addition, damage caused by fine dust continues at industrial sites. However, exposure to fine dust is inevitable in modern life. Therefore, predicting and minimizing exposure to fine dust is the most efficient way to reduce health and industrial damage. Existing fine dust prediction models report a grade (good, normal, poor, or very bad) based on the concentration range of the fine dust rather than the concentration value itself. In this paper, we study and implement prediction of the PM10 level by applying an artificial neural network algorithm and the k-nearest neighbor algorithm, both machine learning algorithms, to actual weather and air quality data.
https://doi.org/10.6109/jkiice.2018.22.4.595

Text Categorization Using TextRank Algorithm (TextRank 알고리즘을 이용한 문서 범주화)

Bae, Won-Sik;Cha, Jeong-Won
- Journal of KIISE:Computing Practices and Letters / v.16 no.1 / pp.110-114 / 2010
We describe a new method for text categorization using the TextRank algorithm. Text categorization is the problem of assigning one or more predefined categories to a text document. TextRank is a graph-based ranking algorithm. If we treat each word as a vertex and the co-occurrence of two adjacent words as an edge, we can build a graph from a document. We then find important words in the graph using TextRank and construct features that are pairs of words, each consisting of an important word and a word adjacent to it. We use the following classifiers: SVM, Naïve Bayesian classifier, Maximum Entropy Model, and k-NN classifier. We use the non-cross-posted version of the 20 Newsgroups data set. As a result, performance improved for all classifiers, which suggests the potential of the TextRank algorithm for text categorization.
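
The graph construction and feature extraction outlined in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' code: the graph is unweighted and undirected, the ranking uses the standard PageRank-style update with a damping factor of 0.85 and a fixed number of iterations, and the names `build_graph`, `textrank`, `pair_features`, and `top_k` are hypothetical.

```python
from collections import defaultdict

def build_graph(tokens):
    """Words are vertices; two adjacent words in the text share an edge."""
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return graph

def textrank(graph, damping=0.85, iterations=50):
    """PageRank-style scores on the undirected word graph."""
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[word])
            for word in graph
        }
    return scores

def pair_features(tokens, top_k=2):
    """Pairs of (important word, adjacent word) for the top_k ranked words."""
    graph = build_graph(tokens)
    scores = textrank(graph)
    # Break score ties alphabetically so the output is deterministic.
    important = sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
    return sorted((w, n) for w in important for n in graph[w])
```

For example, `pair_features("the cat sat on the mat".split(), top_k=1)` pairs the best-connected word, "the", with each of its neighbors. The resulting pairs would then be fed to a classifier as features; how the paper weights or encodes them is not stated in the abstract.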

Bias corrected imputation method for non-ignorable non-response (무시할 수 없는 무응답에서 편향 보정을 이용한 무응답 대체)

Lee, Min-Ha;Shin, Key-Il
- The Korean Journal of Applied Statistics / v.35 no.4 / pp.485-499 / 2022
Controlling the total survey error, which includes sampling error and non-sampling error, is very important in sampling design. Non-sampling error caused by non-response accounts for a large proportion of the total survey error, and many studies have been conducted on handling non-response properly. Recently, many non-response imputation methods using machine learning techniques and traditional statistical methods have been studied and used in practice. Most imputation methods assume MCAR (missing completely at random) or MAR (missing at random), and few studies have focused on MNAR (missing not at random) or NN (non-ignorable non-response), which causes bias and reduces the accuracy of imputation. In this study, we propose a non-response imputation method that can be applied to non-ignorable non-response. That is, we propose an imputation method that improves the accuracy of estimation by removing the bias caused by NN. In addition, the superiority of the proposed method is confirmed through small simulation studies.
https://doi.org/10.5351/KJAS.2022.35.4.485

Predicting sorptivity and freeze-thaw resistance of self-compacting mortar by using deep learning and k-nearest neighbor

Turk, Kazim;Kina, Ceren;Tanyildizi, Harun
- Computers and Concrete / v.30 no.2 / pp.99-111 / 2022
In this study, deep learning and k-Nearest Neighbor (k-NN) models were used to estimate the sorptivity and freeze-thaw resistance of self-compacting mortars (SCMs) having binary and ternary blends of mineral admixtures. Twenty-five environment-friendly SCMs were designed as binary and ternary blends of fly ash (FA) and silica fume (SF), except for the control mixture with only Portland cement (PC). The capillary water absorption and freeze-thaw resistance tests were conducted for 91 days. It was found that the use of SF with FA as ternary blends reduced sorptivity coefficient values compared to the use of FA as binary blends, while the presence of FA with SF improved the freeze-thaw resistance of SCMs with ternary blends. The input variables used in the models for the estimation of the durability factor were defined as PC content, SF content, FA content, sand content, HRWRA, water/cementitious materials (W/C), and freeze-thaw cycles. The input variables used in the models for the estimation of sorptivity were selected as PC content, SF content, FA content, sand content, HRWRA, W/C, and predefined intervals of the sample in water. The deep learning and k-NN models estimated the durability factor of SCM with 94.43% and 92.55% accuracy, and the sorptivity of SCM with 97.87% and 86.14% accuracy, respectively. This study found that the deep learning model estimated the sorptivity and durability factor of SCMs having binary and ternary blends of mineral admixtures with higher accuracy than the k-NN model.
https://doi.org/10.12989/cac.2022.30.2.099

Development of a Resignation Prediction Model using HR Data (HR 데이터 기반의 퇴사 예측 모델 개발)

PARK, YUNJUNG;Lee, Do-Gil
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.100-103 / 2021
Most companies study why employees resign in order to prevent the outflow of excellent human resources. To obtain the data needed for such studies, employees are interviewed or surveyed before resignation. However, it is difficult to get accurate results because employees are reluctant to express, in a survey, opinions that may put them at a disadvantage at work. Meanwhile, according to data released by the Korea Labor Institute, the greater the difference between the minimum level of education required by a company and the employee's academic background, the greater the tendency to resign. Based on these data, this study aims to predict whether employees will leave the company from data such as major, education level, and company type. We built four resignation prediction models using Decision Tree, XGBoost, k-NN, and SVM, and compared their performance. As a result, we could identify various factors that were not covered in previous studies. The resignation prediction model is expected to help companies recognize in advance employees who intend to leave.

A vibration-based approach for detecting arch dam damage using RBF neural networks and Jaya algorithms

Ali Zar;Zahoor Hussain;Muhammad Akbar;Bassam A. Tayeh;Zhibin Lin
- Smart Structures and Systems / v.32 no.5 / pp.319-338 / 2023
The study presents a new hybrid data-driven method that combines radial basis function neural networks (RBF-NN) with the Jaya algorithm (JA) to provide effective structural health monitoring of arch dams. The novelty of this approach is that it requires only one user-defined parameter, which can increase its effectiveness and efficiency compared to other machine learning techniques that often require processing a large number of training and testing model parameters and hyper-parameters, which is highly time-consuming. This approach seeks rapid damage detection in arch dams under dynamic conditions, to prevent potential disasters, by using the RBF-NN to seamlessly integrate the dynamic elastic modulus (DEM) and modal parameters (such as natural frequency and mode shape) as damage indicators. To determine the dynamic characteristics of the arch dam, the JA sequentially optimizes an objective function rooted in vibration-based data sets. Two case studies of hyperbolic concrete arch dams were carefully designed using finite element simulation to demonstrate the effectiveness of the RBF-NN model in conjunction with the Jaya algorithm. The testing results demonstrated that the proposed method could achieve significant computational time savings while effectively detecting damage in arch dam structures with complex nonlinearities. Furthermore, even when the training data were contaminated with a high level of noise, the RBF-NN and JA fusion remained robust, with high accuracy.
https://doi.org/10.12989/sss.2023.32.5.319

Traffic Route Choice by Means of Fuzzy Identification (퍼지 동정에 의한 교통경로선택)

오성권;남궁문;안태천
- Journal of the Korean Institute of Intelligent Systems / v.6 no.2 / pp.81-89 / 1996
A design method of fuzzy modeling is presented for the model identification of route choice in traffic problems. The proposed fuzzy modeling implements system structure and parameter identification in the efficient form of "IF..., THEN..." rules, using optimization theory and linguistic fuzzy implication rules. Three methods of fuzzy modeling are presented in this paper: simplified inference (type 1), linear inference (type 2), and the proposed modified-linear inference (type 3). The fuzzy inference methods are used to develop the route choice model in terms of accurate estimation and precise description of human travel behavior. To identify the premise structure and parameters of the fuzzy implication rules, the improved complex method is used, and the least squares method is used to identify the optimum consequence parameters. Route choice data from traffic problems are used to evaluate the performance of the proposed fuzzy modeling. The results show that the proposed method can produce a fuzzy model with higher accuracy than previous studies: the BL (binary logic) model, B (production system) model, FL (fuzzy logic) model, NN (neural network) model, and FNNs (fuzzy-neural networks) model.

Comparison Studies of Hybrid and Non-hybrid Forecasting Models for Seasonal and Trend Time Series Data (트렌드와 계절성을 가진 시계열에 대한 순수 모형과 하이브리드 모형의 비교 연구)

Jeong, Chulwoo;Kim, Myung Suk
- Journal of Intelligence and Information Systems / v.19 no.1 / pp.1-17 / 2013
In this article, several types of hybrid forecasting models are suggested. In particular, hybrid models using the generalized additive model (GAM) are newly suggested as an alternative to those using neural networks (NN). The prediction performance of various hybrid and non-hybrid models is evaluated using simulated time series data. Five different types of seasonal time series data with an additive or multiplicative trend are generated at different noise levels and used in the forecasting evaluation. For the simulated data with only seasonality, the autoregressive (AR) model and the hybrid AR-AR model performed equally well. On the other hand, when the time series data contained a trend, the SARIMA model and some hybrid SARIMA models performed equally well and outperformed the others. In the comparison of GAMs and NNs on the seasonal additive trend data, the SARIMA-GAM performed consistently well across the full range of noise levels, whereas the SARIMA-NN performed well only when the noise level was trivial.
https://doi.org/10.13088/jiis.2013.19.1.001

