• Title/Summary/Keyword: machine learning for regression

Search Result 581, Processing Time 0.026 seconds

Estimating Regression Function with $\varepsilon-Insensitive$ Supervised Learning Algorithm

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.2
    • /
    • pp.477-483
    • /
    • 2004
  • One of the major paradigms for supervised learning in neural network community is back-propagation learning. The standard implementations of back-propagation learning are optimal under the assumptions of identical and independent Gaussian noise. In this paper, for regression function estimation, we introduce $\varepsilon-insensitive$ back-propagation learning algorithm, which corresponds to minimizing the least absolute error. We compare this algorithm with support vector machine(SVM), which is another $\varepsilon-insensitive$ supervised learning algorithm and has been very successful in pattern recognition and function estimation problems. For comparison, we consider a more realistic model would allow the noise variance itself to depend on the input variables.

  • PDF

Utilization of Simulation and Machine Learning to Analyze and Predict Win Rates of the Characters Battle

  • Kang, Hyun-Syug
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.7
    • /
    • pp.39-46
    • /
    • 2020
  • Recently, for designing virtual characters in the battle game field effectively, some methods are very needed to predicate the win rates of the battle of them efficiently. In this paper, we propose a method to solve this problem by combining simulation and machine learning. Firstly, a simulation is used to analyze the win rates of the battle of virtual characters in the battle game. In addition, we apply a regression model based machine learning scheme to predict win rates of the battle of virtual characters according to their abilities. Our experimental results using suggested method show that it is almost no difference between the win rates of the simulation and the prediction results using the machine learning scheme. And also, we can obtain good performance in the experiment using only simple regression based machine learning model.

Prediction of Residual Resistance Coefficient of Low-Speed Full Ships Using Hull Form Variables and Machine Learning Approaches (선형변수 기계학습 기법을 활용한 저속비대선의 잉여저항계수 추정)

  • Kim, Yoo-Chul;Yang, Kyung-Kyu;Kim, Myung-Soo;Lee, Young-Yeon;Kim, Kwang-Soo
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.57 no.6
    • /
    • pp.312-321
    • /
    • 2020
  • In this study, machine learning techniques were applied to predict the residual resistance coefficient (Cr) of low-speed full ships. The used machine learning methods are Ridge regression, support vector regression, random forest, neural network and their ensemble model. 19 hull form variables were used as input variables for machine learning methods. The hull form variables and Cr data obtained from 139 hull forms of KRISO database were used in analysis. 80 % of the total data were used as training models and the rest as validation. Some non-linear models showed the overfitted results and the ensemble model showed better results than others.

A Study on Prediction Techniques through Machine Learning of Real-time Solar Radiation in Jeju (제주 실시간 일사량의 기계학습 예측 기법 연구)

  • Lee, Young-Mi;Bae, Joo-Hyun;Park, Jeong-keun
    • Journal of Environmental Science International
    • /
    • v.26 no.4
    • /
    • pp.521-527
    • /
    • 2017
  • Solar radiation forecasts are important for predicting the amount of ice on road and the potential solar energy. In an attempt to improve solar radiation predictability in Jeju, we conducted machine learning with various data mining techniques such as tree models, conditional inference tree, random forest, support vector machines and logistic regression. To validate machine learning models, the results from the simulation was compared with the solar radiation data observed over Jeju observation site. According to the model assesment, it can be seen that the solar radiation prediction using random forest is the most effective method. The error rate proposed by random forest data mining is 17%.

Comparison of machine learning techniques to predict compressive strength of concrete

  • Dutta, Susom;Samui, Pijush;Kim, Dookie
    • Computers and Concrete
    • /
    • v.21 no.4
    • /
    • pp.463-470
    • /
    • 2018
  • In the present study, soft computing i.e., machine learning techniques and regression models algorithms have earned much importance for the prediction of the various parameters in different fields of science and engineering. This paper depicts that how regression models can be implemented for the prediction of compressive strength of concrete. Three models are taken into consideration for this; they are Gaussian Process for Regression (GPR), Multi Adaptive Regression Spline (MARS) and Minimax Probability Machine Regression (MPMR). Contents of cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate and age in days have been taken as inputs and compressive strength as output for GPR, MARS and MPMR models. A comparatively large set of data including 1030 normalized previously published results which were obtained from experiments were utilized. Here, a comparison is made between the results obtained from all the above mentioned models and the model which provides the best fit is established. The experimental results manifest that proposed models are robust for determination of compressive strength of concrete.

Machine Learning Algorithm for Estimating Ink Usage (머신러닝을 통한 잉크 필요량 예측 알고리즘)

  • Se Wook Kwon;Young Joo Hyun;Hyun Chul Tae
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.1
    • /
    • pp.23-31
    • /
    • 2023
  • Research and interest in sustainable printing are increasing in the packaging printing industry. Currently, predicting the amount of ink required for each work is based on the experience and intuition of field workers. Suppose the amount of ink produced is more than necessary. In this case, the rest of the ink cannot be reused and is discarded, adversely affecting the company's productivity and environment. Nowadays, machine learning models can be used to figure out this problem. This study compares the ink usage prediction machine learning models. A simple linear regression model, Multiple Regression Analysis, cannot reflect the nonlinear relationship between the variables required for packaging printing, so there is a limit to accurately predicting the amount of ink needed. This study has established various prediction models which are based on CART (Classification and Regression Tree), such as Decision Tree, Random Forest, Gradient Boosting Machine, and XGBoost. The accuracy of the models is determined by the K-fold cross-validation. Error metrics such as root mean squared error, mean absolute error, and R-squared are employed to evaluate estimation models' correctness. Among these models, XGBoost model has the highest prediction accuracy and can reduce 2134 (g) of wasted ink for each work. Thus, this study motivates machine learning's potential to help advance productivity and protect the environment.

A Study on the Prediction of Learning Results Using Machine Learning (기계학습을 활용한 대학생 학습결과 예측 연구)

  • Kim, Yeon-Hee;Lim, Soo-Jin
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.6
    • /
    • pp.695-704
    • /
    • 2020
  • Recently, There has been an increasing of utilization IT, and studies have been conducted on predicting learning results. In this study, Learning activity data were collected that could affect learning outcomes by using learning analysis. The survey was conducted at a university in South Chung-Cheong Province from October to December 2018, with 1,062 students taking part in the survey. First, A Hierarchical regression analysis was conducted by organizing a model of individual, academic, and behavioral factors for learning results to ensure the validity of predictors in machine learning. The model of hierarchical regression was significant, and the explanatory power (R2) was shown to increase step by step, so the variables injected were appropriate. In addition, The linear regression analysis method of machine learning was used to determine how predictable learning outcomes are, and its error rate was collected at about 8.4%.

Deep LS-SVM for regression

  • Hwang, Changha;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.3
    • /
    • pp.827-833
    • /
    • 2016
  • In this paper, we propose a deep least squares support vector machine (LS-SVM) for regression problems, which consists of the input layer and the hidden layer. In the hidden layer, LS-SVMs are trained with the original input variables and the perturbed responses. For the final output, the main LS-SVM is trained with the outputs from LS-SVMs of the hidden layer as input variables and the original responses. In contrast to the multilayer neural network (MNN), LS-SVMs in the deep LS-SVM are trained to minimize the penalized objective function. Thus, the learning dynamics of the deep LS-SVM are entirely different from MNN in which all weights and biases are trained to minimize one final error function. When compared to MNN approaches, the deep LS-SVM does not make use of any combination weights, but trains all LS-SVMs in the architecture. Experimental results from real datasets illustrate that the deep LS-SVM significantly outperforms state of the art machine learning methods on regression problems.

Selecting the Optimal Hidden Layer of Extreme Learning Machine Using Multiple Kernel Learning

  • Zhao, Wentao;Li, Pan;Liu, Qiang;Liu, Dan;Liu, Xinwang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.12
    • /
    • pp.5765-5781
    • /
    • 2018
  • Extreme learning machine (ELM) is emerging as a powerful machine learning method in a variety of application scenarios due to its promising advantages of high accuracy, fast learning speed and easy of implementation. However, how to select the optimal hidden layer of ELM is still an open question in the ELM community. Basically, the number of hidden layer nodes is a sensitive hyperparameter that significantly affects the performance of ELM. To address this challenging problem, we propose to adopt multiple kernel learning (MKL) to design a multi-hidden-layer-kernel ELM (MHLK-ELM). Specifically, we first integrate kernel functions with random feature mapping of ELM to design a hidden-layer-kernel ELM (HLK-ELM), which serves as the base of MHLK-ELM. Then, we utilize the MKL method to propose two versions of MHLK-ELMs, called sparse and non-sparse MHLK-ELMs. Both two types of MHLK-ELMs can effectively find out the optimal linear combination of multiple HLK-ELMs for different classification and regression problems. Experimental results on seven data sets, among which three data sets are relevant to classification and four ones are relevant to regression, demonstrate that the proposed MHLK-ELM achieves superior performance compared with conventional ELM and basic HLK-ELM.

Development of a Model to Predict the Number of Visitors to Local Festivals Using Machine Learning (머신러닝을 활용한 지역축제 방문객 수 예측모형 개발)

  • Lee, In-Ji;Yoon, Hyun Shik
    • The Journal of Information Systems
    • /
    • v.29 no.3
    • /
    • pp.35-52
    • /
    • 2020
  • Purpose Local governments in each region actively hold local festivals for the purpose of promoting the region and revitalizing the local economy. Existing studies related to local festivals have been actively conducted in tourism and related academic fields. Empirical studies to understand the effects of latent variables on local festivals and studies to analyze the regional economic impacts of festivals occupy a large proportion. Despite of practical need, since few researches have been conducted to predict the number of visitors, one of the criteria for evaluating the performance of local festivals, this study developed a model for predicting the number of visitors through various observed variables using a machine learning algorithm and derived its implications. Design/methodology/approach For a total of 593 festivals held in 2018, 6 variables related to the region considering population size, administrative division, and accessibility, and 15 variables related to the festival such as the degree of publicity and word of mouth, invitation singer, weather and budget were set for the training data in machine learning algorithm. Since the number of visitors is a continuous numerical data, random forest, Adaboost, and linear regression that can perform regression analysis among the machine learning algorithms were used. Findings This study confirmed that a prediction of the number of visitors to local festivals is possible using a machine learning algorithm, and the possibility of using machine learning in research in the tourism and related academic fields, including the study of local festivals, was captured. From a practical point of view, the model developed in this study is used to predict the number of visitors to the festival to be held in the future, so that the festival can be evaluated in advance and the demand for related facilities, etc. can be utilized. In addition, the RReliefF rank result can be used. Considering this, it will be possible to improve the existing local festivals or refer to the planning of a new festival.