• Title/Summary/Keyword: Data prediction model

Comparative Study on the Accuracy of Surface Air Temperature Prediction based on selection of land use and initial meteorological data (토지이용도와 초기 기상 입력 자료의 선택에 따른 지상 기온 예측 정확도 비교 연구)

  • Hae-Dong Kim;Ha-Young Kim
    • Journal of Environmental Science International / v.33 no.6 / pp.435-442 / 2024
  • We investigated how the choice of land-use data and initial meteorological data affects the accuracy of surface air temperature prediction, using the Weather Research and Forecasting model v4.2.1. A numerical experiment was conducted for the Daegu Dyeing Industrial Complex. Initial meteorological input data came from GFS (Global Forecast System) and GDAPS (Global Data Assimilation and Prediction System). High-resolution terrestrial input for the weather model was generated from the land cover data of the Ministry of Environment and the digital elevation model of the Ministry of Land, Infrastructure, and Transport. The experimental cases were defined by the combination of terrestrial and topographic data (land cover data) and meteorological data applied to the model. The simulation using high-resolution terrestrial data (10 m) and GDAPS data (CASE 3) produced surface temperatures much closer to the automatic weather station observations than the simulation using low-resolution terrestrial data (900 m) and GFS data (CASE 1).
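
The abstract above compares simulated surface temperatures with automatic weather station observations but does not name the error metric. As a minimal sketch, assuming root-mean-square error and mean bias are used (an assumption, not a detail from the paper), the comparison could look like this. The temperature values are hypothetical and purely illustrative.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))

def mean_bias(predicted, observed):
    """Mean of (predicted - observed); negative means the model runs cold."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical temperatures in degrees Celsius (NOT taken from the paper).
aws_obs = [20.0, 22.0, 25.0, 24.0]    # station observations
case1_sim = [18.0, 20.0, 22.0, 22.0]  # stand-in for a coarse-data run
case3_sim = [19.5, 21.5, 25.5, 24.0]  # stand-in for a high-resolution run

for name, sim in (("CASE 1", case1_sim), ("CASE 3", case3_sim)):
    print(name, round(rmse(sim, aws_obs), 3), round(mean_bias(sim, aws_obs), 3))
```

A lower RMSE indicates that a run tracks the observations more closely, while the sign of the bias shows whether it is systematically too warm or too cold.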

Development of a Medical Care Cost Prediction Model for Cancer Patients Using Case-Based Reasoning (사례기반 추론을 이용한 암 환자 진료비 예측 모형의 개발)

  • Chung, Suk-Hoon;Suh, Yong-Moo
    • Asia pacific journal of information systems / v.16 no.2 / pp.69-84 / 2006
  • The spread of integrated hospital information systems matters because large and varied amounts of data are now accumulating in their databases. Many researchers have studied ways to use such hospital data. While most research has focused on medical diagnosis, few studies have developed medical care cost prediction models, especially using machine learning techniques. In this research, therefore, we built a medical care cost prediction model for cancer patients using CBR (Case-Based Reasoning), one of the machine learning techniques. Its performance was compared with that of Neural Network and Decision Tree models. In the experiment, the CBR prediction model performed best overall with respect to error rate and linearity between actual and predicted values. We believe the medical care cost prediction model can be used for the effective management of limited resources in hospitals.
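
The retrieve-and-reuse idea behind case-based reasoning can be sketched in a few lines. The abstract does not describe the similarity measure or the features used, so the example below assumes Euclidean distance over pre-scaled numeric features and a simple average of the retrieved costs. The case base, feature names, and cost values are hypothetical.

```python
import math

def retrieve(case_base, query, k=3):
    """Retrieve step: return the k stored cases closest to the query."""
    return sorted(case_base, key=lambda c: math.dist(c["features"], query))[:k]

def predict_cost(case_base, query, k=3):
    """Reuse step: average the costs of the retrieved cases."""
    if not case_base:
        raise ValueError("case base is empty")
    neighbours = retrieve(case_base, query, k)
    return sum(c["cost"] for c in neighbours) / len(neighbours)

# Hypothetical cases: features are (age / 100, length of stay in days / 30).
case_base = [
    {"features": (0.50, 0.20), "cost": 1000.0},
    {"features": (0.52, 0.25), "cost": 1200.0},
    {"features": (0.80, 0.90), "cost": 5000.0},
    {"features": (0.78, 0.85), "cost": 4600.0},
]

# The two nearest cases are the first two, so the estimate is their mean.
estimate = predict_cost(case_base, (0.51, 0.22), k=2)
print(estimate)
```

A fuller CBR cycle would also revise the estimate and retain the solved case in the case base. Those steps are omitted here.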

A Study on the Development of University Students Dropout Prediction Model Using Ensemble Technique (앙상블 기법을 활용한 대학생 중도탈락 예측 모형 개발)

  • Park, Sangsung
    • Journal of Korea Society of Digital Industry and Information Management / v.17 no.1 / pp.109-115 / 2021
  • The number of freshmen at universities is decreasing due to the recent decline in the school-age population, and the survival of many universities is threatened. To overcome this situation, universities are seeking ways to use their internal big data to improve the quality of education. Dropout prediction is a representative use of big data in universities. It supports systematic management planning by identifying students who are likely to leave school, whether by dropping out or by expulsion. Actual on-campus data contain a large number of missing values because they are collected and managed by various departments. For this reason, a model must be constructed in a way that handles the missing values effectively. In this study, we propose a university student dropout prediction model based on eXtreme Gradient Boosting (XGBoost) that can be applied to data with many missing values and shows high performance. To examine the practical applicability of the proposed model, an experiment was performed using data from C University in Chungbuk. In the experiment, the prediction performance of the proposed model was found to be excellent. The prediction results of the proposed model can be used to establish a management strategy for students at risk of dropping out.

A Neural Network Model for Bankruptcy Prediction -Domestic KSE listed Bankrupted Companies after the foreign exchange crisis in 1997 (인공신경망을 이용한 기업도산 예측 - IMF후 국내 상장회사를 중심으로 -)

  • Jeong Yu-Seok;Lee Hyun-Soo;Chae Young-Il;Suh Yung-Ho
    • Proceedings of the Korean Society for Quality Management Conference / 2004.04a / pp.655-673 / 2004
  • This paper analyses the bankruptcy prediction power of three models: Multivariate Discriminant Analysis (MDA), Logit Analysis, and Neural Network. To improve the prediction accuracy and validity of the models, the research data were limited to companies that went bankrupt after the crisis and to listed companies in the manufacturing industry. To ensure meaningful bankruptcy prediction, the training data and testing data were not drawn from the same period. The result is that the neural network model has higher prediction accuracy than the logit analysis and MDA models when the training and testing data are taken from different periods.

Basic Study on Safety Accident Prediction Model Using Random Forest in Construction Field (랜덤 포레스트 기법을 이용한 건설현장 안전재해 예측 모형 기초 연구)

  • Kang, Kyung-Su;Ryu, Han-Guk
    • Proceedings of the Korean Institute of Building Construction Conference / 2018.11a / pp.59-60 / 2018
  • The purpose of this study is to predict and classify accident types based on KOSHA (Korea Occupational Safety & Health Agency) data and weather data. We also attempt to suggest key management measures for each accident type by deriving feature importance. We designed two models: model (a), based on accident data and weather data, and model (b), based on weather data only. With the random forest method, model (b) lacked prediction accuracy, whereas model (a) gave more accurate results. We therefore presented a safety management plan based on these results. In future work, we will extend this study to real-time prediction of accident types, supplementing it with real-time accident and weather data, to help prevent safety accidents.

Genetic-fuzzy approach to model concrete shrinkage

  • da Silva, Wilson Ricardo Leal;Stemberk, Petr
    • Computers and Concrete / v.12 no.2 / pp.109-129 / 2013
  • This work presents an approach to model concrete shrinkage. The goal is to permit the concrete industry's experts to develop independent prediction models based on a reduced amount of experimental data. The proposed approach combines fuzzy logic and a genetic algorithm to optimize the fuzzy decision-making, thereby reducing data collection time. The approach was implemented for an experimental data set related to self-compacting concrete. The resulting prediction model was compared against published experimental data (not used in model development) and well-known shrinkage prediction models. The predicted results were verified by statistical analysis, which confirmed the reliability of the developed model. Although the range of application of the developed model is limited, the genetic-fuzzy approach introduced in this work proved suitable for adjusting the prediction model once additional training data are provided. This can be highly attractive for the concrete industry's experts, since they would be able to fine-tune their models depending on the boundary conditions of their production processes.

LSTM Model-based Prediction of the Variations in Load Power Data from Industrial Manufacturing Machines

  • Rita, Rijayanti;Kyohong, Jin;Mintae, Hwang
    • Journal of information and communication convergence engineering / v.20 no.4 / pp.295-302 / 2022
  • This paper presents the development of a smart power device designed to collect load power data from industrial manufacturing machines, predict future variations in the load power data, and detect abnormal data in advance by applying a machine learning-based prediction algorithm. The proposed load power data prediction model is implemented using a Long Short-Term Memory (LSTM) algorithm with high accuracy and relatively low complexity. Flask and a REST API are used to provide prediction results to users in a graphical interface. In addition, we present the results of experiments conducted to evaluate the performance of the proposed approach, which show that our model achieved the highest accuracy compared with Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM) models. Moreover, we expect that our method's accuracy could be improved by further optimizing the hyperparameter values and training the model for a longer period of time on a larger amount of data.
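
Sequence models such as an LSTM are usually trained on fixed-length windows of past readings paired with a future value. The abstract does not say how the load power series is framed, so the sketch below only illustrates a common sliding-window preparation step. The function name `make_windows`, the window length, and the sample readings are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Split a 1-D series into (input window, target) pairs.

    Each input holds `window` consecutive readings; the target is the
    reading `horizon` steps after the end of that window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

# Hypothetical load power readings in kW (illustrative values only).
load_kw = [1.0, 1.2, 1.1, 1.4, 1.3, 1.6]
pairs = make_windows(load_kw, window=3)

for inputs, target in pairs:
    print(inputs, "->", target)
```

The resulting pairs would then be fed to whichever model is being trained. The same framing works for the MLP, RF, and SVM baselines named in the abstract.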

A study on Estimation of NO2 concentration by Statistical model (통계모형을 이용한 NO2 농도 예측에 관한 연구)

  • Jang Nan-Sim
    • Journal of Environmental Science International / v.14 no.11 / pp.1049-1056 / 2005
  • The characteristics of $NO_2$ concentration in Busan metropolitan city were analysed statistically using hourly $NO_2$ concentration data (1998-2000) collected from the city's air quality monitoring sites. Four representative regions were selected from the air quality monitoring sites of the Ministry of Environment. Concentration data for $NO_2$ and five air pollutants, together with data collected at AWS, were used. Both a Stepwise Multiple Regression model and an ARIMA model were adopted to predict $NO_2$ concentrations, and their results were compared with the observed concentrations. While the ARIMA model was useful for predicting the daily variation of the concentration, it was not satisfactory for predicting either rapid or seasonal variation. The Multiple Regression model gave better estimates of $NO_2$ concentration than the ARIMA model.

Analysis Model Evaluation based on IoT Data and Machine Learning Algorithm for Prediction of Acer Mono Sap Liquid Water

  • Lee, Han Sung;Jung, Se Hoon
    • Journal of Korea Multimedia Society / v.23 no.10 / pp.1286-1295 / 2020
  • It has become increasingly difficult to predict the amount of Acer mono sap that can be collected, owing to droughts and cold waves caused by recent climate change, and few studies have addressed the prediction of collection volume. This study therefore proposes a Big Data prediction system for Acer mono sap collection based on meteorological information. The proposed system analyzes the collected data and provides managers with statistical charts of predicted values for the climate factors that affect the amount of sap collected, thus enabling efficient work. It was designed on Hadoop for data collection, processing, and analysis. The study also analyzed and proposed an optimal prediction model for the climate conditions that influence the collection volume by applying a multiple regression analysis model based on Hadoop and Mahout.

Using Machine Learning Algorithms for Housing Price Prediction: The Case of Islamabad Housing Data

  • Imran, Imran;Zaman, Umar;Waqar, Muhammad;Zaman, Atif
    • Soft Computing and Machine Intelligence / v.1 no.1 / pp.11-23 / 2021
  • House price prediction informs significant financial decisions for individuals working in the housing market as well as for potential buyers. Whether investing or buying a house to live in, a person entering the housing market is interested in the potential gain. This paper applies machine learning algorithms to develop intelligent regression models for house price prediction. The proposed research methodology consists of four stages: data collection; preprocessing the collected data and transforming it into a suitable format; developing intelligent models using machine learning algorithms; and training, testing, and validating the models on house prices from the housing market of the capital, Islamabad. The data used for model validation and testing are asking prices from online property stores, which provide a reasonable estimate of the city's housing market. The prediction model can significantly assist in the prediction of future housing prices in Pakistan. The regression results are encouraging and give promising directions for future prediction work on the collected dataset.