• Title/Summary/Keyword: Linear prediction analysis

Machine learning-based Fine Dust Prediction Model using Meteorological data and Fine Dust data (기상 데이터와 미세먼지 데이터를 활용한 머신러닝 기반 미세먼지 예측 모형)

  • KIM, Hye-Lim;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.1 / pp.92-111 / 2021
  • Because fine dust adversely affects health, industry, and the economy, people are sensitive to it. If fine dust episodes can be predicted, countermeasures can be prepared in advance, which benefits daily life and the economy. Fine dust is affected by the weather and by the concentration of emission sources. The industrial sector accounts for the largest share of fine dust emissions, and factories in industrial complexes are major emission sources. This study targets regions with old industrial complexes in local cities. Its purpose is to explore the factors that cause fine dust and to develop a model that predicts its occurrence. Weather data and fine dust data were used, and variables that influence the generation of fine dust were extracted through multiple regression analysis. Based on the results of the multiple regression analysis, models with high predictive power were identified by training machine learning regression learners. Model performance was confirmed using test data. The models with high predictive power were the linear regression model, the Gaussian process regression model, and the support vector machine. Predictive power was not proportional to the proportion of training data. In addition, the average difference between predicted and measured values was not large, but predictive power decreased when the measured value was high. The results of this study can be developed into a more systematic and precise fine dust prediction service by combining meteorological data and urban big data through local government data hubs. It may also help promote the development of smart industrial complexes.

A Study on the Self-Propulsion CFD Analysis for a Catamaran with Asymmetrical Inside and Outside Hull Form (안팎 형상이 비대칭인 쌍동선의 자항성능 CFD 해석에 관한 연구)

  • Jonghyeon Lee;Dong-Woo Park
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.1 / pp.108-117 / 2024
  • In this study, computational fluid dynamics simulations were performed to predict the self-propulsion performance of a catamaran with an asymmetrical inside and outside hull form and numerous knuckle lines. The simulations used either the Moving Reference Frame (MRF) or the Sliding Mesh (SDM) technique, and the propeller rotation angle per time step was varied to identify differences arising from the analysis technique and condition. The propeller rotation angle was 1˚ for the MRF technique and 1˚, 5˚, or 10˚ for the SDM technique. In the simulations at several propeller revolutions used to estimate the self-propulsion condition, the propeller torque was similar for both techniques; however, the thrust and the hull resistance were lower with the SDM technique than with the MRF technique, and with the SDM technique they were higher for smaller rotation angles per time step. The revolutions, thrust, and torque of the propeller in the self-propulsion condition obtained using linear interpolation, and the delivered power, wake fraction, thrust deduction factor, and propeller revolutions obtained using the full-scale prediction method, showed the same trend for both techniques; however, most of the self-propulsion efficiency values showed the opposite trend. The accuracy of the propeller wake was low when the MRF technique was applied, and the representation of the wake differed slightly with the rotation angle per time step when the SDM technique was applied.
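
The linear interpolation step mentioned in the abstract above can be sketched as follows. The abstract does not say which quantity was interpolated, so this sketch assumes a common arrangement: the force imbalance (thrust minus the resistance to be overcome) is computed at two propeller revolution rates, and the self-propulsion point is taken where the imbalance crosses zero. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def interpolate_zero_crossing(n_low, f_low, n_high, f_high):
    """Revolution rate at which a linearly interpolated imbalance equals zero.

    n_low, n_high: two simulated propeller revolution rates.
    f_low, f_high: force imbalance computed at each rate (assumed definition).
    """
    if f_low == f_high:
        raise ValueError("imbalance values must differ to locate a crossing")
    return n_low - f_low * (n_high - n_low) / (f_high - f_low)


def interpolate_value(n, n_low, y_low, n_high, y_high):
    """Linearly interpolate a quantity (e.g. thrust or torque) at rate n."""
    return y_low + (y_high - y_low) * (n - n_low) / (n_high - n_low)


if __name__ == "__main__":
    # Hypothetical results from two simulated revolution rates.
    n_sp = interpolate_zero_crossing(10.0, -2.0, 12.0, 2.0)
    thrust_sp = interpolate_value(n_sp, 10.0, 100.0, 12.0, 140.0)
    print(n_sp, thrust_sp)
```

With more than two simulated revolution rates, the same idea would be applied to the pair of rates that brackets the zero crossing.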

Estimation of Fresh Weight and Leaf Area Index of Soybean (Glycine max) Using Multi-year Spectral Data (다년도 분광 데이터를 이용한 콩의 생체중, 엽면적 지수 추정)

  • Jang, Si-Hyeong;Ryu, Chan-Seok;Kang, Ye-Seong;Park, Jun-Woo;Kim, Tae-Yang;Kang, Kyung-Suk;Park, Min-Jun;Baek, Hyun-Chan;Park, Yu-hyeon;Kang, Dong-woo;Zou, Kunyan;Kim, Min-Cheol;Kwon, Yeon-Ju;Han, Seung-ah;Jun, Tae-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.4 / pp.329-339 / 2021
  • Soybeans (Glycine max), one of the major upland crops, require precise management of environmental conditions such as temperature, water, and soil during cultivation because they are sensitive to environmental changes. Spectral technologies that measure the physiological state of crops remotely have great potential for improving soybean quality and productivity by estimating yields, physiological stresses, and diseases. In this study, we developed and validated a soybean growth prediction model using multispectral imagery. We conducted a linear regression analysis between vegetation indices and soybean growth data (fresh weight and LAI) obtained at Miryang fields. The linear regression model was validated at Goesan fields. The model based on the green ratio vegetation index (GRVI) performed best in predicting fresh weight at the calibration stage (R²=0.74, RMSE=246 g/m², RE=34.2%). In the validation stage, the RMSE and RE of the model were 392 g/m² and 32%, respectively. The errors of the model differed by cropping system. For example, the RMSE and RE of the model in single crop fields were 315 g/m² and 26%, respectively, whereas the model had greater RMSE (381 g/m²) and RE (31%) in double crop fields. When the three years of data were split by accumulated temperature (AT) into a two-year model (2018+2020, years with similar AT) and a single-year model (2019, whose AT differed), the single-year model predicted fresh weight better than the two-year model. Consequently, when the models divided by AT were compared with the three-year model, the RMSE for single crop fields improved by about 29.1%, whereas that for double crop fields decreased by about 19.6%. When environmental factors are used along with spectral data, reliable soybean growth prediction may be achieved under various environmental conditions.

Fundamental Investigation of Non-invasive Determination of Alcohol in Blood by Near Infrared Spectrophotometry (근적외선 분광분석법을 이용한 음주측정기술 개발에 관한 연구)

  • Chang, Soo-Hyun;Cho, Chang-Hee;Woo, Young-Ah;Kim, Hyo-Jin;Kim, Young-Man;Lee, Kang-Boong;Kim, Young-Woon;Park, Sung-Woo
    • Analytical Science and Technology / v.12 no.5 / pp.375-381 / 1999
  • Near-infrared (NIR) spectrophotometry was developed as a non-invasive method for determining blood alcohol. First, pure ethanol/water samples were prepared with ethanol concentrations from 0.01 to 0.1% (w/w). The second-derivative data were analyzed with multiple linear regression (MLR). The standard error of calibration (SEC) of ethanol in the ethanol/water solutions was approximately 0.0039%. Calibration models were then established from the blood alcohol spectra by MLR and partial least squares regression (PLSR). The best calibration was built by MLR with the second-derivative spectra at 2266 and 2326 nm. Second-derivative spectra in the ranges 1100~1340, 1500~1796, and 2064~2300 nm with four PLSR factors gave a standard error of prediction (SEP) of 0.030% (w/w). These results indicate that NIR may be applied for fast, non-invasive determination of alcohol in the blood.

Quantification of Chloride Diffusivity in Steady State Condition in Concrete with Fly Ash Considering Curing and Crack Effect (재령 및 균열효과를 고려한 플라이애시 콘크리트의 정상상태 염화물 확산 특성의 정량화)

  • Yoon, Yong-Sik;Cheon, Ju-Hyun;Kwon, Seung-Jun
    • Journal of the Korean Recycled Construction Resources Institute / v.7 no.2 / pp.109-115 / 2019
  • When concrete is cracked, the penetration of deteriorating ions such as chloride ions into the cracks is accelerated. The penetration of chloride ions causes structural and durability problems in RC (Reinforced Concrete) structures. In this study, the accelerated chloride diffusion coefficient in the steady state is evaluated for 2-year-aged normal and high strength FA (Fly Ash) concrete, after a range of crack depths up to 1.0 mm are induced at the age of 56 days. When the crack effect is considered by linear regression analysis, high strength concrete shows a slightly smaller rate of increase in the diffusion coefficient due to cracking than normal strength concrete, and the diffusion coefficient increases non-linearly as the crack width increases. In both types of concrete, the crack effect decreases as the curing period increases. When the crack and curing effects are quantified using an exponential function, the coefficients of determination are higher than those of the linear regression analysis. Under steady state, the crack effect and the curing effect do not appear to be highly correlated, and by treating them as two independent effects, a reasonable prediction equation for chloride diffusion in cracked concrete can be proposed.

Youtube Mukbang and Online Delivery Orders: Analysis of Impacts and Predictive Model (유튜브 먹방과 온라인 배달 주문: 영향력 분석과 예측 모형)

  • Choi, Sarah;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.119-133 / 2022
  • One of the most important current features of the food industry is the growth of food delivery services. Another notable food-related phenomenon, brought about by the advent of Youtube, is the popularity of Mukbang, which refers to content that records eating. Against this background, this study focused on two things. First, we examined the impact of Youtube Mukbang and the sentiment of Mukbang comments on the number of related food deliveries. Next, we built a predictive model of chicken delivery orders using machine learning methods. We used Youtube Mukbang comment data as well as weather-related data as the main independent variables. The dependent variable is the number of delivery orders of fried chicken. The data cover the period from June 3, 2015 to September 30, 2019, for a total of 1,580 data points. For the predictive modeling, we used machine learning methods such as linear regression, ridge, lasso, random forest, and gradient boost. We found that Youtube Mukbang and the sentiment of its comments have an impact on the number of delivery orders. The prediction model with Mukbang data built in this study performed better than the existing models without Mukbang data. We also suggest managerial implications for the food delivery service industry.

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information has been realized in the information society, the usage and collection of information have become important. Like an artistic painting, a facial expression carries a wealth of information that could be described in thousands of words. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions through facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In academia, a number of conventional methods such as Multiple Regression Analysis (MRA) and Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies such as Jung and Kim (2012) have used ANN as an alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized for overfitting and for the difficulty of network design (e.g. setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) in order to increase prediction accuracy. SVR is an extension of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold $\varepsilon$) to the model prediction. Using SVR, we built a model that measures the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions while presenting appropriate visual stimuli, and extracted features from the data. Next, preprocessing steps were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the '$\varepsilon$-insensitive loss function' and the 'grid search' technique to find the optimal values of parameters such as C, d, $\sigma^2$, and $\varepsilon$. For ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We performed the experiments repeatedly, varying the number of nodes in the hidden layer among n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. In the experiment, SVR achieved the highest prediction accuracy on the hold-out data set compared with MRA and ANN. Regardless of the target variable (the level of arousal, or the level of positive/negative valence), SVR showed the best performance on the hold-out data set. ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
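
As a minimal illustration of two quantities named in the abstract above, the sketch below computes the $\varepsilon$-insensitive loss and the MAE in plain Python. It is not the authors' implementation: the function names and the sample values are hypothetical, and the SVR training itself (kernel, grid search over C and the other parameters) is not shown.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample epsilon-insensitive loss: max(0, |y - f(x)| - epsilon).

    Errors smaller than epsilon contribute nothing, which is why points
    close to the prediction do not influence the fitted SVR model.
    """
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    """MAE, the comparison measure named in the abstract."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical targets and predictions.
    y_true = [1.0, 2.0, 3.0]
    y_pred = [1.0, 2.5, 2.0]
    print(epsilon_insensitive_loss(y_true, y_pred, epsilon=0.25))
    print(mean_absolute_error(y_true, y_pred))
```

The first sample falls inside the $\varepsilon$ tube and so has zero loss, while MAE counts every deviation in full.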

Development of Prediction Equation of Diffusing Capacity of Lung for Koreans

  • Hwang, Yong Il;Park, Yong Bum;Yoon, Hyoung Kyu;Lim, Seong Yong;Kim, Tae-Hyung;Park, Joo Hun;Lee, Won-Yeon;Park, Seong Ju;Lee, Sei Won;Kim, Woo Jin;Kim, Ki Uk;Shin, Kyeong Cheol;Kim, Do Jin;Kim, Hui Jung;Kim, Tae-Eun;Yoo, Kwang Ha;Shim, Jae Jeong
    • Tuberculosis and Respiratory Diseases / v.81 no.1 / pp.42-48 / 2018
  • Background: The diffusing capacity of the lung is influenced by multiple factors such as age, sex, height, weight, ethnicity, and smoking status. Although a prediction equation for the diffusing capacity of Koreans was proposed in the mid-1980s, this equation is not currently used. The aim of this study was to develop a new prediction equation for the diffusing capacity for Koreans. Methods: Using data from the Korean National Health and Nutrition Examination Survey, a total of 140 nonsmokers with normal chest X-rays were enrolled in this study. Results: Using linear regression analysis, new prediction equations for diffusing capacity were developed. For men, the equations were: carbon monoxide diffusing capacity (DLco) = -10.4433 - 0.1434 $\times$ age (year) + 0.2482 $\times$ height (cm); DLco/alveolar volume (VA) = 6.01507 - 0.02374 $\times$ age (year) - 0.00233 $\times$ height (cm). For women, the equations were: DLco = -12.8895 - 0.0532 $\times$ age (year) + 0.2145 $\times$ height (cm); DLco/VA = 7.69516 - 0.02219 $\times$ age (year) - 0.01377 $\times$ height (cm). All equations were internally validated by the k-fold cross validation method. Conclusion: In this study, we developed new prediction equations for the diffusing capacity of the lungs of Koreans. A further study is needed to validate the new prediction equations.
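
The four regression equations quoted in the abstract above can be evaluated directly. The sketch below uses the coefficients exactly as they appear in the abstract; the function names and the example inputs are hypothetical. The abstract does not state the units of DLco, so none are assumed here. This is an illustration only, not a clinical tool.

```python
# Coefficients as quoted in the abstract: (intercept, age coefficient, height coefficient).
DLCO_COEFFS = {
    "male": (-10.4433, -0.1434, 0.2482),
    "female": (-12.8895, -0.0532, 0.2145),
}
DLCO_VA_COEFFS = {
    "male": (6.01507, -0.02374, -0.00233),
    "female": (7.69516, -0.02219, -0.01377),
}


def _linear(coeffs, sex, age_years, height_cm):
    """Evaluate intercept + a * age + b * height for the given sex."""
    if sex not in coeffs:
        raise ValueError("sex must be 'male' or 'female'")
    intercept, age_coef, height_coef = coeffs[sex]
    return intercept + age_coef * age_years + height_coef * height_cm


def predict_dlco(sex, age_years, height_cm):
    """Predicted DLco from age (years) and height (cm)."""
    return _linear(DLCO_COEFFS, sex, age_years, height_cm)


def predict_dlco_va(sex, age_years, height_cm):
    """Predicted DLco/VA from age (years) and height (cm)."""
    return _linear(DLCO_VA_COEFFS, sex, age_years, height_cm)


if __name__ == "__main__":
    # Hypothetical subject: 40-year-old man, 170 cm tall.
    print(round(predict_dlco("male", 40, 170), 4))
    print(round(predict_dlco_va("male", 40, 170), 5))
```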

Regional Frequency Analysis for Rainfall using L-Moment (L-모멘트법에 의한 강우의 지역빈도분석)

  • Koh, Deuk-Koo;Choo, Tai-Ho;Maeng, Seung-Jin;Trivedi, Chanda
    • The Journal of the Korea Contents Association / v.8 no.3 / pp.252-263 / 2008
  • This study was conducted to derive the optimal regionalization of precipitation data, classified into climatologically and geographically homogeneous regions across Korea, excluding Cheju and Ulreung islands. A total of 65 rain gauges were used for the regional analysis of precipitation. Annual maximum series for consecutive durations of 1, 3, 6, 12, 24, 36, 48, and 72 hr were used for various statistical analyses. The K-means clustering method was used to identify homogeneous regions, and five homogeneous regions for precipitation were classified. Using the L-moment ratios and the Kolmogorov-Smirnov test, the underlying regional probability distribution was identified as the generalized extreme value (GEV) distribution among the applied distributions. The regional and at-site parameters of the GEV distribution were estimated by L-moments, which are linear combinations of probability weighted moments. The regional and at-site analyses for the design rainfall were tested by Monte Carlo simulation. Relative root-mean-square error (RRMSE), relative bias (RBIAS), and relative reduction (RR) in RRMSE were computed and compared with those resulting from at-site Monte Carlo simulation. All show that the regional analysis procedure can substantially reduce the RRMSE, RBIAS, and RR in RRMSE in the prediction of design rainfall. Consequently, optimal design rainfalls by region and consecutive duration were derived by the regional frequency analysis.
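
To make the phrase "linear combination of the probability weighted moments" concrete, the sketch below computes the first three sample L-moments and the L-moment ratios from the standard unbiased estimators of the probability weighted moments. It is a generic textbook calculation, not the authors' code: the function names and the sample values are hypothetical, and the GEV parameter estimation and regionalization steps are not shown.

```python
def sample_l_moments(values):
    """First three sample L-moments (l1, l2, l3) via probability weighted moments.

    Uses the unbiased PWM estimators b0, b1, b2 on the sorted sample, then
    l1 = b0, l2 = 2*b1 - b0, l3 = 6*b2 - 6*b1 + b0.
    """
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")
    b0 = sum(x) / n
    # With zero-based index i, the weight i/(n-1) matches (j-1)/(n-1) for rank j.
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def l_moment_ratios(values):
    """L-CV (l2/l1) and L-skewness (l3/l2) of a sample."""
    l1, l2, l3 = sample_l_moments(values)
    return l2 / l1, l3 / l2


if __name__ == "__main__":
    # Hypothetical annual maximum rainfall series (arbitrary units).
    series = [52.0, 61.5, 48.3, 90.2, 70.1, 55.7, 120.4]
    print(sample_l_moments(series))
    print(l_moment_ratios(series))
```

A symmetric sample gives an L-skewness of zero, which is a quick sanity check on the estimators.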

Non-stationary Rainfall Frequency Analysis Based on Residual Analysis (잔차시계열 분석을 통한 비정상성 강우빈도해석)

  • Jang, Sun-Woo;Seo, Lynn;Kim, Tae-Woong;Ahn, Jae-Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.31 no.5B / pp.449-457 / 2011
  • Recently, increasing heavy rainfall due to climate change and/or variability has accelerated hydro-climatic disasters. To cope with extreme rainfall events in the future, hydrologic frequency analysis is usually used to estimate design rainfalls in a design target year. The rainfall data series applied to hydrologic frequency analysis is assumed to be stationary. However, recent observations indicate that the data series might not preserve the statistical properties of rainfall in the future. This study combined residual analysis and hydrologic frequency analysis to estimate design rainfalls in a design target year while considering the non-stationarity of rainfall. The residual time series were generated using a linear regression line constructed from the observations. After finding the proper probability density function for the residuals and considering the increasing or decreasing trend, rainfall quantiles were estimated for specific design return periods in a design target year. The results from applying the method to 14 gauging stations indicate that the proposed method provides appropriate design rainfalls and reduces the prediction errors compared with conventional rainfall frequency analysis, which assumes that the rainfall data are stationary.
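
The detrending step described in the abstract above (a linear regression line fitted to the observations, with residuals taken about that line) can be sketched as follows. This is an assumed reading of the procedure rather than the authors' code: the function names and the data are hypothetical, the fit is ordinary least squares, and the fitting of a probability distribution to the residuals is left out.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_series(xs, ys, slope, intercept):
    """Observed value minus the trend-line value for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def trend_value(year, slope, intercept):
    """Trend-line value extrapolated to a (possibly future) target year."""
    return intercept + slope * year


if __name__ == "__main__":
    # Hypothetical annual maximum rainfall observations.
    years = [2000, 2001, 2002, 2003, 2004]
    rain = [100.0, 118.0, 109.0, 131.0, 127.0]
    slope, intercept = fit_line(years, rain)
    print(residual_series(years, rain, slope, intercept))
    print(trend_value(2030, slope, intercept))
```

Under this reading, a design rainfall for the target year would combine the extrapolated trend value with a quantile of the distribution fitted to the residuals; that combination is an assumption here, since the abstract does not give the exact formula.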