• Title/Summary/Keyword: absolute model accuracy

Predicting the Popularity of Post Articles with Virtual Temperature in Web Bulletin (웹게시판에서 가상온도를 이용한 게시글의 인기 예측)

  • Kim, Su-Do;Kim, So-Ra;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association / v.11 no.10 / pp.19-29 / 2011
  • A blog provides commentary, news, or content on a particular subject. An important part of many blogs is their interactive format. Sometimes there is a heated debate on a topic, and an article becomes a political or sociological issue. In this paper, we propose a method to predict the popularity of an article in advance. First, we used hit count as the factor for predicting the popularity of an article. We defined the saturation point and derived a model that predicts the hit count at the saturation point from the correlation between the early hit count and the hit count at the saturation point. Finally, we predicted the virtual temperature of an article using four types (explosive, hot, warm, cold). We can predict the virtual temperature of Internet discussion articles from the hit count at the saturation point with more than 70% accuracy, exploiting only the first 30 minutes' hit count. In the hot, warm, and cold categories, we can predict with more than 86% accuracy from 30 minutes' hit count and with more than 90% accuracy from 70 minutes' hit count.

Prediction Models for Solitary Pulmonary Nodules Based on Curvelet Textural Features and Clinical Parameters

  • Wang, Jing-Jing;Wu, Hai-Feng;Sun, Tao;Li, Xia;Wang, Wei;Tao, Li-Xin;Huo, Da;Lv, Ping-Xin;He, Wen;Guo, Xiu-Hua
    • Asian Pacific Journal of Cancer Prevention / v.14 no.10 / pp.6019-6023 / 2013
  • Lung cancer, one of the leading causes of cancer-related deaths, usually appears as solitary pulmonary nodules (SPNs), which are hard to diagnose with the naked eye. In this paper, curvelet-based textural features and clinical parameters are used with three prediction models [a multilevel model, a least absolute shrinkage and selection operator (LASSO) regression method, and a support vector machine (SVM)] to improve the diagnosis of benign and malignant SPNs. Dimensionality reduction of the original curvelet-based textural features was achieved using principal component analysis. In addition, non-conditional logistic regression was used to find clinical predictors among demographic parameters and morphological features. The results showed that, combined with 11 clinical predictors, the accuracy rates using 12 principal components were higher than those using the original curvelet-based textural features. To evaluate the models, 10-fold cross validation and back substitution were applied. The accuracy rates obtained, respectively, were 0.8549 and 0.9221 for the LASSO method, 0.9443 and 0.9831 for SVM, and 0.8722 and 0.9722 for the multilevel model. Overall, using curvelet-based textural features after dimensionality reduction together with clinical predictors, the highest accuracy rate was achieved with SVM. The method may be used as an auxiliary tool to differentiate between benign and malignant SPNs in CT images.

Regression model for the preparation of calibration curve in the quantitative LC-MS/MS analysis of urinary methamphetamine, amphetamine and 11-nor-Δ9-tetrahydrocannabinol-9-carboxylic acid using R (소변 중 메트암페타민, 암페타민 및 대마 대사체 LC-MS/MS 정량분석에서 검량선 작성을 위한 R을 활용한 회귀모델 선택)

  • Kim, Jin Young;Shin, Dong Won
    • Analytical Science and Technology / v.34 no.6 / pp.241-250 / 2021
  • Calibration curves are essential in quantitative methods and for improving the accuracy of analyte measurements in biological samples. In this study, a statistical analysis model built in the R language (The R Foundation for Statistical Computing) was used to identify a set of weighting factors and regression models based on stepwise selection criteria. An LC-MS/MS method was used to detect the presence of urinary methamphetamine, amphetamine, and 11-nor-9-carboxy-Δ9-tetrahydrocannabinol in a sample set. Weighting factors for the calibration curves were derived by calculating the heteroscedasticity of the measurements, where the presence of heteroscedasticity was determined via variance tests. The optimal regression model and weighting factor were chosen according to the sum of the absolute percentage relative error. Subsequently, the order of the regression model was calculated using a partial variance test. The proposed statistical analysis tool facilitated selection of the optimal calibration model and detection of methamphetamine, amphetamine, and 11-nor-9-carboxy-Δ9-tetrahydrocannabinol in urine. Thus, this study on the selection of weighting and the use of a complex regression equation may provide insights for linear and quadratic regressions in analytical and bioanalytical measurements.
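The selection criterion described in this abstract (choosing a weighting factor by the sum of the absolute percentage relative error) can be illustrated with a minimal Python sketch. This is not the authors' R tool: the calibration data, the candidate weights (1, 1/x, 1/x^2), and the function names `weighted_linear_fit` and `sum_abs_pct_re` are assumptions made for illustration, and only a linear model is fitted here.

```python
# Hypothetical calibration data (not from the paper):
# nominal concentration and instrument response.
conc = [5.0, 10.0, 50.0, 100.0, 500.0, 1000.0]
resp = [0.052, 0.098, 0.51, 1.03, 4.9, 10.4]


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ym - b * xm
    return a, b


def sum_abs_pct_re(x, y, a, b):
    """Sum of |%RE|, with %RE = 100 * (back-calculated - nominal) / nominal."""
    total = 0.0
    for xi, yi in zip(x, y):
        back_calculated = (yi - a) / b
        total += abs(100.0 * (back_calculated - xi) / xi)
    return total


# Candidate weighting factors (an assumed, commonly used set).
weightings = {
    "1": [1.0 for _ in conc],
    "1/x": [1.0 / c for c in conc],
    "1/x^2": [1.0 / c ** 2 for c in conc],
}

scores = {}
for name, w in weightings.items():
    a, b = weighted_linear_fit(conc, resp, w)
    scores[name] = sum_abs_pct_re(conc, resp, a, b)

# The weighting with the smallest summed |%RE| is preferred.
best = min(scores, key=scores.get)
print(scores, best)
```

A quadratic model could be scored the same way by back-calculating concentrations from the fitted curve before summing the relative errors.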

Fluctuations and Time Series Forecasting of Sea Surface Temperature at Yeosu Coast in Korea (여수연안 표면수온의 변동 특성과 시계열적 예측)

  • Seong, Ki-Tack;Choi, Yang-Ho;Koo, Jun Ho;Jeon, Sang-Back
    • Journal of the Korean Society for Marine Environment & Energy / v.17 no.2 / pp.122-130 / 2014
  • Seasonal variations and long-term linear trends of SST (sea surface temperature) at Yeosu Coast ($127^{\circ}37.73^{\prime}E$, $34^{\circ}37.60^{\prime}N$) in Korea were studied by applying harmonic analysis and regression analysis to 46 years (1965-2010) of monthly mean SST data collected by the Fisheries Research and Development Institute in Korea. The mean SST and the amplitude of the annual SST variation are $15.6^{\circ}C$ and $9.0^{\circ}C$, respectively. The phase of the annual SST variation is $236^{\circ}$. The maximum SST at Yeosu Coast occurs around August 26. The annual mean SST has shown a significant increasing trend, at a rate of $0.0305^{\circ}C/Year$. The warming trend in the most recent 30 years (1981-2010) is more pronounced than that in the earlier 30-year period (1966-1995), and the increasing tendency of winter SST dominates that of the annual SST. A time series model for forecasting the SST on a monthly basis was developed using the Box-Jenkins methodology. $ARIMA(1,0,0)(2,1,0)_{12}$ was suggested for forecasting the monthly mean SST at Yeosu Coast. The mean absolute percentage error, used to measure the accuracy of the forecasted values, was 8.3%.
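As a rough illustration of the harmonic analysis mentioned in this abstract, the sketch below recovers the mean, amplitude, and phase of an annual cycle from monthly values. It assumes the model y = mean + A*cos(w*t - phase), with t counted in months from the start of the record; the paper's exact formulation and time origin are not given here, so the mapping from phase to a calendar date is not reproduced. The series is synthetic, generated from the mean, amplitude, and phase quoted in the abstract, and the function name `annual_harmonic` is hypothetical.

```python
import math


def annual_harmonic(monthly):
    """Fit y = mean + A*cos(w*t - phase) to whole years of monthly values.

    Returns (mean, amplitude, phase in degrees). With complete years the
    cosine and sine terms are orthogonal, so simple projections suffice.
    """
    n = len(monthly)
    if n == 0 or n % 12 != 0:
        raise ValueError("expects a whole number of years of monthly data")
    w = 2.0 * math.pi / 12.0
    mean = sum(monthly) / n
    a = 2.0 / n * sum(y * math.cos(w * t) for t, y in enumerate(monthly))
    b = 2.0 / n * sum(y * math.sin(w * t) for t, y in enumerate(monthly))
    amplitude = math.hypot(a, b)
    phase_deg = math.degrees(math.atan2(b, a)) % 360.0
    return mean, amplitude, phase_deg


# Synthetic 46-year series built from the values quoted in the abstract
# (an illustration only, not the observed data).
true_phase = math.radians(236.0)
series = [
    15.6 + 9.0 * math.cos(2.0 * math.pi / 12.0 * t - true_phase)
    for t in range(12 * 46)
]

mean, amp, phase = annual_harmonic(series)
print(round(mean, 3), round(amp, 3), round(phase, 3))
```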

An Experimental Study for the Hydraulic Characteristics of Vertical lift Gates with Sediment Transport (퇴적토 배출을 수반한 연직수문의 수리특성에 관한 실험적 연구)

  • Choi, Seung Jea;Lee, Ji Haeng;Choi, Heung Sik
    • Ecology and Resilient Infrastructure / v.5 no.4 / pp.246-256 / 2018
  • To analyze the hydraulic characteristics (discharge coefficient, hydraulic jump height, and hydraulic jump length) of an under-flow type vertical lift gate accompanied by sediment transport, a hydraulic model experiment and dimensional analysis were performed. The correlations between the Froude number and the hydraulic characteristics were schematized according to the presence or absence of sediment transport; the correlation between the hydraulic characteristics and non-dimensional parameters was analyzed, and multiple regression formulae were developed. Because the hydraulic characteristics with sediment transport behaved differently from those without it, we verified that it is necessary to introduce variables that can express the characteristics of sediment transport. Multiple regression equations were suggested, and the coefficients of determination were high: 0.749 for discharge coefficient, 0.896 for hydraulic jump height, and 0.955 for hydraulic jump length. To evaluate the applicability of the developed equations, a 95% prediction interval analysis was conducted on the measured values and the values calculated by the regression equations, and NSE (Nash-Sutcliffe efficiency), RMSE (root mean square error), and MAPE (mean absolute percentage error) were determined to be appropriate for the accuracy analysis of the predicted discharge coefficient, hydraulic jump height, and hydraulic jump length.
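The three accuracy measures named in this abstract have standard definitions, sketched below in Python. The measured and calculated values are hypothetical placeholders, not data from the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical measured vs. regression-calculated discharge coefficients.
measured = [0.60, 0.62, 0.65, 0.70, 0.74]
calculated = [0.61, 0.61, 0.66, 0.69, 0.75]

print(rmse(measured, calculated), mape(measured, calculated), nse(measured, calculated))
```

NSE is unitless, RMSE carries the units of the quantity, and MAPE is scale-free, which is one reason the three are often reported together.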

Estimating Stem Volume Table of Quercus Acutissima in South Korea using Variable Exponent Equation (변량지수식을 이용한 전국 상수리나무의 입목수간재적표 추정)

  • Ko, Chi-Ung;Kim, Dong-Geun;Kang, Jin-Taek
    • Journal of Korean Society of Forest Science / v.108 no.3 / pp.357-363 / 2019
  • This study was conducted to develop a stem volume table for Quercus acutissima in Korea by using Kozak's stem taper equation. In total, 2700 tree samples were collected around the country, and growth performance was investigated by compiling data on diameters by stem height and by stem analysis. To test the fit of the stem taper equation, the fitness index (FI), bias, and mean absolute deviation (MAD) were analyzed. The FI of the equation was estimated at 97%, the bias at 0.017, and the MAD at 1.118. Furthermore, there was a statistically significant volume difference between the current volume table and the new volume table (p = 0.0008, <0.005). The result indicates that using the new volume table, which reflects the actual forest, will reduce the loss when assessing wood resources and will improve the accuracy of forest statistics for national and local governments. The stem volume table, the main result of this research, is based on the estimated stem taper equation; it will provide growth information for Quercus acutissima, one of the main broadleaf species in Korea, and will function as a management indicator for rational forest management.

A Deep Learning Method for Cost-Effective Feed Weight Prediction of Automatic Feeder for Companion Animals (반려동물용 자동 사료급식기의 비용효율적 사료 중량 예측을 위한 딥러닝 방법)

  • Kim, Hoejung;Jeon, Yejin;Yi, Seunghyun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.263-278 / 2022
  • With the recent advent of IoT technology, automatic pet feeders are being distributed so that owners can feed their companion animals while they are out. However, when a scale is used to measure feed weight, which is important in automatic feeding, it can easily be damaged or broken by the behavior of pets. The 3D camera method has the disadvantage of high cost, and the 2D camera method has relatively poor accuracy compared with the 3D camera method. Hence, the purpose of this study is to propose a deep learning approach that can accurately estimate weight while using only a 2D camera. For this, various convolutional neural networks were used, and among them, the ResNet101-based model showed the best performance: an average absolute error of 3.06 grams and an average absolute ratio error of 3.40%, a level that could be used commercially in terms of technical and financial viability. The result of this study can be useful for practitioners who want to predict the weight of a standardized object such as feed from a simple 2D image alone.

Comparison and analysis of data-derived stage prediction models (자료 지향형 수위예측 모형의 비교 분석)

  • Choi, Seung-Yong;Han, Kun-Yeun;Choi, Hyun-Gu
    • Journal of Wetlands Research / v.13 no.3 / pp.547-565 / 2011
  • Different types of schemes, involving conceptual and physical models, have been used in stage prediction. Nevertheless, none of these schemes can be considered a single superior model. To overcome the disadvantages of existing physics-based rainfall-runoff models for stage prediction, which stem from the complexity of the hydrological process, data-derived models have recently been widely adopted for predicting flood stage. The objective of this study is to evaluate the stage prediction performance of two of these data-derived methods: the Neuro-Fuzzy and regression analysis stage prediction models. The proposed models are applied to the Wangsukcheon in the Han river watershed. To evaluate the performance of the proposed models, four statistical indices were used, namely root mean square error (RMSE), Nash-Sutcliffe efficiency coefficient (NSEC), mean absolute error (MAE), and adjusted coefficient of determination ($R^{*2}$). The results show that the Neuro-Fuzzy stage prediction model can carry out river flood stage prediction more accurately than the regression analysis stage prediction model. This study can greatly contribute to the construction of a high-accuracy flood information system that secures lead time in medium and small streams.

Forecasts of the BDI in 2010 -Using the ARIMA-Type Models and HP Filtering (2010년 BDI의 예측 -ARIMA모형과 HP기법을 이용하여)

  • Mo, Soo-Won
    • Journal of Korea Port Economic Association / v.26 no.1 / pp.222-233 / 2010
  • This paper aims at predicting the BDI from Jan. to Dec. 2010 using univariate time series econometric techniques, namely stochastic ARIMA-type models and the Hodrick-Prescott filtering technique. A multivariate cause-effect econometric model is not employed because it does not assure a higher degree of forecasting accuracy than a univariate model. Such a cause-effect econometric model also fails to adjust itself for the post-sample period. This article introduces two ARIMA models and five Intervention-ARIMA models. The monthly data cover the period January 2000 through December 2009. The out-of-sample forecasting performance is compared between the ARIMA-type models and the random walk model. Forecasting performance is measured by three summary statistics: root mean squared error (RMSE), mean absolute error (MAE), and mean error (ME). The RMSE and MAE indicate that the ARIMA-type models outperform the random walk model, and the mean errors for all models are small in magnitude relative to the MAEs, indicating that none of the models tends to overpredict or underpredict systematically. The pessimistic ex-ante forecast for the end of 2010 is 2,820, compared with the optimistic forecast of 4,230.
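The reasoning about bias in this abstract (a mean error that is small relative to the MAE suggests no systematic over- or underprediction) can be illustrated with a short Python sketch. It uses a one-step-ahead random walk forecast as the baseline. The index values are hypothetical, the sign convention for the error (actual minus forecast) is an assumption, and the function names are illustrative only.

```python
def random_walk_forecasts(series):
    """One-step-ahead naive forecasts: the forecast for period t is the value at t-1."""
    return series[:-1]


def mean_error(actual, forecast):
    """Mean error (signed); assumed convention: actual minus forecast."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)


def mean_abs_error(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly index values (not the BDI data used in the paper).
index = [3000.0, 3100.0, 2950.0, 3200.0, 3150.0, 3300.0]

actual = index[1:]
naive = random_walk_forecasts(index)

me = mean_error(actual, naive)
mae = mean_abs_error(actual, naive)

# A ratio near 0 means positive and negative errors largely cancel (little bias);
# a ratio near 1 means the errors almost all share one sign.
bias_ratio = abs(me) / mae
print(me, mae, bias_ratio)
```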

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information has been recognized in the information society, the usage and collection of information have become important. Like an artistic painting, a facial expression contains a great deal of information and can be described in thousands of words. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions through facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In the academic area, a number of conventional methods such as Multiple Regression Analysis (MRA) and Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable, since MRA can only explain linear relationships between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies, such as Jung and Kim (2012), have used ANN as an alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized because of overfitting and the difficulty of network design (e.g., setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) in order to increase prediction accuracy. SVR is an extended version of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction.
Using SVR, we tried to build a model that can measure the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions to appropriate visual stimuli and extracted features from the data. Next, preprocessing steps were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the '${\varepsilon}$-insensitive loss function' and the 'grid search' technique to find the optimal values of parameters such as C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared with MRA and ANN. Regardless of the target variable (the level of arousal, or the level of positive/negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
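The ${\varepsilon}$-insensitive loss referred to in this abstract has a simple standard form: errors smaller than the threshold cost nothing, and larger errors are penalized linearly by the excess. The sketch below shows that loss only, not the authors' SVR model or grid search; the sample values and function names are hypothetical.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss: max(0, |y_true - y_pred| - eps)."""
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_eps_loss(targets, predictions, eps):
    """Average epsilon-insensitive loss over a data set."""
    losses = [eps_insensitive_loss(t, p, eps) for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)


# Hypothetical arousal scores and model predictions.
targets = [0.20, 0.50, 0.80]
predictions = [0.25, 0.90, 0.78]

# With eps = 0.1 only the second point lies outside the tube and contributes loss.
print(mean_eps_loss(targets, predictions, eps=0.1))
```

Points that fall inside the tube contribute nothing to the cost, which is why the fitted SVR model depends only on a subset of the training data.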