• Title/Summary/Keyword: Predictive Validation

In-depth exploration of machine learning algorithms for predicting sidewall displacement in underground caverns

  • Hanan Samadi;Abed Alanazi;Sabih Hashim Muhodir;Shtwai Alsubai;Abdullah Alqahtani;Mehrez Marzougui
    • Geomechanics and Engineering
    • /
    • v.37 no.4
    • /
    • pp.307-321
    • /
    • 2024
  • This paper presents a critical assessment of predicting sidewall displacement in underground caverns through the application of nine distinct machine learning techniques. Accurate prediction of sidewall displacement is essential for ensuring the structural safety and stability of underground caverns, which are prone to various geological challenges. The dataset comprises 310 data points, each containing 13 relevant parameters extracted from 10 underground cavern projects located in Iran and other regions. The study employs a diverse array of machine learning models, including recurrent neural network, back-propagation neural network, K-nearest neighbors, normalized and ordinary radial basis function, support vector machine, weight estimation, feed-forward stepwise regression, and fuzzy inference system, which are used to develop predictive models for sidewall displacement. The training phase uses 80% of the dataset (248 data points), while the remaining 20% (62 data points) are reserved for testing and validation. The findings highlight the back-propagation neural network (BPNN) model as the most effective, with a remarkably high coefficient of determination (R2 = 0.99) and a low error rate (RMSE = 4.27E-05), indicating its superior performance in predicting sidewall displacement. This research contributes valuable insights into the application of machine learning techniques for enhancing the safety and stability of underground structures.
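
A minimal Python sketch of the evaluation protocol this abstract describes: an 80/20 split plus a back-propagation network scored by R2 and RMSE. The synthetic data, network size, and the scikit-learn stand-in for the paper's BPNN are all illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(310, 13))            # 310 cases, 13 parameters (synthetic)
y = X @ rng.normal(size=13) + 0.01 * rng.normal(size=310)

# 80% training (248 points), 20% testing (62 points), as in the abstract
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                     random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print(f"R2   = {r2_score(y_te, pred):.3f}")
print(f"RMSE = {mean_squared_error(y_te, pred) ** 0.5:.3e}")
```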

Application of near-infrared spectroscopy for hay evaluation at different degrees of sample preparation

  • Eun Chan Jeong;Kun Jun Han;Farhad Ahmadi;Yan Fen Li;Li Li Wang;Young Sang Yu;Jong Geun Kim
    • Animal Bioscience
    • /
    • v.37 no.7
    • /
    • pp.1196-1203
    • /
    • 2024
  • Objective: A study was conducted to quantify differences in the performance of near-infrared spectroscopy (NIRS) calibration models developed with different degrees of hay sample preparation. Methods: A total of 227 imported alfalfa (Medicago sativa L.) and 360 imported timothy (Phleum pratense L.) hay samples were used to develop calibration models for nutrient value parameters such as moisture, neutral detergent fiber, acid detergent fiber, crude protein, and in vitro dry matter digestibility. Spectral data of hay samples, prepared either by milling to 1-mm particle size or left unground, were separately regressed against the wet chemistry results for the above parameters. Results: The performance of the developed NIRS calibration models was evaluated based on the coefficient of determination (R2), standard error, and ratio percentage deviation (RPD). By these indexes, the models developed with ground hay were more robust and accurate than those with unground hay. Although the R2 of the calibration models was mostly greater than 0.90 across the feed value indexes, the R2 of cross-validation was much lower, ranging from 0.61 to 0.81 in alfalfa and from 0.62 to 0.95 in timothy depending on the feed value index. Estimation of feed values in imported hay is achievable with the calibrated NIRS; however, the calibration models must be improved by including a broader range of imported hay samples in the modeling. Conclusion: Although the analysis accuracy of NIRS was substantially higher when calibration models were developed with ground samples, less sample preparation is advantageous for rapid delivery of hay analysis results. Further research is therefore warranted to investigate the level of sample preparation that does not compromise NIRS analysis accuracy.
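
For readers unfamiliar with the diagnostics named here, a small hedged sketch of how R2, standard error of prediction, and RPD (the standard deviation of the reference values divided by the standard error) are computed from wet-chemistry reference values and NIRS predictions; the numbers are placeholders, not the study's data:

```python
import numpy as np

def calibration_metrics(reference, predicted):
    """Return R2, standard error of prediction (SEP), and RPD = SD(ref) / SEP."""
    residuals = predicted - reference
    sep = float(np.sqrt(np.mean(residuals ** 2)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((reference - reference.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    rpd = float(np.std(reference, ddof=1)) / sep   # higher RPD = more robust model
    return r2, sep, rpd

ref = np.array([17.2, 18.9, 20.1, 16.4, 19.5, 21.0])   # e.g., crude protein %, wet chemistry
prd = np.array([17.0, 19.2, 19.8, 16.9, 19.1, 20.6])   # NIRS predictions (invented)
print(calibration_metrics(ref, prd))
```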

Deep learning-based AI constitutive modeling for sandstone and mudstone under cyclic loading conditions

  • Luyuan Wu;Meng Li;Jianwei Zhang;Zifa Wang;Xiaohui Yang;Hanliang Bian
    • Geomechanics and Engineering
    • /
    • v.37 no.1
    • /
    • pp.49-64
    • /
    • 2024
  • Rocks undergoing repeated loading and unloading over an extended period, for example due to earthquakes, human excavation, and blasting, may gradually accumulate stress and deformation within the rock mass, eventually reaching an unstable state. In this study, a CNN-based constitutive model (CNN-CCM) is proposed to capture this mechanical behavior. The structure and hyperparameters of CNN-CCM include Conv2D layers × 5; MaxPooling2D layers × 4; Dense layers × 4; learning rate = 0.001; epochs = 50; batch size = 64; dropout = 0.5. Training and validation data for deep learning comprise 71 rock samples and 122,152 data points. The AI Rock Constitutive Model learned by CNN-CCM can predict strain values (ε1) using mass (M), axial stress (σ1), density (ρ), cyclic number (N), confining pressure (σ3), and Young's modulus (E). Five evaluation indicators, R2, MAPE, RMSE, MSE, and MAE, yield respective values of 0.929, 16.44%, 0.954, 0.913, and 0.542, illustrating the good predictive performance and generalization ability of the model. Finally, interpreting the AI Rock Constitutive Model using the SHAP explanation method reveals that feature importance follows the order N > M > σ1 > E > ρ > σ3. Positive SHAP values indicate positive effects on predicting strain ε1 for N, M, σ1, and σ3, while negative SHAP values have negative effects. For E, a positive value has a negative effect on predicting strain ε1, consistent with the influence patterns of conventional physical rock constitutive equations. The present study offers a novel approach to investigating the mechanical constitutive model of rocks under cyclic loading and unloading conditions.
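
A hedged Keras sketch of an architecture matching the hyperparameters the abstract enumerates (Conv2D × 5, MaxPooling2D × 4, Dense × 4, dropout 0.5, learning rate 0.001, 50 epochs, batch size 64). The input shape, filter counts, kernel sizes, and layer widths are assumptions, as is how the six physical features (M, σ1, ρ, N, σ3, E) would be encoded into a 2D input:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_ccm(input_shape=(64, 64, 1)):
    m = models.Sequential()
    m.add(layers.Input(shape=input_shape))
    for filters in (16, 32, 64, 128):        # 4 Conv2D/MaxPooling2D pairs
        m.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        m.add(layers.MaxPooling2D(2))
    m.add(layers.Conv2D(128, 3, padding="same", activation="relu"))  # 5th Conv2D
    m.add(layers.Flatten())
    for units in (256, 128, 64):             # first 3 of 4 Dense layers
        m.add(layers.Dense(units, activation="relu"))
        m.add(layers.Dropout(0.5))           # dropout 0.5, as stated
    m.add(layers.Dense(1))                   # 4th Dense: predicted strain epsilon_1
    m.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
    return m

model = build_cnn_ccm()
# model.fit(X_train, y_train, epochs=50, batch_size=64, validation_data=(X_val, y_val))
```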

Comparison of regression model and LSTM-RNN model in predicting deterioration of prestressed concrete box girder bridges

  • Gao Jing;Lin Ruiying;Zhang Yao
    • Structural Engineering and Mechanics
    • /
    • v.91 no.1
    • /
    • pp.39-47
    • /
    • 2024
  • Bridge deterioration reflects the change in bridge condition during operation, and predicting it is important for implementing preventive protection and planning future maintenance. In practical application, however, the raw inspection data of bridges are not continuous, which strongly affects the accuracy of prediction results. Therefore, two kinds of bridge deterioration models are established in this paper. The first is based on traditional regression theory, combined with distribution fitting to preprocess the data, which addresses the irregular distribution and incomplete quantity of the raw data. The second is based on the Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN); the network is trained on the raw inspection data and predicts future bridge deterioration from historical records. The inspection data of 60 prestressed concrete box girder bridges in Xiamen, China are used for validation and comparative analysis, and the results show that both deterioration models can predict the deterioration of prestressed concrete box girder bridges. The regression model indicates that the bridges deteriorate gradually, while the LSTM-RNN model indicates that the bridges remain in good condition during the first 5 years and degrade rapidly from 5 to 15 years. Based on the current inspection database, the LSTM-RNN model performs better than the regression model because it has a smaller prediction error. As the database continues to improve, the results of this study can be extended to other bridge types, or other degradation factors can be introduced, to improve the accuracy and usefulness of the deterioration model.
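
As a rough illustration of the LSTM-RNN side of this comparison, a minimal sketch that windows a historical condition-rating series and trains an LSTM to forecast the next rating. The window length, network size, and synthetic series are placeholders, not the Xiamen inspection data:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def make_windows(series, window=5):
    """Slice a 1D condition-rating series into (samples, window, 1) inputs."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X[..., np.newaxis], y

# fake 30-year condition history: slow decline plus noise
ratings = np.linspace(100, 60, 30) + np.random.default_rng(0).normal(0, 1, 30)
X, y = make_windows(ratings)

model = models.Sequential([
    layers.Input(shape=(5, 1)),
    layers.LSTM(16),
    layers.Dense(1),               # next-year condition rating
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=100, verbose=0)
print(model.predict(X[-1:], verbose=0))   # forecast from the latest window
```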

Development and Evaluation of Validity of Dish Frequency Questionnaire (DFQ) and Short DFQ Using Na Index for Estimation of Habitual Sodium Intake (나트륨 섭취량 추정을 위한 음식섭취빈도조사지와 Na Index를 이용한 간이음식섭취빈도조사지의 개발 및 타당성 검증에 관한 연구)

  • Son, Sook-Mee;Huh, Gwui-Yeop;Lee, Hong-Sup
    • Korean Journal of Community Nutrition
    • /
    • v.10 no.5
    • /
    • pp.677-692
    • /
    • 2005
  • The assessment of sodium intake is complex because of the variety and nature of dietary sodium. This study intended to develop a dish frequency questionnaire (DFQ) for estimating habitual sodium intake and a short DFQ for screening subjects with high or low sodium intake. For DFQ112, one hundred and twelve dish items were selected based on the sodium content of one serving and consumption frequency. Frequency of consumption was determined through nine categories, ranging from more than 3 times a day to almost never, indicating how often the specified amount of each food item was consumed during the past 6 months. One hundred seventy-one adults (male: 78, female: 93) who visited a hypertension or health examination clinic participated in the validation study. DFQ55 was developed from DFQ112 by omitting food items not frequently consumed and selecting dish items with higher sodium content per portion and higher consumption frequency. To develop short DFQs for classifying subjects with low or high sodium intake, a weighted score according to the sodium content of one portion was given to each dish item of DFQ25 or DFQ14 and multiplied by the consumption frequency score. The sum over all dish items formed an index called the sodium index (Na index). For the validation study, the DFQ112, a 2-day diet record, and one 24-hour urine collection were analyzed to estimate sodium intake. Sodium intakes estimated with DFQ112 and 24-h urine analysis showed 65% agreement in classification into the same quartile and a significant correlation (r = 0.563, p < 0.05). However, the sodium intake estimated with DFQ112 (male: 6221.9 mg, female: 6127.6 mg) differed substantially from that of 24-h urine analysis (male: 4556.9 mg, female: 5107.4 mg). The sodium intake estimated with DFQ55 (male: 4848.5 mg, female: 4884.3 mg) showed a small difference from the 24-h urine estimate, a higher proportion classified into the same quartile, and higher correlations with the sodium intake estimated by 24-h urine analysis and with systolic blood pressure. It appears DFQ55 can be used as a tool for quantitative estimation of sodium intake. Na index25 and Na index14 showed 39~50% agreement in classification into the same quartile, substantial correlations with the sodium intake estimated with DFQ55, and significant correlations with the sodium intake estimated with 24-h urine analysis. When a score of 119 for Na index25 was used as the criterion of low sodium intake, sensitivity, specificity, and positive predictive value were 62.5%, 81.8%, and 53.2%, respectively. When a score of 102 for Na index14 was used as the criterion of high sodium intake, sensitivity, specificity, and positive predictive value were 73.8%, 84.0%, and 62.0%, respectively. The short DFQs using Na index14 or Na index25 appear to be simple, easy, and appropriate instruments for classifying low or high sodium intake groups.
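
A hedged sketch of the Na index construction as the abstract describes it: each dish's consumption-frequency score is multiplied by a weighted score reflecting the sodium content of one portion, and the products are summed. The weights, frequency scores, and dish items below are invented for illustration; only the cut-off of 102 is taken from the abstract:

```python
def na_index(frequency_scores, sodium_weights):
    """Sum of (consumption-frequency score x sodium-content weight) over dish items."""
    return sum(f * w for f, w in zip(frequency_scores, sodium_weights))

# e.g., 5 hypothetical dish items from a short DFQ
freq = [4, 2, 6, 1, 3]          # frequency-category scores (invented)
wts  = [3, 5, 2, 4, 1]          # weights from sodium per portion (invented)

score = na_index(freq, wts)
print(score, "-> high-intake group" if score >= 102 else "-> not high-intake")
```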

A Validation Study for the Practical Use of Screening Scale for Potential Drug-use Adolescents(SPDA) (청소년 약물사용 잠재군 선별척도(SPDA) 활용을 위한 타당화 연구)

  • Lee, Ki-Young;Kim, Young-Mi;Im, Hyuk;Park, Mi-Jin;Park, Sun-Hee
    • Korean Journal of Social Welfare
    • /
    • v.57 no.3
    • /
    • pp.305-335
    • /
    • 2005
  • This paper reports a validation study for the SPDA (Screening Scale for Potential Drug-use Adolescents), created in 2003 and newly developed during 2004. The SPDA aims to screen adolescents in the early stage of drug use and to help practitioners take a preventive approach with them. 4,307 junior and senior high school students were selected as primary research subjects by stratified and quota sampling methods. 305 adolescents on probation were also selected as a comparison group and asked to answer the same questionnaire. Reliability for the SPDA was 0.914, an improvement over the previous year's 0.898. Exploratory and confirmatory factor analyses to test construct validity showed that the SPDA could be divided into 7 factors and that each factor structure was a proper measurement model with a high level of fit and high factor loadings. Discriminant analysis to test predictive validity confirmed that the SPDA could classify adolescents well by frequency of drug use, with a hit ratio of 86.6 percent (78.8% and 87.4% for junior and senior high school students, respectively). For the concurrent validity test, the Hare Home Self-Esteem Scale, Hare School Self-Esteem Scale, and Zuckerman-Kuhlman Sensation-Seeking Scale were employed, and all three scales had significant Pearson correlations with the SPDA. A known-groups validity test indicated that the SPDA had adequate power to distinguish adolescents on probation from those in school, with a hit ratio of 71.8 percent. The cut-off point for detecting adolescents at high risk of substance use was 77, corresponding approximately to a T score of 55 (0.5 SD), satisfying the sensitivity, specificity, and efficiency criteria.
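
The reported reliability of 0.914 is an internal-consistency figure; a minimal sketch of Cronbach's alpha over an items-by-respondents matrix shows how such a value is typically computed (the response matrix here is random filler, not SPDA data):

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2D array, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()    # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))                    # shared trait
responses = base + rng.normal(scale=0.5, size=(200, 10))   # 10 correlated items
print(round(cronbach_alpha(responses), 3))
```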

Mathematical Transformation Influencing Accuracy of Near Infrared Spectroscopy (NIRS) Calibrations for the Prediction of Chemical Composition and Fermentation Parameters in Corn Silage (수 처리 방법이 근적외선분광법을 이용한 옥수수 사일리지의 화학적 조성분 및 발효품질의 예측 정확성에 미치는 영향)

  • Park, Hyung-Soo;Kim, Ji-Hye;Choi, Ki-Choon;Kim, Hyeon-Seop
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • v.36 no.1
    • /
    • pp.50-57
    • /
    • 2016
  • This study was conducted to determine the effect of mathematical transformation on near infrared spectroscopy (NIRS) calibrations for the prediction of chemical composition and fermentation parameters in corn silage. Corn silage samples (n = 407) were collected from cattle farms and feed companies in Korea between 2014 and 2015. Samples were scanned at 1 nm intervals over the wavelength range of 680~2,500 nm. The optical data were recorded as log 1/Reflectance (log 1/R), with samples scanned in intact fresh condition. The spectral data were regressed against a range of chemical parameters using partial least squares (PLS) multivariate analysis in conjunction with several spectral math treatments to reduce the effect of extraneous noise. The optimum calibrations were selected based on the highest coefficient of determination in cross validation (R2cv) and the lowest standard error of cross validation (SECV). Results revealed that the NIRS method could predict chemical constituents accurately, with R2cv ranging from 0.77 to 0.91. The best mathematical treatment for moisture and crude protein (CP) was first-order derivatives (1, 16, 16 and 1, 4, 4, respectively), whereas the best mathematical treatment for neutral detergent fiber (NDF) and acid detergent fiber (ADF) was 2, 16, 16. The calibration models for fermentation parameters had lower predictive accuracy than those for chemical constituents; however, pH and lactic acid were predicted with considerable accuracy (R2cv of 0.74 to 0.77), with best mathematical treatments of 1, 8, 8 and 2, 16, 16, respectively. These results demonstrate that the NIRS method can predict the chemical composition and fermentation quality of fresh corn silage as a routine analysis for feeding value evaluation and advice to farmers.
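
A hedged sketch of the pipeline this abstract evaluates: a derivative math treatment applied to the spectra, followed by PLS regression scored by cross-validated R2 (R2cv) and SECV. A Savitzky-Golay filter stands in for the WinISI-style derivative/gap/smooth codes, and the spectra and reference values are synthetic:

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
spectra = rng.normal(size=(120, 300)).cumsum(axis=1)      # fake log(1/R) spectra
reference = spectra[:, 150] * 0.02 + rng.normal(0, 0.05, 120)  # fake wet chemistry

# first-derivative math treatment (stand-in for e.g. "1, 4, 4")
deriv = savgol_filter(spectra, window_length=11, polyorder=2, deriv=1, axis=1)

pls = PLSRegression(n_components=8)
pred = cross_val_predict(pls, deriv, reference, cv=10).ravel()
r2_cv = r2_score(reference, pred)
secv = float(np.sqrt(np.mean((reference - pred) ** 2)))
print(f"R2cv = {r2_cv:.2f}, SECV = {secv:.3f}")
```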

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.1-32
    • /
    • 2018
  • In addition to the stakeholders of bankrupt companies, including managers, employees, creditors, and investors, corporate defaults have a ripple effect on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing various corporate default models. As a result, even large 'chaebol' conglomerates went bankrupt. Even afterwards, analysis of past corporate defaults focused on specific variables; when the government restructured immediately after the global financial crisis, it concentrated only on certain main variables such as the debt ratio. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid sudden collapses like the 'Lehman Brothers case' of the global financial crisis. The key variables driving corporate defaults vary over time: comparing the analyses of Beaver (1967, 1968) and Altman (1968) with Deakin's (1972) study confirms that the major factors affecting corporate failure have changed, and Grice (2001) likewise found shifts in the importance of predictive variables through Zmijewski's (1984) and Ohlson's (1980) models. However, past studies rely on static models, and most do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series algorithm that reflects dynamic change. Against the backdrop of the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively. To construct a consistent bankruptcy model across time, we first train a deep learning time series model using the data before the financial crisis (2000~2006). Parameter tuning of the existing models and the deep learning time series algorithm is conducted with validation data including the financial crisis period (2007~2008). As a result, we construct a model that shows patterns similar to the training results and excellent predictive power. After that, each bankruptcy prediction model is restructured by integrating the training and validation data (2000~2008), applying the optimal parameters found in validation. Finally, each corporate default prediction model is evaluated and compared using test data (2009) based on the models trained over the nine years, demonstrating the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis, logit model), it is shown that the deep learning time series model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy used is the same as that of Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups.
The performance of the multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and deep learning time series algorithms are compared. Corporate data carry the limitations of nonlinear variables, multi-collinearity among variables, and lack of data. The logit model handles nonlinearity, the Lasso regression model mitigates the multi-collinearity problem, and the deep learning time series algorithm, using a variable data generation method, compensates for the lack of data. Big data technology is moving from simple human analysis toward automated AI analysis and, ultimately, intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling and delivers stronger predictive power. While governments at home and abroad are working hard, through the Fourth Industrial Revolution, to integrate such systems into the everyday life of their nations and societies, deep learning time series research for the financial industry remains scarce. As an initial study of deep learning time series analysis of corporate defaults, it is hoped that this work will serve as comparative reference material for non-specialists beginning to combine financial data with deep learning time series algorithms.
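
A minimal sketch of the temporal evaluation design described above: train on the pre-crisis window, tune on the window extended through the crisis, and test on the final year. The firm-year features, default labels, and LSTM size are synthetic assumptions, not the paper's variable bundles:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
# 500 firms x 10 years (2000-2009) x 5 financial ratios (all synthetic)
X = rng.normal(size=(500, 10, 5)).astype("float32")
y = (X[:, :7, 0].mean(axis=1) > 0).astype("float32")    # fake default flag

model = models.Sequential([
    layers.Input(shape=(None, 5)),          # variable-length firm-year sequences
    layers.LSTM(16),
    layers.Dense(1, activation="sigmoid"),  # probability of default
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])

model.fit(X[:, :7, :], y, epochs=5, batch_size=32, verbose=0)   # train: 2000-2006
val = model.evaluate(X[:, :9, :], y, verbose=0)                 # tune: +2007-2008
test = model.evaluate(X, y, verbose=0)                          # test: +2009
print(val, test)
```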

Recommendation of Nitrogen Topdressing Rates at Panicle Initiation Stage of Rice Using Canopy Reflectance

  • Nguyen, Hung T.;Lee, Kyu-Jong;Lee, Byun-Woo
    • Journal of Crop Science and Biotechnology
    • /
    • v.11 no.2
    • /
    • pp.141-150
    • /
    • 2008
  • The response of grain yield (GY) and milled-rice protein content (PC) to crop growth status and nitrogen (N) rates at the panicle initiation stage (PIS) is critical information for prescribing the topdress N rate at PIS (Npi) for target GY and PC. Three split-split-plot experiments including various N treatments and rice cultivars were conducted at the Experimental Farm, Seoul National University, Korea in 2003-2005. Shoot N density (SND, g N in shoot per m2) and canopy reflectance were measured before N application at PIS, and GY, PC, and SND were measured at harvest. Data from the first two years (2003-2004) were used for calibrating predictive models for GY, PC, and SND accumulated from PIS to harvest, using SND at PIS and Npi as inputs to multiple stepwise regression. The calibrated models were then used to calculate the N requirement at PIS for each of nine plots, based on the target PC of 6.8% and the values of SND at PIS estimated by the canopy reflectance method in the 2005 experiment. The results showed that SND at PIS in combination with Npi successfully predicted GY, PC, and SND from PIS to harvest in the calibration dataset, with coefficients of determination (R2) of 0.87, 0.73, and 0.82 and relative errors in prediction (REP, %) of 5.5, 4.3, and 21.1%, respectively. The calibrated model equations showed slightly lower performance on the validation dataset (data from 2005), but the REP, ranging from 3.3% for PC to 13.9% for SND accumulated from PIS to harvest, was acceptable. The N rate prescription treatment (PRT) for the target PC of 6.8% reduced the coefficient of variation in PC from 4.6% in the fixed rate treatment (FRT, 3.6 g N per m2) to 2.4%, and the average PC of PRT was 6.78%, very close to the target of 6.8%. In addition, PRT increased GY by 42.1 g per m2 while Npi increased by only 0.63 g N per m2 compared to FRT, resulting in a high agronomic N-use efficiency of 68.8 kg grain per additional kg N. This high efficiency likely resulted from the greater yield response to applied N in the prescribed treatment, because the N rate was prescribed based on the crop growth and N status of each plot.
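
A small hedged sketch of the calibration step: regressing grain yield on SND at PIS and Npi, then reporting R2 and REP (RMSE as a percentage of the observed mean). Plain least squares stands in for the paper's stepwise procedure, and all numbers are synthetic:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
snd = rng.uniform(4, 12, 60)                 # g N in shoot per m2 at PIS (invented)
npi = rng.uniform(0, 8, 60)                  # topdress N rate at PIS (invented)
gy = 300 + 25 * snd + 12 * npi + rng.normal(0, 20, 60)   # fake grain yield response

X = np.column_stack([snd, npi])
fit = LinearRegression().fit(X, gy)
pred = fit.predict(X)

rmse = float(np.sqrt(np.mean((gy - pred) ** 2)))
print(f"R2 = {r2_score(gy, pred):.2f}, REP = {100 * rmse / gy.mean():.1f}%")
```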

3D Displays: Development and Validation of Prediction Function of Object Size Perception as a Function of Depth (3D 디스플레이: 깊이에 따른 대상의 크기지각 예측함수 개발 및 타당화)

  • Shin, Yoon-Ho;Li, Hyung-Chul O.;Kim, Shin-Woo
    • Journal of Broadcast Engineering
    • /
    • v.17 no.2
    • /
    • pp.400-410
    • /
    • 2012
  • In recent years, 3D displays have been used in many media, including 3D movies, TV, mobile phones, and PC games. Although 3D displays provide a more realistic viewing experience than 2D displays, they also carry issues such as visual fatigue and size distortion. Focusing on the latter, we developed a prediction function of perceived object size as a function of object depth in 3D displays. In Experiment 1, subjects observed a 3D square of fixed size presented at varying depths and adjusted a 2D square to match its apparent size. Conversely, in Experiment 2, subjects observed a 2D square of fixed size and adjusted a 3D square presented at varying depths to match it. In both experiments, we found that the perceived size of the 3D square changed linearly with its depth, and the linear relationship between depth and size was identical across the two experiments. The regression function obtained in this research, which predicts perceived object size from object depth, should be very useful in the creation of 3D media content.
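
A minimal sketch of deriving such a linear prediction function: fit the size matches against depth by least squares and read off the function. The depth/size pairs are made-up stand-ins for the matching data:

```python
import numpy as np

depth = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])       # e.g., disparity-defined depth (invented)
matched_size = np.array([4.6, 4.8, 5.0, 5.2, 5.4])  # matched 2D size, deg (invented)

slope, intercept = np.polyfit(depth, matched_size, deg=1)
predict_size = lambda d: slope * d + intercept       # the linear prediction function
print(f"size(d) = {slope:.2f} * d + {intercept:.2f}; size(0.3) = {predict_size(0.3):.2f}")
```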