• Title/Summary/Keyword: cross-validation

Prediction accuracy of incisal points in determining occlusal plane of digital complete dentures

  • Kenta Kashiwazaki;Yuriko Komagamine;Sahaprom Namano;Ji-Man Park;Maiko Iwaki;Shunsuke Minakuchi;Manabu Kanazawa
    • The Journal of Advanced Prosthodontics / v.15 no.6 / pp.281-289 / 2023
  • PURPOSE. This study aimed to predict the positional coordinates of incisor points from the scan data of conventional complete dentures and to verify their accuracy. MATERIALS AND METHODS. Standard triangulated language (STL) data from 100 scanned pairs of complete upper and lower dentures were imported into computer-aided design software, from which the position coordinates of the points corresponding to each landmark of the jaw were obtained. The x, y, and z coordinates of the incisor point (XP, YP, and ZP) were obtained from the maxillary and mandibular landmark coordinates using regression or calculation formulas, and accuracy was verified by determining the deviation between the measured and predicted coordinate values. YP was obtained in two ways, using the hamular-incisive-papilla (HIP) plane and facial measurements. Multiple regression analysis was used to predict ZP. Root mean squared error (RMSE) values were used to verify the accuracy of XP and YP. To verify the accuracy of ZP, the RMSE was obtained after cross-validation using the remaining 30 cases of denture STL data. RESULTS. The RMSE was 2.22 for predicting XP. When predicting YP, the RMSE was 3.18 for the method using the HIP plane and 0.73 for the method using facial measurements. Cross-validation yielded an RMSE of 1.53 for ZP. CONCLUSION. YP and ZP could be predicted from anatomical landmarks of the maxillary and mandibular edentulous jaw, and YP could be predicted with better accuracy when the position of the lower border of the upper lip was added.
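
The accuracy check described in this abstract compares measured and predicted coordinates through the RMSE. A minimal sketch of that computation follows; the function name `rmse` and the sample values are assumptions for illustration and are not taken from the study.

```python
import math

def rmse(measured, predicted):
    """Root mean squared error between paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out coordinates (illustrative values only, not study data).
measured_zp = [10.0, 12.0, 11.0, 9.0]
predicted_zp = [11.0, 11.0, 13.0, 9.0]
print(round(rmse(measured_zp, predicted_zp), 3))
```

In a hold-out design such as the one the abstract appears to describe, the regression would be fitted on one subset of cases and `rmse` evaluated only on the cases that were set aside.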

Application of Handheld Raman Spectroscopy for Pigment Identification of a Hanging Painting at Janggoksa Temple (Maitreya Buddha) (장곡사 미륵불 괘불탱의 채색 재료 분석을 위한 휴대용 라만 분광기의 적용성 연구)

  • LEE Na Ra;YOO Youngmi;KIM Sojin
    • Korean Journal of Heritage: History & Science / v.56 no.4 / pp.216-228 / 2023
  • The purpose of this study is to apply a handheld Raman spectrometer to identify the coloring materials used in a large Buddhist painting (of Maitreya Buddha) at Janggoksa Temple through cross-validation with HH-XRF. An in situ investigation was performed using a digital microscope and HH-XRF analysis to verify the properties of the pigments used in the gwaebul ("large Buddhist painting") via a non-destructive method. However, coloring materials composed of light elements, and mixed or overlaid pigments, are difficult to identify using non-destructive analysis data alone. Unlike in situ investigation, laboratory analysis often requires samples, yet sampling is restricted to small quantities because the object is cultural heritage. Thus, it is necessary to develop a non-destructive in situ method to supplement the HH-XRF data. The large Buddhist painting at Janggoksa Temple was painted mainly in white, red, yellow, green, and blue. Raman spectroscopy provides molecular information, while XRF spectroscopy provides information about the elemental composition of the pigments. The analysis identified various coloring materials: inorganic pigments such as lead white, minium, cinnabar, and orpiment, as well as organic pigments such as gamboge and indigo. It is therefore possible to obtain more information for identifying pigments, including organic pigments and mixed or overlaid pigments, while minimizing sample collection and simplifying the analysis procedure compared with previously used methods. The results of this study will serve as basic data for future analysis of painted cultural heritage through non-destructive in situ methods.

Predicting blast-induced ground vibrations at limestone quarry from artificial neural network optimized by randomized and grid search cross-validation, and comparative analyses with blast vibration predictor models

  • Salman Ihsan;Shahab Saqib;Hafiz Muhammad Awais Rashid;Fawad S. Niazi;Mohsin Usman Qureshi
    • Geomechanics and Engineering / v.35 no.2 / pp.121-133 / 2023
  • The demand for cement and crushed limestone materials has increased manyfold due to the tremendous increase in construction activities in Pakistan during the past few decades. The number of cement production industries has increased correspondingly, and so have the rock-blasting operations at limestone quarry sites. However, the safety procedures warranted at these sites for blast-induced ground vibrations (BIGV) have not been adequately developed and/or implemented. Proper prediction and monitoring of BIGV are necessary to ensure the safety of structures in the vicinity of these quarry sites. In this paper, an attempt has been made to predict BIGV using an artificial neural network (ANN) at three selected limestone quarries in Pakistan. The ANN was developed in Python using Keras with a sequential model and dense layers. The hyperparameters and the number of neurons in each of the layers were optimized using randomized and grid search methods. The input parameters for the model are distance, maximum charge per delay (MCPD), depth of hole, burden, spacing, and number of blast holes, whereas peak particle velocity (PPV) is the only output parameter. A total of 110 blast vibration datasets were recorded from three different limestone quarries. The dataset was divided into 85% for training the network and 15% for testing it. A five-layer ANN with a 6-32-32-256-1 topology was trained with the Rectified Linear Unit (ReLU) activation function and the Adam optimization algorithm, using a learning rate of 0.001 and a batch size of 32. The blast datasets were used to compare the performance of the ANN, multivariate regression analysis (MVRA), and empirical predictors. Performance was evaluated using the coefficient of determination (R2), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and root mean squared error (RMSE) for predicted and measured PPV. To determine the relative influence of each parameter on the PPV, sensitivity analyses were performed for all input parameters. The analyses reveal that the ANN performs better than MVRA and the other empirical predictors, and that distance and MCPD account for 83% of the influence on PPV, while hole depth, number of blast holes, burden, and spacing account for the remaining 17%. This research provides valuable insights into improving safety measures and ensuring the structural integrity of buildings near limestone quarry sites.
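
The five evaluation metrics named in this abstract can be computed from paired measured and predicted values. The sketch below uses their standard textbook definitions; the function name `regression_metrics` and the sample PPV values are hypothetical, and the study's own implementation is not shown in the abstract.

```python
import math

def regression_metrics(measured, predicted):
    """Return R2, MAE, MSE, MAPE (in percent) and RMSE for paired values.

    Assumes every measured value is nonzero (needed for MAPE) and that the
    measured values are not all identical (needed for R2).
    """
    n = len(measured)
    errors = [m - p for m, p in zip(measured, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"R2": r2, "MAE": mae, "MSE": mse, "MAPE": mape, "RMSE": math.sqrt(mse)}

# Hypothetical measured and predicted PPV values (illustrative only).
measured_ppv = [2.0, 4.0, 5.0, 10.0]
predicted_ppv = [2.5, 3.0, 5.0, 9.0]
for name, value in regression_metrics(measured_ppv, predicted_ppv).items():
    print(f"{name}: {value:.4f}")
```

Reporting several metrics together is useful because MAPE weights errors relative to the measured magnitude, whereas MSE and RMSE emphasize large absolute errors.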

A Study on the Prediction of Uniaxial Compressive Strength Classification Using Slurry TBM Data and Random Forest (이수식 TBM 데이터와 랜덤포레스트를 이용한 일축압축강도 분류 예측에 관한 연구)

  • Tae-Ho Kang;Soon-Wook Choi;Chulho Lee;Soo-Ho Chang
    • Tunnel and Underground Space / v.33 no.6 / pp.547-560 / 2023
  • Research on predicting ground classification using machine learning techniques, TBM excavation data, and ground data has been increasing recently. In this study, multi-class prediction of uniaxial compressive strength (UCS) was carried out by applying a random forest model, a decision-tree-based machine learning technique widely used in various fields, to machine data and ground data acquired at three slurry shield TBM sites. For the classification, the data were split into training and test sets at a 7:3 ratio, and a grid search with 5-fold cross-validation was used to select the optimal parameters. The resulting random forest classifier achieved high accuracy, 0.983 on the training set and 0.982 on the test set. However, owing to the imbalance in data distribution between classes, recall was low for class 4. Additional research is judged necessary to increase the amount of measured UCS data acquired at various sites.
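
The contrast this abstract reports, high overall accuracy alongside low recall for a rare class, can be reproduced with a small calculation. The sketch below is illustrative only: the label counts are invented, and `per_class_recall` and `accuracy` are hypothetical helper names.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples / true samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in sorted(support)}

# Invented, imbalanced labels: class 4 has only 4 of 100 samples.
y_true = [1] * 50 + [2] * 30 + [3] * 16 + [4] * 4
# The classifier mislabels three of the four class-4 samples as class 3.
y_pred = [1] * 50 + [2] * 30 + [3] * 16 + [3] * 3 + [4] * 1

print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

Because the rare class contributes so few samples, its errors barely move the overall accuracy, which is why per-class recall is worth reporting separately.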

Ultrafast MRI and T1 and T2 Radiomics for Predicting Invasive Components in Ductal Carcinoma in Situ Diagnosed With Percutaneous Needle Biopsy

  • Min Young Kim;Heera Yoen;Hye Ji;Sang Joon Park;Sun Mi Kim;Wonshik Han;Nariya Cho
    • Korean Journal of Radiology / v.24 no.12 / pp.1190-1199 / 2023
  • Objective: This study aimed to investigate the feasibility of ultrafast magnetic resonance imaging (MRI) and radiomic features derived from breast MRI for predicting the upstaging of ductal carcinoma in situ (DCIS) diagnosed using percutaneous needle biopsy. Materials and Methods: Between August 2018 and June 2020, 95 patients with 98 DCIS lesions who underwent preoperative breast MRI, including an ultrafast sequence, and subsequent surgery were included. Four ultrafast MRI parameters were analyzed: time-to-enhancement, maximum slope (MS), area under the curve for 60 s after enhancement, and time-to-peak enhancement. One hundred and seven radiomic features were extracted for the whole tumor on the first post-contrast T1WI and T2WI using PyRadiomics. Clinicopathological characteristics, ultrafast MRI findings, and radiomic features were compared between the pure DCIS and DCIS with invasion groups. Prediction models, incorporating clinicopathological, ultrafast MRI, and radiomic features, were developed. Receiver operating characteristic curve analysis and area under the curve (AUC) were used to evaluate model performance in distinguishing between the two groups using leave-one-out cross-validation. Results: Thirty-six of the 98 lesions (36.7%) were confirmed to have invasive components after surgery. Compared to the pure DCIS group, the DCIS with invasion group had a higher nuclear grade (P < 0.001), larger mean lesion size (P = 0.038), larger mean MS (P = 0.002), and different radiomic-related characteristics, including a more extensive tumor volume; higher maximum gray-level intensity; coarser, more complex, and heterogeneous texture; and a greater concentration of high gray-level intensity. No significant differences in AUCs were found between the model incorporating nuclear grade and lesion size (0.687) and the models integrating additional ultrafast MRI and radiomic features (0.680-0.732). Conclusion: High nuclear grade, larger lesion size, larger MS, and multiple radiomic features were associated with DCIS upstaging. However, the addition of MS and radiomic features to the prediction model did not significantly improve the prediction performance.
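
Leave-one-out cross-validation followed by an AUC computation, as named in this abstract, can be sketched as follows. This is not the study's model: the scorer is a deliberately simple one-feature rule (distance to the class means) chosen as a stand-in, and the feature values and labels are invented.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def loocv_scores(features, labels):
    """Score each sample with a rule fitted on all other samples.

    Stand-in rule: how much closer the held-out value lies to the positive-class
    mean than to the negative-class mean, both computed without that sample.
    """
    scores = []
    for i, x in enumerate(features):
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        pos = [f for f, l in train if l == 1]
        neg = [f for f, l in train if l == 0]
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        scores.append(abs(x - mean_neg) - abs(x - mean_pos))
    return scores

# Invented single-feature data; label 1 marks the positive group.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 1, 0, 1, 1]
print(round(auc(labels, loocv_scores(features, labels)), 3))
```

Pooling the held-out scores before computing a single AUC is one common way to pair leave-one-out cross-validation with ROC analysis, since each fold contains only one sample.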

A Method for Extracting Equipment Specifications from Plant Documents and Cross-Validation Approach with Similar Equipment Specifications (플랜트 설비 문서로부터 설비사양 추출 및 유사설비 사양 교차 검증 접근법)

  • Jae Hyun Lee;Seungeon Choi;Hyo Won Suh
    • Journal of Korea Society of Industrial Information Systems / v.29 no.2 / pp.55-68 / 2024
  • Plant engineering companies create or refer to requirements documents for each related field, such as plant process, equipment, piping, and instrumentation, in different engineering departments. The process-related requirements document includes not only a description of the process but also the requirements for the equipment or related facilities that will operate it. Because the authors and reviewers of the requirements documents differ, inconsistencies may occur between equipment or part design specifications described in different requirements documents. Ensuring consistency in these specifications can increase the reliability of the overall plant design information. However, the volume of documents and the scattering of requirements for the same equipment and parts across different documents make it challenging for engineers to trace and manage requirements. This paper proposes a method that analyzes requirement sentences and calculates their similarity in order to identify semantically identical sentences. To calculate the similarity of requirement sentences, we propose a named entity recognition method that identifies compound words for the parts and properties that are semantically central to the requirements. A method for calculating the similarity of the identified compound words for parts and properties is also proposed. The proposed method is explained using sentences from practical documents, and experimental results are described.

The Optimization of Ensembles for Bankruptcy Prediction (기업부도 예측 앙상블 모형의 최적화)

  • Myoung Jong Kim;Woo Seob Yun
    • Information Systems Review / v.24 no.1 / pp.39-57 / 2022
  • This paper proposes the GMOPTBoost algorithm to improve the performance of the AdaBoost algorithm for bankruptcy prediction, in which the class imbalance problem is inherent. The AdaBoost algorithm has the advantage of providing a robust learning opportunity for misclassified samples. However, it is limited in addressing the class imbalance problem because the concept of arithmetic mean accuracy is embedded in it. GMOPTBoost can optimize the geometric mean accuracy and effectively address the class imbalance problem by applying Gaussian gradient descent. The samples are constructed in two phases. First, five class-imbalanced datasets are constructed to verify the effect of the class imbalance problem on the performance of the prediction model and the performance improvement achieved by GMOPTBoost. Second, class-balanced data are constructed through data sampling techniques to verify the performance improvement achieved by GMOPTBoost. The main results of 30 rounds of cross-validation analysis are as follows. First, the class imbalance problem degrades the performance of ensembles. Second, GMOPTBoost improves the performance of AdaBoost ensembles trained on imbalanced datasets. Third, data sampling techniques have a positive impact on performance. Finally, GMOPTBoost significantly improves the performance of AdaBoost ensembles trained on balanced datasets.
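
Why a geometric mean accuracy is better suited to imbalanced data than an arithmetic one can be seen from a confusion matrix. The sketch below does not implement GMOPTBoost; it only compares the two measures, reading "arithmetic mean accuracy" as overall accuracy (an assumption) and using invented counts.

```python
import math

def accuracies(tp, fn, tn, fp):
    """Return (overall accuracy, geometric mean of sensitivity and specificity)."""
    sensitivity = tp / (tp + fn)   # accuracy on the positive (e.g. bankrupt) class
    specificity = tn / (tn + fp)   # accuracy on the negative (e.g. healthy) class
    overall = (tp + tn) / (tp + fn + tn + fp)
    geometric = math.sqrt(sensitivity * specificity)
    return overall, geometric

# Invented case: 5 bankrupt and 95 healthy firms, classifier predicts "healthy" for all.
overall, geometric = accuracies(tp=0, fn=5, tn=95, fp=0)
print(overall, geometric)
```

A classifier that ignores the minority class entirely still scores well on overall accuracy, while the geometric mean collapses to zero as soon as either class is never predicted correctly.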

MRI Predictors of Malignant Transformation in Patients with Inverted Papilloma: A Decision Tree Analysis Using Conventional Imaging Features and Histogram Analysis of Apparent Diffusion Coefficients

  • Chong Hyun Suh;Jeong Hyun Lee;Mi Sun Chung;Xiao Quan Xu;Yu Sub Sung;Sae Rom Chung;Young Jun Choi;Jung Hwan Baek
    • Korean Journal of Radiology / v.22 no.5 / pp.751-758 / 2021
  • Objective: Preoperative differentiation between inverted papilloma (IP) and its malignant transformation to squamous cell carcinoma (IP-SCC) is critical for patient management. We aimed to determine the diagnostic accuracy of conventional imaging features and histogram parameters obtained from whole tumor apparent diffusion coefficient (ADC) values to predict IP-SCC in patients with IP, using decision tree analysis. Materials and Methods: In this retrospective study, we analyzed the records of 180 consecutive patients with histopathologically diagnosed IP or IP-SCC who underwent head and neck magnetic resonance imaging, including diffusion-weighted imaging; 62 patients were included in the study. To obtain whole tumor ADC values, the region of interest was placed to cover the entire volume of the tumor. Classification and regression tree analyses were performed to determine the most significant predictors of IP-SCC among multiple covariates. The final tree was selected by cross-validation pruning based on minimal error. Results: Of 62 patients with IP, 21 (34%) had IP-SCC. The decision tree analysis revealed that the loss of convoluted cerebriform pattern and the 20th percentile cutoff of ADC were the most significant predictors of IP-SCC. With these decision trees, the sensitivity, specificity, accuracy, and C-statistics were 86% (18 out of 21; 95% confidence interval [CI], 65-95%), 100% (41 out of 41; 95% CI, 91-100%), 95% (59 out of 62; 95% CI, 87-98%), and 0.966 (95% CI, 0.912-1.000), respectively. Conclusion: Decision tree analysis using conventional imaging features and histogram analysis of whole volume ADC could predict IP-SCC in patients with IP with high diagnostic accuracy.
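
Proportions such as the sensitivity and specificity above are usually reported with a binomial confidence interval. The abstract does not say which interval method was used; the sketch below assumes the Wilson score interval, one common choice, and `wilson_interval` is a hypothetical helper name.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Counts taken from the abstract: sensitivity 18/21, specificity 41/41.
for label, k, n in [("sensitivity", 18, 21), ("specificity", 41, 41)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%}, interval {low:.3f} to {high:.3f}")
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and remains informative when the observed proportion is exactly 100%, as with the specificity here.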

A Study on the Drug Classification Using Machine Learning Techniques (머신러닝 기법을 이용한 약물 분류 방법 연구)

  • Anmol Kumar Singh;Ayush Kumar;Adya Singh;Akashika Anshum;Pradeep Kumar Mallick
    • Advanced Industrial SCIence / v.3 no.2 / pp.8-16 / 2024
  • This paper presents a drug classification system whose goal is to predict the appropriate drug for a patient based on demographic and physiological traits. The dataset consists of attributes such as Age, Sex, BP (Blood Pressure), Cholesterol Level, and Na_to_K (sodium-to-potassium ratio), with the objective of determining the kind of drug being given. The models used in this paper are K-Nearest Neighbors (KNN), Logistic Regression, and Random Forest. GridSearchCV with 5-fold cross-validation was used to fine-tune the hyperparameters, and each model was trained and tested on the dataset. The performance of each model, both with and without hyperparameter tuning, was assessed using accuracy, confusion matrices, and classification reports. The accuracies of the models without GridSearchCV were 0.7, 0.875, and 0.975, and with GridSearchCV were 0.75, 1.0, and 0.975. According to GridSearchCV, Logistic Regression is the most suitable of the three models for drug classification, followed by K-Nearest Neighbors. Na_to_K is also an essential feature in predicting the outcome.
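
The idea behind a grid search with k-fold cross-validation can be sketched without any library. The code below is a standard-library re-implementation of the general technique, not scikit-learn's GridSearchCV and not the paper's code; the one-feature k-nearest-neighbour classifier, the function names, and the data are all assumptions for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority label among the k nearest one-dimensional training points."""
    nearest = sorted(train, key=lambda xy: abs(xy[0] - x))[:k]
    votes = [y for _, y in nearest]
    return max(sorted(set(votes)), key=votes.count)

def cv_accuracy(data, k_neighbors, folds):
    """Mean accuracy over folds, each fold held out once."""
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        correct += sum(knn_predict(train, data[i][0], k_neighbors) == data[i][1] for i in fold)
    return correct / len(data)

def grid_search(data, grid, n_folds=5):
    """Evaluate every candidate on the same folds and return the best one."""
    folds = k_fold_indices(len(data), n_folds)
    scores = {k: cv_accuracy(data, k, folds) for k in grid}
    return max(scores, key=scores.get), scores

# Invented (feature, label) pairs standing in for a single numeric attribute.
data = [(float(x), 0) for x in range(1, 11)] + [(float(x), 1) for x in range(21, 31)]
best_k, scores = grid_search(data, grid=[1, 3, 5])
print(best_k, scores)
```

Reusing the same folds for every candidate keeps the comparison fair, because each hyperparameter value is scored on identical held-out samples.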

Non Destructive Fast Determination of Fatty Acid Composition by Near Infrared Reflectance Spectroscopy in Sesame

  • Kang, Churl-Whan;Kim, Dong-Hwi;Lee, Sung-Woo;Kim, Ki-Jong;Cho, Kyu-Chae;Shim, Kang-Bo
    • KOREAN JOURNAL OF CROP SCIENCE / v.51 no.spc1 / pp.283-291 / 2006
  • A non-destructive, rapid seed analysis technique based on near-infrared reflectance spectroscopy (NIRS) was investigated at the National Institute of Crop Science (NICS) for screening sesame varieties with very high oleic (C18:1) and linoleic (C18:2) acid contents among genetic resources and pedigree lines from cross and mutation breeding. Of 378 landraces and introduced cultivars, 150 were analyzed for fatty acids by NIRS and gas chromatography (GC). By GC, the average contents were 9.64% palmitic acid (C16:0), 4.73% stearic acid (C18:0), 42.26% oleic acid, and 43.38% linoleic acid. Contents ranged from 7.29 to 12.27% for palmitic acid, from 2.39 to 8.88% (a span of 6.49 percentage points) for stearic acid, from 37.36 to 49.95% (a wider span of 12.59 points) for oleic acid, and from 30.60 to 47.40% (the widest span) for linoleic acid. NIRS spectra covered wavelengths from 400 to 2,500 nm, and the varietal distribution of fatty acid contents appeared to follow a normal distribution. The NIRS-estimated range of oleic acid content, which is desirable for food processing and human health, was 14.08 points (38.31 to 52.39%), 1.49 points wider than the GC range. For linoleic acid, the NIRS-estimated range was 16.41 points (30.60 to 47.01%), 0.39 points narrower than the GC range. Varietal differences in oleic and linoleic acid contents measured by NIRS showed trends similar to those measured by GC. For NIRS calibration, partial least squares regression (PLSR), a multivariate regression method, was applied to spectra from 700 to 2,500 nm for oleic and linoleic acids. The squared correlation coefficient (RSQ) for oleic acid content was 0.724, with 72.4% of the sample varieties falling within the standard error of calibration (SEC) of 0.570%, which was considered statistically acceptable for agreement between NIRS and GC. The standard error of cross-validation (SECV) for oleic acid was 0.725%, describing the spread between NIRS and GC values across the sample population. RSQ for linoleic acid content was 0.735, with 73.5% of the sample varieties falling within the SEC of 0.643%. The SECV for linoleic acid was 0.711%, again describing the spread between NIRS and GC values across the sample population. Consequently, NIRS analysis of oleic and linoleic acids was judged a statistically valid substitute for GC analysis, because most samples fell within small SEC and SECV values. To strengthen the statistical reliability of NIRS analysis, sesame germplasm with a wider range of fatty acid contents should continue to be added in order to raise RSQ and reduce SEC and SECV.
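
The difference between a calibration error and a cross-validation error can be illustrated with a much simpler model than the PLSR used in the study. The sketch below swaps in ordinary least squares on a single predictor and treats both errors as root-mean-square residuals with an n denominator; both choices are simplifying assumptions, chemometrics software may define SEC and SECV with different degrees of freedom, and the data are invented.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def calibration_rmse(xs, ys):
    """RMS residual when the model is fitted and evaluated on the same samples."""
    slope, intercept = fit_line(xs, ys)
    return math.sqrt(sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs))

def secv(xs, ys):
    """Leave-one-out RMS residual: each sample is predicted by a fit that excludes it."""
    total = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return math.sqrt(total / len(xs))

# Invented spectral predictor values and reference contents (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(calibration_rmse(xs, ys), secv(xs, ys))
```

The cross-validated error is never smaller than the in-sample error for a least-squares fit, which is why it is the more conservative indicator of how a calibration will perform on new samples.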