• Title/Abstract/Keyword: Validation Set


Comparative Serum Proteomic Analysis of Serum Diagnosis Proteins of Colorectal Cancer Based on Magnetic Bead Separation and MALDI-TOF Mass Spectrometry

  • Deng, Bao-Guo;Yao, Jin-Hua;Liu, Qing-Yin;Feng, Xian-Jun;Liu, Dong;Zhao, Li;Tu, Bin;Yang, Fan
    • Asian Pacific Journal of Cancer Prevention / v.14 no.10 / pp.6069-6075 / 2013
  • Background: At present, the diagnosis of colorectal cancer (CRC) requires a colorectal biopsy, which is an invasive procedure. We undertook this pilot study to develop an alternative method and potential new biomarkers for diagnosis, and validated a set of well-integrated tools called ClinProt to investigate the serum peptidome in CRC patients. Methods: Fasting blood samples were collected from 67 patients diagnosed with CRC by histology, 55 patients diagnosed with colorectal adenoma by biopsy, and 65 healthy volunteers. Samples were randomly divided into a model construction group and an external validation group. The present work focused on serum proteomic analysis of the model construction group using the ClinProt Kit combined with mass spectrometry. This approach allowed construction of a peptide pattern able to differentiate the studied populations. The external validation group was used to blindly verify the diagnostic capability of the peptidome pattern. An immunoassay was used to determine serum CEA in CRC patients and controls. Results: The analysis revealed 59 differential peptide peaks among CRC patients, colorectal adenoma patients, and healthy volunteers. A genetic algorithm was used to set up the classification models. Four of the identified peaks, at m/z 797, 810, 4078, and 5343, were used to construct peptidome patterns, achieving an accuracy of 100% (superior to CEA, P < 0.05). Furthermore, the peptidome patterns could differentiate the validation group with accuracy close to 100%. Conclusions: Our results show that proteomic analysis of serum with MALDI-TOF MS is a fast and reproducible approach that may provide a novel means of screening for CRC.
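
As a rough illustration of the study design described above (a model-construction group and a blind external validation group), the sketch below trains a classifier on a hypothetical matrix of intensities for the four reported peaks at m/z 797, 810, 4078, and 5343. It is not the authors' ClinProt/genetic-algorithm pipeline; the data, variable names, and the stand-in random-forest classifier are all assumptions.

```python
# Hypothetical sketch: peak-intensity classification with an external validation split.
# NOT the ClinProt genetic-algorithm model from the paper; a generic classifier
# stands in for it, and the data here are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Rows = subjects (CRC, adenoma, healthy); columns = intensities of the four
# discriminating peaks reported in the abstract (m/z 797, 810, 4078, 5343).
X = rng.normal(size=(187, 4))                # placeholder intensities
y = rng.integers(0, 3, size=187)             # 0 = CRC, 1 = adenoma, 2 = healthy

# Random division into a model-construction group and an external validation group.
X_model, X_ext, y_model, y_ext = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_model, y_model)

# Blind evaluation on the external validation group.
print("external validation accuracy:", accuracy_score(y_ext, clf.predict(X_ext)))
```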

A Metrics Set for Measuring Software Module Severity (소프트웨어 모듈 심각도 측정을 위한 메트릭 집합)

  • Hong, Euy-Seok
    • Journal of the Korea Society of Computer and Information / v.20 no.1 / pp.197-206 / 2015
  • Defect severity, a measure of the impact caused by a defect, plays an important role in software quality activities because not all software defects are equal. Earlier studies have concentrated on defining defect severity levels, but there have been few attempts to measure module severity. In this paper, we first define a defect severity metric in the form of an exponential function, exploiting the characteristic that defect severity values increase much faster than severity levels. We then define a new metrics set for software module severity using the number of defects in a module and their defect severity metric values. To show the applicability of the proposed metrics, we performed an analytical validation using Weyuker's properties and an experimental validation using NASA open data sets. The results show that ms is very useful for measuring module severity and that msd can be used to compare different systems in terms of module severity.
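
The abstract does not give the exact definitions of the proposed metrics, but a minimal sketch along the lines it describes might look as follows; the base-2 exponential form and the size-normalized msd are illustrative assumptions, not the paper's formulas.

```python
# Hypothetical sketch of an exponential defect-severity metric and module-level
# aggregates. The paper's exact definitions are not given in the abstract; the
# base-2 form and the size normalization below are illustrative assumptions.

def defect_severity(level: int, base: float = 2.0) -> float:
    """Severity value grows exponentially with the ordinal severity level."""
    return base ** level

def module_severity(defect_levels: list[int]) -> float:
    """ms-style metric: sum of the severity values of all defects in a module."""
    return sum(defect_severity(level) for level in defect_levels)

def module_severity_density(defect_levels: list[int], module_size_loc: int) -> float:
    """msd-style metric (assumed here to be size-normalized) for cross-system comparison."""
    return module_severity(defect_levels) / module_size_loc if module_size_loc else 0.0

# Example: a 250-LOC module with one level-1 and two level-3 defects.
levels = [1, 3, 3]
print(module_severity(levels))               # 2 + 8 + 8 = 18
print(module_severity_density(levels, 250))  # 0.072
```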

Raman spectroscopic analysis to detect olive oil mixtures in argan oil

  • Joshi, Rahul;Cho, Byoung-Kwan;Joshi, Ritu;Lohumi, Santosh;Faqeerzada, Mohammad Akbar;Amanah, Hanim Z;Lee, Jayoung;Mo, Changyeun;Lee, Hoonsoo
    • Korean Journal of Agricultural Science / v.46 no.1 / pp.183-194 / 2019
  • Adulteration of argan oil with cheaper oils of similar chemical composition has resulted in increasing demand for authenticity assurance and quality control. Fast and simple analytical techniques are thus needed for authenticity analysis of high-priced argan oil. Raman spectroscopy is a potent technique that has been used extensively for quality control and safety determination of food products. In this study, Raman spectroscopy in combination with a net analyte signal (NAS)-based methodology, i.e., the hybrid linear analysis method developed by Goicoechea and Olivieri in 1999 (HLA/GO), was used to predict different concentrations of olive oil (0 - 20%) added to argan oil. Raman spectra of 90 samples were collected in a spectral range of $400-400cm^{-1}$, and calibration and validation sets were designed to evaluate the performance of the multivariate method. The results revealed a high coefficient of determination ($R^2$) of 0.98 and a low root-mean-square error (RMSE) of 0.41% for the calibration set, and an $R^2$ of 0.97 and an RMSE of 0.36% for the validation set. Additionally, figures of merit such as sensitivity, selectivity, limit of detection, and limit of quantification were used for further validation. The high $R^2$ and low RMSE values validate the detection ability and accuracy of the developed method and demonstrate its potential for quantitative determination of oil adulteration.
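
The HLA/GO model itself is beyond a short sketch, but the reported calibration and validation figures rest on two standard quantities, $R^2$ and RMSE. A minimal sketch of computing them for both sets is shown below; the reference and predicted olive-oil concentrations are placeholders.

```python
# Minimal sketch: R^2 and RMSE for a calibration set and a validation set.
# The arrays are placeholders standing in for reference olive-oil concentrations
# (0-20%) and the concentrations predicted by an HLA/GO-type model.
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error

y_cal_true = np.array([0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)
y_cal_pred = y_cal_true + np.random.default_rng(1).normal(0, 0.4, y_cal_true.size)

y_val_true = np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19], dtype=float)
y_val_pred = y_val_true + np.random.default_rng(2).normal(0, 0.35, y_val_true.size)

def report(name, y_true, y_pred):
    rmse = mean_squared_error(y_true, y_pred) ** 0.5  # root-mean-square error
    print(f"{name}: R^2 = {r2_score(y_true, y_pred):.3f}, RMSE = {rmse:.2f}%")

report("calibration", y_cal_true, y_cal_pred)
report("validation", y_val_true, y_val_pred)
```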

Performance of Prediction Models for Diagnosing Severe Aortic Stenosis Based on Aortic Valve Calcium on Cardiac Computed Tomography: Incorporation of Radiomics and Machine Learning

  • Nam gyu Kang;Young Joo Suh;Kyunghwa Han;Young Jin Kim;Byoung Wook Choi
    • Korean Journal of Radiology / v.22 no.3 / pp.334-343 / 2021
  • Objective: We aimed to develop a prediction model for diagnosing severe aortic stenosis (AS) using computed tomography (CT) radiomics features of aortic valve calcium (AVC) and machine learning (ML) algorithms. Materials and Methods: We retrospectively enrolled 408 patients who underwent cardiac CT between March 2010 and August 2017 and had echocardiographic examinations (240 patients with severe AS on echocardiography [the severe AS group] and 168 patients without severe AS [the non-severe AS group]). Data were divided into a training set (312 patients) and a validation set (96 patients). Using non-contrast-enhanced cardiac CT scans, AVC was segmented, and 128 radiomics features for AVC were extracted. After feature selection was performed with three ML algorithms (least absolute shrinkage and selection operator [LASSO], random forests [RFs], and eXtreme Gradient Boosting [XGBoost]), model classifiers for diagnosing severe AS on echocardiography were developed in combination with three different model classifier methods (logistic regression, RF, and XGBoost). The performance (c-index) of each radiomics prediction model was compared with predictions based on AVC volume and score. Results: The radiomics scores derived from LASSO were significantly different between the severe AS and non-severe AS groups in the validation set (median, 1.563 vs. 0.197, respectively, p < 0.001). A radiomics prediction model based on feature selection by LASSO + model classifier by XGBoost showed the highest c-index of 0.921 (95% confidence interval [CI], 0.869-0.973) in the validation set. Compared to prediction models based on AVC volume and score (c-indexes of 0.894 [95% CI, 0.815-0.948] and 0.899 [95% CI, 0.820-0.951], respectively), eight and three of the nine radiomics prediction models showed higher discrimination abilities for severe AS. However, the differences were not statistically significant (p > 0.05 for all). Conclusion: Models based on the radiomics features of AVC and ML algorithms may perform well for diagnosing severe AS, but the added value compared to AVC volume and score should be investigated further.
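
A minimal sketch of one of the nine feature-selection/classifier combinations described above (LASSO selection followed by an XGBoost classifier, scored by the c-index, which equals the ROC AUC for a binary outcome) is given below. The radiomics matrix is a synthetic placeholder and no hyperparameters are taken from the paper.

```python
# Minimal sketch of one radiomics pipeline from the abstract: LASSO-based feature
# selection followed by an XGBoost classifier, scored with the c-index (AUC).
# The 408 x 128 feature matrix is a synthetic placeholder with an artificial signal.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LassoCV
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(408, 128))                      # 408 patients x 128 AVC radiomics features
signal = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2]     # planted signal so LASSO finds features
y = (signal + rng.normal(0, 1.0, size=408) > 0).astype(int)  # 1 = severe AS on echocardiography

# Training set (312) and validation set (96), as in the abstract.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=96, random_state=0, stratify=y)

scaler = StandardScaler().fit(X_train)
X_train_s, X_val_s = scaler.transform(X_train), scaler.transform(X_val)

# Feature selection: keep features with non-zero LASSO coefficients.
lasso = LassoCV(cv=5, random_state=0).fit(X_train_s, y_train)
selector = SelectFromModel(lasso, prefit=True, threshold=1e-8)
X_train_sel, X_val_sel = selector.transform(X_train_s), selector.transform(X_val_s)

# Model classifier: XGBoost on the selected features.
clf = XGBClassifier(n_estimators=300, max_depth=3, learning_rate=0.05)
clf.fit(X_train_sel, y_train)

# For a binary outcome the c-index equals the area under the ROC curve.
print("validation c-index:", roc_auc_score(y_val, clf.predict_proba(X_val_sel)[:, 1]))
```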

Total Bilirubin Level as a Predictor of Suboptimal Image Quality of the Hepatobiliary Phase of Gadoxetic Acid-Enhanced MRI in Patients with Extrahepatic Bile Duct Cancer

  • Jeong Ah Hwang;Ji Hye Min;Seong Hyun Kim;Seo-Youn Choi;Ji Eun Lee;Ji Yoon Moon
    • Korean Journal of Radiology / v.23 no.4 / pp.389-401 / 2022
  • Objective: This study aimed to determine a factor for predicting suboptimal image quality of the hepatobiliary phase (HBP) of gadoxetic acid-enhanced MRI in patients with extrahepatic bile duct (EHD) cancer before MRI examination. Materials and Methods: We retrospectively evaluated 259 patients (mean age ± standard deviation: 68.0 ± 8.3 years; 162 male and 97 female) with EHD cancer who underwent gadoxetic acid-enhanced MRI between 2011 and 2017. Patients were divided into a primary analysis set (n = 184) and a validation set (n = 75) based on the diagnosis date of January 2014. Two reviewers assigned the functional liver imaging score (FLIS) to reflect the HBP image quality. The FLIS consists of the sum of three HBP features, each scored on a 0-2 scale: liver parenchymal enhancement, biliary excretion, and signal intensity of the portal vein. Patients were classified into low-FLIS (0-3) or high-FLIS (4-6) groups. Multivariable analysis was performed to determine a predictor of low FLIS using serum biochemical and imaging parameters of cholestasis severity. The optimal cutoff value for predicting low FLIS was obtained using receiver operating characteristic analysis, and validation was performed. Results: Of the 259 patients, 140 (54.0%) and 119 (46.0%) were classified into the low-FLIS and high-FLIS groups, respectively. In the primary analysis set, total bilirubin was an independent factor associated with low FLIS (adjusted odds ratio per 1-mg/dL increase, 1.62; 95% confidence interval [CI], 1.32-1.98). The optimal cutoff value of total bilirubin for predicting low FLIS was 2.1 mg/dL with a sensitivity of 95.1% (95% CI: 88.9-98.4) and a specificity of 89.0% (95% CI: 80.2-94.9). In the validation set, the total bilirubin cutoff showed a sensitivity of 92.1% (95% CI: 78.6-98.3) and a specificity of 83.8% (95% CI: 68.0-93.8). Conclusion: Serum total bilirubin before acquisition of gadoxetic acid-enhanced MRI may help predict suboptimal HBP image quality in patients with EHD cancer.
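
A minimal sketch of the two steps described above, FLIS grouping and an ROC-derived total-bilirubin cutoff, is given below. The data are synthetic placeholders, and the Youden-index rule for picking the cutoff is an assumption; the abstract only states that ROC analysis was used.

```python
# Minimal sketch: FLIS grouping and an ROC-derived total-bilirubin cutoff.
# The FLIS components and bilirubin values are synthetic placeholders, and the
# Youden-index rule is an assumed (common) way to pick the "optimal" cutoff.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(7)
n = 184  # primary analysis set size in the abstract

# FLIS = parenchymal enhancement + biliary excretion + portal-vein signal, each scored 0-2.
flis = rng.integers(0, 3, size=(n, 3)).sum(axis=1)
low_flis = (flis <= 3).astype(int)           # low FLIS (0-3) vs. high FLIS (4-6)

# Placeholder total bilirubin (mg/dL), loosely higher in the low-FLIS group.
bilirubin = rng.gamma(2.0, 1.0, size=n) + 2.0 * low_flis

fpr, tpr, thresholds = roc_curve(low_flis, bilirubin)
best = np.argmax(tpr - fpr)                  # Youden's J = sensitivity + specificity - 1
cutoff, sens, spec = thresholds[best], tpr[best], 1 - fpr[best]
print(f"optimal cutoff ~ {cutoff:.1f} mg/dL, sensitivity {sens:.1%}, specificity {spec:.1%}")
```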

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character from sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently because they are probabilistic models based on the frequency of each unit in the training set. Recently, as deep learning algorithms have developed, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used as neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependency between the objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, since a training set of sentences generally includes a huge number of words or morphemes, the dictionary becomes very large, which increases model complexity. In addition, word-level or morpheme-level models can only generate vocabulary contained in the training set. Furthermore, for highly morphological languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to introduce errors in the decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit that composes Korean text. We constructed language models using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm as well as more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was conducted on Old Testament texts using the deep learning package Keras with a Theano backend. After pre-processing, the dataset included 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters and an output of the following 21st character. In total, 1,023,411 input-output pairs were included in the dataset, which we divided into training, validation, and test sets in a 70:15:15 proportion. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss evaluated on the validation set, the perplexity evaluated on the test set, and the training time of each model. As a result, all optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm also took the longest to train for both the 3- and 4-layer LSTM models. On average, the 4-layer LSTM model took 69% longer to train than the 3-layer LSTM model, yet its validation loss and perplexity were not significantly improved and even became worse under specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-layer LSTM model tended to generate sentences closer to natural language than the 3-layer model. Although there were slight differences in the completeness of the generated sentences between the models, sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost grammatically perfect. The results of this study are expected to be widely used for Korean language processing in the fields of language processing and speech recognition, which are the basis of artificial intelligence systems.
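
A minimal sketch of the windowing scheme (20 input characters predicting the 21st) and a 3-layer LSTM language model is shown below. The paper used Keras on a Theano backend; this sketch assumes tf.keras, a placeholder corpus, and the Adam optimizer, and its embedding and layer widths are not the paper's configuration.

```python
# Minimal sketch of the data windowing and a 3-layer LSTM language model.
# Placeholder corpus and layer sizes; not the paper's exact configuration.
# Note: a true phoneme-level model would first decompose Hangul syllables into
# jamo; this placeholder corpus keeps syllables for brevity.
import numpy as np
import tensorflow as tf

text = "이것은 음소 단위 언어모델을 위한 예시 말뭉치입니다. " * 200   # placeholder corpus
vocab = sorted(set(text))                        # character inventory (74 symbols in the paper)
char2idx = {c: i for i, c in enumerate(vocab)}
encoded = np.array([char2idx[c] for c in text])

window = 20                                      # 20 input characters -> the 21st as target
X = np.stack([encoded[i:i + window] for i in range(len(encoded) - window)])
y = encoded[window:]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(vocab), 64),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(len(vocab), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# 70:15:15 split into training, validation, and test sets, as in the abstract.
n = len(X)
n_tr, n_va = int(0.70 * n), int(0.85 * n)
model.fit(X[:n_tr], y[:n_tr],
          validation_data=(X[n_tr:n_va], y[n_tr:n_va]),
          epochs=2, batch_size=128)

test_loss = model.evaluate(X[n_va:], y[n_va:], verbose=0)
print("test perplexity ~", float(np.exp(test_loss)))   # perplexity = exp(cross-entropy)
```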

Engineering-scale Validation Test for the T-H-M Behaviors of a HLW Disposal System (고준위폐기물 처분시스템의 열적-수리적-역학적 거동 규명을 위한 공학적 규모의 실증시험)

  • Lee Jae-Owan;Park Jeong-Hwa;Cho Won-Jin
    • Journal of Nuclear Fuel Cycle and Waste Technology(JNFCWT) / v.4 no.2 / pp.197-207 / 2006
  • The engineering performance of a high-level waste repository depends significantly on the thermal-hydraulic-mechanical (T-H-M) behavior of the engineered barrier system. An engineering-scale test facility (KENTEX) was set up to validate the T-H-M behavior in the buffer of the reference disposal system developed in 2002. The validation tests started on May 31, 2005 and are now in progress. The KENTEX facility and the validation test programme are introduced, and pre-operation calculations are presented to provide information on sensitive sensor locations and operational conditions. This test will provide information (e.g., on large-scale apparatus, sensors, and monitoring systems) needed for in-situ tests, enable validation of a T-H-M model for the T-H-M performance assessment of the reference disposal system, and demonstrate the engineering feasibility of fabricating and emplacing the buffer of a repository.


Prediction of concrete compressive strength using non-destructive test results

  • Erdal, Hamit;Erdal, Mursel;Simsek, Osman;Erdal, Halil Ibrahim
    • Computers and Concrete / v.21 no.4 / pp.407-417 / 2018
  • Concrete, a composite material, is one of the most important construction materials. Compressive strength is a commonly used parameter for the assessment of concrete quality, and its accurate prediction is an important issue. In this study, we utilized an experimental procedure for the assessment of concrete quality. First, the concrete mix was prepared according to C 20 type concrete, and the slump of the fresh concrete was about 20 cm. After placement of the fresh concrete in formworks, compaction was achieved using a vibrating screed. After a 28-day period, a total of 100 core samples with 75 mm diameter were extracted. Pulse velocity determination tests and compressive strength tests were performed on the core samples, and Windsor probe penetration tests and Schmidt hammer tests were also carried out. After setting up the data set, twelve artificial intelligence (AI) models were compared for predicting the concrete compressive strength. These models fall into three categories: (i) functions (i.e., Linear Regression, Simple Linear Regression, Multilayer Perceptron, Support Vector Regression), (ii) lazy-learning algorithms (i.e., IBk Linear NN Search, KStar, Locally Weighted Learning), and (iii) tree-based learning algorithms (i.e., Decision Stump, Model Trees Regression, Random Forest, Random Tree, Reduced Error Pruning Tree). Four validation schemes (10-fold cross-validation, 5-fold cross-validation, 10% split-sample validation, and 20% split-sample validation) were used to examine the performance of the predictive models. This study shows that machine learning regression techniques are promising tools for predicting the compressive strength of concrete.
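
A minimal sketch of the comparison setup is shown below, using scikit-learn analogues of a few of the listed learners under 10-fold cross-validation; the non-destructive test readings and strengths are synthetic placeholders, and the original study's algorithms and tooling (the model names suggest Weka) differ.

```python
# Minimal sketch: comparing a few regression learners with 10-fold cross-validation.
# The feature matrix (pulse velocity, Windsor probe, Schmidt hammer readings) and
# the strengths are synthetic placeholders, not the study's data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))                                  # 100 core samples x 3 NDT readings
y = 20 + 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 1.5, 100)   # placeholder strength (MPa)

models = {
    "Linear Regression": LinearRegression(),
    "Support Vector Regression": SVR(),
    "IBk-like (k-NN)": KNeighborsRegressor(n_neighbors=5),
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
}

for name, model in models.items():
    # 10-fold cross-validation; the negated RMSE score is flipped back to a positive RMSE.
    scores = -cross_val_score(model, X, y, cv=10, scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {scores.mean():.2f} +/- {scores.std():.2f} MPa")
```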

Prediction of Soluble Solid and Firmness in Apple by Reflectance Spectroscopy

  • Park, Chang-Hyun;Judith.A.Abbott
    • Near Infrared Analysis / v.1 no.1 / pp.23-26 / 2000
  • The objectives of this study were to examine the ability to predict soluble solids and firmness in intact apples using the visible/near-infrared reflectance spectroscopic technique. Two cultivars of apples, Delicious and Gala, were handled, tested, and analyzed. Reflectance spectra, Magness-Taylor (MT) firmness, and soluble solids were measured sequentially. Maximum and minimum diameters, height, and weight of the apples were recorded before the MT firmness tests. The apple samples were divided into a calibration set and a validation set, and partial least squares (PLS) analysis was used. A unique set of PLS loading vectors (factors) was developed for soluble solids and firmness. The PLS model showed a good relationship between predicted and measured soluble solids in intact apples in the wavelength range of 860∼1078 nm. However, the PLS analysis was not accurate enough to predict apple firmness.
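
A minimal sketch of PLS calibration and validation for soluble solids from reflectance spectra is given below; the spectra, reference values, and number of latent variables are placeholders rather than the study's data.

```python
# Minimal sketch of PLS calibration for soluble solids from reflectance spectra.
# The spectra and reference values are synthetic placeholders; the number of
# latent variables is not taken from the paper.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(5)
wavelengths = np.linspace(860, 1078, 110)                      # nm, range reported for soluble solids
spectra = rng.normal(size=(120, wavelengths.size))             # 120 apples x reflectance spectrum
brix = 12 + 1.5 * spectra[:, 20] + rng.normal(0, 0.3, 120)     # placeholder soluble solids

# Calibration / validation split, as in the study design.
X_cal, X_val, y_cal, y_val = train_test_split(spectra, brix, test_size=0.3, random_state=0)

pls = PLSRegression(n_components=8)
pls.fit(X_cal, y_cal)

print("calibration R^2:", r2_score(y_cal, pls.predict(X_cal).ravel()))
print("validation  R^2:", r2_score(y_val, pls.predict(X_val).ravel()))
```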

Enhanced MCTS Algorithm for Generating AI Agents in General Video Games (일반적인 비디오 게임의 AI 에이전트 생성을 위한 개선된 MCTS 알고리즘)

  • Oh, Pyeong;Kim, Ji-Min;Kim, Sun-Jeong;Hong, Seokmin
    • The Journal of Information Systems / v.25 no.4 / pp.23-36 / 2016
  • Purpose: Recently, many researchers have paid much attention to the artificial intelligence fields of GVGP and PCG. This paper suggests that an improved MCTS algorithm applied to the framework can generate better AI agents. Design/methodology/approach: MCTS achieves strong performance without prior training and is therefore well suited to GVGP, which assumes no prior knowledge. The improved and modified MCTS noticeably increases the survival rate and makes the search more effective. The study was conducted with two different sets. Findings: The results showed that, both on the training set of 10 games for which no prior knowledge was given and on the other set that served as a validation set, the improved algorithm performed better than the existing MCTS algorithm. Based on these results, further study was suggested.
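
For reference, a generic UCT-style MCTS sketch is given below; it is not the paper's improved variant, and it assumes an abstract game-state interface (legal_actions, step, is_terminal, reward) that stands in for the GVGP framework.

```python
# Generic UCT-style Monte Carlo tree search sketch (not the paper's enhancement).
# It assumes an abstract game state offering legal_actions(), step(action) -> new
# state, is_terminal(), and reward() (a score or heuristic value); all names are
# placeholders.
import math
import random

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []
        self.untried = list(state.legal_actions())
        self.visits, self.value = 0, 0.0

    def ucb1(self, c=1.4):
        # UCB1: average value plus an exploration bonus.
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root_state, iterations=1000, rollout_depth=20):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one child for a random untried action.
        if node.untried and not node.state.is_terminal():
            action = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.state.step(action), parent=node, action=action)
            node.children.append(child)
            node = child
        # 3. Simulation: random rollout up to a fixed depth.
        state = node.state
        for _ in range(rollout_depth):
            if state.is_terminal():
                break
            state = state.step(random.choice(state.legal_actions()))
        reward = state.reward()
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Recommend the most visited action at the root.
    return max(root.children, key=lambda n: n.visits).action
```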