• Title/Summary/Keyword: cross-validation (교차검증법)

A Node2Vec-Based Gene Expression Image Representation Method for Effectively Predicting Cancer Prognosis (암 예후를 효과적으로 예측하기 위한 Node2Vec 기반의 유전자 발현량 이미지 표현기법)

  • Choi, Jonghwan;Park, Sanghyun
    • KIPS Transactions on Software and Data Engineering / v.8 no.10 / pp.397-402 / 2019
  • Accurately predicting cancer prognosis so that patients receive appropriate treatment strategies is one of the critical challenges in bioinformatics. Many studies have proposed machine learning models that predict patient outcomes from gene expression data. Gene expression data is high-dimensional numerical data covering about 17,000 genes, so earlier studies used feature selection or dimensionality reduction to improve the performance of prognostic prediction models. These approaches, however, make it difficult for the predictive models to capture biological interactions between the selected genes, because the feature selection and model training stages are performed independently. In this paper, we propose a novel two-dimensional image formatting approach for gene expression data that achieves feature selection and prognostic prediction effectively. Node2Vec is used to integrate a biological interaction network with gene expression data, and a convolutional neural network learns the integrated two-dimensional gene expression images and predicts cancer prognosis. We evaluated the proposed model through double cross-validation and confirmed that its prognostic prediction accuracy is superior to that of traditional machine learning models based on raw gene expression data. Because the proposed approach can improve prediction models without the information loss caused by feature selection steps, we expect it to contribute to the development of personalized medicine.
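
The double cross-validation mentioned in this abstract is commonly read as nested cross-validation: an inner loop selects the model, and an outer loop estimates its accuracy on data the inner loop never saw. The sketch below only builds the index splits. The fold counts, seed, and function names are illustrative assumptions, since the abstract does not state how the authors configured their evaluation.

```python
import random

def k_fold_indices(indices, k, seed=0):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def double_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs built only from
    outer_train, so the outer test fold never influences model selection.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds)
                       if j != i for idx in fold]
        # Hypothetical choice: reseed per outer fold so inner splits differ.
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds)
                           if n != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield inner_splits, outer_train, outer_test

# Example: 20 samples, 5 outer folds, 3 inner folds per outer training set.
splits = list(double_cv_splits(20, outer_k=5, inner_k=3))
```

In use, hyperparameters would be tuned on the inner pairs, the chosen model refit on `outer_train`, and the reported score taken from `outer_test` only.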

Bioequivalence of pioglitazone tablet to Actos® tablet (Pioglitazone 30 mg) (액토스정®(피오글리타존 30 mg)에 대한 염산피오글리타존정의 생물학적동등성)

  • Yeom, Hyesun;Lee, Tae Ho;Youm, Jeong-Rok;Song, Jin-Ho;Han, Sang Beom
    • Analytical Science and Technology / v.22 no.1 / pp.101-108 / 2009
  • The bioequivalence of two pioglitazone tablets, Actos® tablet (Takeda Chemical Industries, reference drug) and Pioglitazone tablet (Boryung Company, test drug), was evaluated according to the guidelines of the Korea Food and Drug Administration. Twenty-eight healthy male Korean volunteers received each medicine (pioglitazone dose of 30 mg) in a $2 \times 2$ crossover study with a one-week washout interval. After drug administration, blood samples were collected at specific time intervals over 0-36 hours. The plasma concentrations of pioglitazone were determined by high performance liquid chromatography-tandem mass spectrometry (LC-MS/MS). The total chromatographic run time was 5 min, and calibration curves were linear over the concentration range of 5-2000 ng/mL for pioglitazone. The method was validated for selectivity, sensitivity, linearity, accuracy and precision. The pharmacokinetic parameters were determined from the plasma concentration-time profiles of both formulations. The primary calculated pharmacokinetic parameters were compared statistically to evaluate bioequivalence between the two preparations. The 90% confidence intervals of the $AUC_t$ ratio and the $C_{max}$ ratio for Pioglitazone tablet and Actos® tablet were log 0.9422 to log 1.1040 and log 0.9200 to log 1.1556, respectively. Based on these statistical considerations, we conclude that the test drug, Pioglitazone tablet, was bioequivalent to the reference drug, Actos® tablet.
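
The conclusion of this abstract rests on the reported 90% confidence intervals falling inside an acceptance range. The sketch below shows that check, assuming the conventional limits of 0.80 to 1.25 for the test/reference ratio. The abstract does not quote the limits it used, so `LOWER`, `UPPER`, and the function name are assumptions made for illustration.

```python
import math

# Assumed acceptance limits: the conventional 0.80-1.25 range for the
# test/reference ratio (not stated in the abstract).
LOWER, UPPER = 0.80, 1.25

def within_limits(ci_low, ci_high, lower=LOWER, upper=UPPER):
    """Return True if the 90% CI of the ratio lies inside the limits.

    The comparison is made on the log scale, matching the log-transformed
    intervals reported in the abstract.
    """
    return (math.log(lower) <= math.log(ci_low)
            and math.log(ci_high) <= math.log(upper))

auc_ok = within_limits(0.9422, 1.1040)   # AUC_t ratio CI from the abstract
cmax_ok = within_limits(0.9200, 1.1556)  # C_max ratio CI from the abstract
```

Under these assumed limits both intervals pass, which is consistent with the bioequivalence conclusion stated in the abstract.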

Development of Prediction Model for Capsaicinoids Content in Red-Pepper Powder Using Near-Infrared Spectroscopy - Particle Size Effect (근적외선 스펙트럼을 이용한 고춧가루의 캡사이신 함량 예측 모델 개발 - 입자의 영향)

  • Mo, Changyeun;Kang, Sukwon;Lee, Kangjin;Lim, Jong-Guk;Cho, Byoung-Kwan;Lee, Hyun-Dong
    • Food Engineering Progress / v.15 no.1 / pp.48-55 / 2011
  • In this research, near-infrared absorption from 1,100-2,300 nm was used to measure the content of capsaicinoids in red-pepper powder, using an acousto-optic tunable filter (AOTF) spectrometer with a sample plate and a sample rotating unit. Non-spicy red-pepper samples from one location (Younggwang-gun, Korea) were mixed with a spicy variety (var. Chungyang) to make samples separated by particle size (below 0.425 mm, 0.425-0.71 mm, and 0.71-1.4 mm). A Partial Least Squares Regression (PLSR) model to predict capsaicinoid content for each particle size was developed from the spectra measured by the AOTF spectrometer and the capsaicinoid contents analyzed by HPLC. With cross validation, the PLSR models for red-pepper powder of below 0.425 mm, 0.425-0.71 mm, and 0.71-1.4 mm had $R_V^2$ = 0.948-0.979 and Standard Error of Prediction (SEP) = 6.56-7.94 mg%. The prediction error was low for the smaller particle sizes. The best PLSR model for red-pepper powder of below 1.4 mm was obtained with the Range Normalization, Standard Normal Variate, and 1st Derivative pretreatments, giving $R_V^2$ = 0.959 and SEP = 8.82 mg% with cross validation.

Analysis of Characteristics and Optimization of Photo-degradation condition of Reactive Orange 16 Using a Box-Behnken Method (실험계획법 중 Box-Behnken(박스-벤켄)법을 이용한 반응성 염료의 광촉매 산화조건 특성 해석 및 최적화)

  • Cho, Il-Hyoung;Lee, Nae-Hyun;Chang, Soon-Woong;An, Sang-Woo;Yonn, Young-Han;Zoh, Kyung-Duk
    • Journal of Korean Society of Environmental Engineers / v.28 no.9 / pp.917-925 / 2006
  • The aim of our research was to apply experimental design methodology to the optimization of the photocatalytic degradation of an azo dye (Reactive Orange 16). The reactions were described mathematically as a function of the amount of $TiO_2$ ($x_1$) and the dye concentration ($x_2$), modeled using the Box-Behnken method. The results show that the color removal (%) response ($Y_1$) in the photocatalysis of the dye was significantly affected by the synergistic effect of the linear terms of $TiO_2$ ($x_1$) and dye concentration ($x_2$). The significant factors with synergistic effects on $COD_{Cr}$ removal (%) ($Y_2$) were likewise the linear terms of $TiO_2$ ($x_1$) and dye concentration ($x_2$). However, the quadratic terms of $TiO_2$ ($x_1^2$) and dye concentration ($x_2^2$) had an antagonistic effect on the $Y_1$ and $Y_2$ responses. Canonical analysis indicates that the stationary point was a saddle point for both $Y_1$ and $Y_2$. The estimated ridge maxima of the responses from canonical analysis were 93% for $Y_1$ at the optimal condition $(X_1,\;X_2)$ = (1.11 g/L, 51.2 mg/L) and 73% for $Y_2$ at $(X_1,\;X_2)$ = (1.42 g/L, 72.83 mg/L).
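
The canonical analysis mentioned in this abstract classifies the stationary point of a fitted second-order model from the signs of the eigenvalues of its quadratic-coefficient matrix. The sketch below does this for a two-factor model $y = b_0 + b_1 x_1 + b_2 x_2 + b_{11} x_1^2 + b_{22} x_2^2 + b_{12} x_1 x_2$. The coefficient values are hypothetical, because the abstract does not report the fitted coefficients.

```python
import math

def canonical_analysis(b1, b2, b11, b22, b12):
    """Stationary point and its type for a two-factor quadratic model.

    Uses B = [[b11, b12/2], [b12/2, b22]] and b = [b1, b2]; the stationary
    point is x_s = -0.5 * B^{-1} b, and the eigenvalue signs of B give its type.
    """
    h = b12 / 2.0
    det = b11 * b22 - h * h
    if det == 0:
        raise ValueError("B is singular; the stationary point is not unique")
    x1 = -0.5 * (b22 * b1 - h * b2) / det
    x2 = -0.5 * (-h * b1 + b11 * b2) / det
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (b11 + b22) / 2.0
    radius = math.sqrt(((b11 - b22) / 2.0) ** 2 + h * h)
    eigenvalues = (mean - radius, mean + radius)
    if eigenvalues[1] < 0:
        kind = "maximum"
    elif eigenvalues[0] > 0:
        kind = "minimum"
    else:
        kind = "saddle"  # mixed signs, since det != 0 rules out zeros
    return (x1, x2), eigenvalues, kind

# Hypothetical coefficients chosen so that the surface has a saddle point.
point, eigs, kind = canonical_analysis(b1=2.0, b2=-4.0, b11=-1.0, b22=1.0, b12=0.0)
```

When the stationary point is a saddle, as the abstract reports, there is no single interior optimum, which is why a ridge of maximum response is estimated instead.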

Development of Forest Volume Estimation Model Using Airborne LiDAR Data - A Case Study of Mixed Forest in Aedang-ri, Chunyang-myeon, Bonghwa-gun - (항공 LiDAR 자료를 이용한 산림재적추정 모델 개발 - 봉화군 춘양면 애당리 혼효림을 대상으로 -)

  • CHO, Seung-Wan;KIM, Yong-Ku;PARK, Joo-Won
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.3 / pp.181-194 / 2017
  • This study aims to develop a regression model for forest volume estimation using field-collected forest inventory information and airborne LiDAR data. The response variable of the model is forest stem volume, which was measured by random sampling from each individual plot of the 30 circular sample plots collected in Bonghwa-gun, Gyeongsangbuk-do, while the predictor variables are Height Percentiles (HP) and Height Bins (HB), metrics extracted from raw LiDAR data. To find the most appropriate model, candidate models were constructed by simple linear regression, quadratic polynomial regression, and multiple regression analysis, and cross-validation tests were conducted for verification. As a result, among the estimated models, the $R^2$ of the multiple regression model of $HB_{5-10}$, $HB_{15-20}$, $HB_{20-25}$, and $HB_{>25}$ was the highest at 0.509, and the PRESS statistic of the simple linear regression model of $HP_{25}$ was the lowest at 122.352. Models based on $HB_{5-10}$, $HB_{15-20}$, $HB_{20-25}$, and $HB_{>25}$ are thus considered comparatively more appropriate for Korean forests with complicated vertical structures.
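
The PRESS statistic used in this abstract is the sum of squared leave-one-out prediction errors. The sketch below computes it for a simple linear regression in two ways: by refitting with each observation left out, and by the leverage shortcut $e_i / (1 - h_{ii})$ from a single fit. The data points and function names are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def press_loo(xs, ys):
    """PRESS by explicitly refitting with each observation left out."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total

def press_leverage(xs, ys):
    """PRESS from one full fit, using residual / (1 - leverage)."""
    n = len(xs)
    a, b = fit_line(xs, ys)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - mx) ** 2 / sxx
        total += ((y - (a + b * x)) / (1.0 - leverage)) ** 2
    return total

# Hypothetical predictor (e.g. a height metric) and response values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
press_a = press_loo(xs, ys)
press_b = press_leverage(xs, ys)
```

A lower PRESS indicates smaller out-of-sample error, which is a different criterion from the in-sample $R^2$ and explains why the two measures can favor different models.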

Verification of a Calibration Technique for a Full-Polarimetric Scatterometer System at C-band (C-밴드 완전 편파 측정용 스캐터미터 시스템 보정 기술 검증)

  • Park, Sin-Myeong;Go, Joo-Seoc;Joo, Jeong-Myeong;Kim, Hee-Young;Kim, Ju-Hui;Hwang, Ji-Hwan;Kwon, Soon-Gu;Shin, Jong-Chul;Oh, Yisok
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.23 no.10 / pp.1196-1203 / 2012
  • This paper presents a study on the calibration of a C-band HPS (Hongik Polarimetric Scatterometer) system using the DMMCT (Differential Mueller Matrix Calibration Technique). To calibrate the polarimetric scatterometer system, a fully-polarimetric antenna pattern (magnitudes and phase differences) of the antenna main beam is measured using a conducting sphere in an anechoic chamber. The polarimetric scatterometer system could be accurately calibrated after retrieving its distortions using the DMMCT. Unlike in a single-polarimetric system, in a fully-polarimetric system not only the backscattering coefficients but also the phase differences are important parameters. The calibrated HPS system can be used to measure accurate Mueller matrices of bare soil surfaces, rice paddies, and vegetation fields. The phase-difference parameters as well as the backscattering coefficients for co- and cross-polarizations can then be obtained. The accuracy of the calibration was verified by comparing the measured backscattering coefficients with a scattering model. To verify the calibration technique, the measured polarization response of a plowed bare field was also compared with the polarization response synthesized using a polarimetric scattering model.

Evaluation on Reactivity of By-Product Pozzolanic Materials Using Electrical Conductivity Measurement (전기전도도 시험방법을 활용한 산업부산물 포졸란재료의 반응성 평가)

  • Choi, Ik-Je;Kim, Ji-Hyun;Lee, Soo-Yong
    • Journal of the Korea Institute of Building Construction / v.16 no.5 / pp.421-428 / 2016
  • In this work, the pozzolanic activities of various waste materials were compared with those of well-known by-product pozzolanic materials. Undensified and densified silica fume, ASTM class F and class C fly ash, and metakaolin were chosen as well-known pozzolanic materials, and bentonite powder, ceramic powder obtained from wash basins, and waste glass wool, which may possess pozzolanic properties, were chosen for comparison. The drop in electrical conductivity in saturated lime solution at $40^{\circ}C$ was measured for each material. The amount of Ca(OH)2 decomposed from cement paste at 450-500$^{\circ}C$ was also measured to evaluate pozzolanic activity. The 28-day compressive strength was used to observe the mechanical properties enhanced by incorporating the various waste materials. According to the experimental results, the "difference between maximum conductivity value and conductivity value at 4 hour" was found to be a reasonable measure of the pozzolanic activity of a material. Pozzolanic activity measured using electrical conductivity correlates very well with that measured using the amount of Ca(OH)2 remaining in the cement paste. Relatively good agreement was also found between electrical conductivity and 28-day compressive strength. It was found that electrical conductivity measurement can be used to evaluate the pozzolanic activity of unknown materials.

The Optimization of Ensembles for Bankruptcy Prediction (기업부도 예측 앙상블 모형의 최적화)

  • Myoung Jong Kim;Woo Seob Yun
    • Information Systems Review / v.24 no.1 / pp.39-57 / 2022
  • This paper proposes the GMOPTBoost algorithm to improve the performance of the AdaBoost algorithm for bankruptcy prediction, in which the class imbalance problem is inherent. The AdaBoost algorithm has the advantage of providing a robust learning opportunity for misclassified samples. However, it is limited in addressing the class imbalance problem because the concept of arithmetic mean accuracy is embedded in it. GMOPTBoost can optimize the geometric mean accuracy and effectively solve the class imbalance problem by applying Gaussian gradient descent. The samples are constructed in the following two phases. First, five class-imbalanced datasets are constructed to verify the effect of the class imbalance problem on the performance of the prediction model and the performance improvement effect of GMOPTBoost. Second, class-balanced data are constructed through data sampling techniques to verify the performance improvement effect of GMOPTBoost. The main results of 30 rounds of cross-validation analysis are as follows. First, the class imbalance problem degrades the performance of ensembles. Second, GMOPTBoost contributes to performance improvements of AdaBoost ensembles trained on imbalanced datasets. Third, data sampling techniques have a positive impact on performance improvement. Finally, GMOPTBoost contributes to significant performance improvement of AdaBoost ensembles trained on balanced datasets.
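
The contrast between arithmetic mean accuracy and geometric mean accuracy drawn in this abstract can be illustrated on a small imbalanced example. The sketch assumes the usual definition of geometric mean accuracy as the square root of sensitivity times specificity. The label encoding (1 = bankrupt, 0 = healthy) and the data are hypothetical, and this is not the GMOPTBoost algorithm itself.

```python
import math

def accuracy(y_true, y_pred):
    """Arithmetic accuracy: fraction of all samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def geometric_mean_accuracy(y_true, y_pred):
    """sqrt(sensitivity * specificity); assumes both classes are present."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(p == 1 for p in pos) / len(pos)
    specificity = sum(p == 0 for p in neg) / len(neg)
    return math.sqrt(sensitivity * specificity)

# Hypothetical imbalanced sample: 95 healthy firms, 5 bankrupt firms.
y_true = [0] * 95 + [1] * 5

# A classifier that always predicts "healthy".
always_healthy = [0] * 100
acc_trivial = accuracy(y_true, always_healthy)
gmean_trivial = geometric_mean_accuracy(y_true, always_healthy)

# A classifier that flags 10 healthy firms by mistake but finds 4 of 5 bankruptcies.
balanced_guess = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1
acc_balanced = accuracy(y_true, balanced_guess)
gmean_balanced = geometric_mean_accuracy(y_true, balanced_guess)
```

The trivial classifier scores higher on arithmetic accuracy yet has a geometric mean accuracy of zero, which is the failure mode that optimizing the geometric mean is meant to avoid.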

Improvement of an Analytical Method for Methoprene in Livestock Products using LC-MS/MS (LC-MS/MS를 이용한 축산물 중 살충제 메토프렌의 잔류분석법 개선)

  • Park, Eun-Ji;Kim, Nam Young;Park, So-Ra;Lee, Jung Mi;Jung, Yong Hyun;Yoon, Hae Jung
    • Journal of Food Hygiene and Safety / v.37 no.3 / pp.136-142 / 2022
  • This research aims to develop a rapid and easy analytical method for methoprene using liquid chromatography-tandem mass spectrometry (LC-MS/MS). A simple, highly sensitive, and specific analytical method for the determination of methoprene in livestock products (beef, pork, chicken, milk, eggs, and fat) was developed. Methoprene was effectively extracted with 1% acetic acid in acetonitrile and acetone (1:1), followed by the addition of anhydrous magnesium sulfate (MgSO4) and anhydrous sodium acetate. Subsequently, the lipids in the livestock sample were extracted by freezing them at -20℃. The extracts were cleaned up using MgSO4, primary secondary amine (PSA), and octadecyl (C18), and then centrifuged to separate the supernatant. The supernatant was evaporated under nitrogen gas and the residue was dissolved in methanol. Matrix-matched calibration curves were constructed using 8 levels (1, 2.5, 5, 10, 25, 50, 100, 150 ng/mL), and the coefficient of determination ($R^2$) was above 0.9964. Average recoveries at three spiking levels (0.01, 0.1, and 0.5 mg/kg) ranged from 79.5 to 105.1%, with relative standard deviations (RSDs) smaller than 14.2%, as required by the Codex guideline (CODEX CAC/GL 40). This study could be useful for residue safety management in livestock products.
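
The recovery and RSD figures quoted in this abstract follow from simple arithmetic on replicate measurements at each spiking level. The sketch below shows that calculation with Python's standard `statistics` module. The replicate values are hypothetical and serve only to illustrate the formulas; they are not data from the study.

```python
import statistics

def recovery_percent(measured, spiked):
    """Recovery (%) = measured concentration / spiked concentration * 100."""
    return measured / spiked * 100.0

def rsd_percent(values):
    """Relative standard deviation (%) = sample stdev / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

spiked_level = 0.1                 # mg/kg, one of the levels named in the abstract
measured = [0.095, 0.100, 0.105]   # hypothetical replicate results in mg/kg

recoveries = [recovery_percent(m, spiked_level) for m in measured]
mean_recovery = statistics.mean(recoveries)
rsd = rsd_percent(recoveries)
```

The same two functions would be applied separately to each matrix and spiking level before comparing the results with the guideline criteria.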

Development and Validation of Real-time PCR to Determine Branchiostegus japonicus and B. albus Species Based on Mitochondrial DNA (Real-time PCR 분석법을 이용한 옥돔과 옥두어의 종 판별법 개발)

  • Chung, In Young;Seo, Yong Bae;Yang, Ji-Young;Kim, Gun-Do
    • Journal of Life Science / v.27 no.11 / pp.1331-1339 / 2017
  • DNA barcoding is the identification of a species based on the DNA sequence of a fragment of the cytochrome C oxidase subunit I (COI) gene in the mitochondrial genome. It is widely applied to assist with the sustainable development of fishery-product resources and the protection of fish biodiversity. This study attempted to distinguish horse-head fish (Branchiostegus japonicus) from fake horse-head fish (Branchiostegus albus), both of which are commonly consumed in Korea. To identify the two species, a real-time PCR method was developed based on their mitochondrial DNA genomes. Inter-species variations in mitochondrial DNA were observed in a bioinformatics analysis of the mitochondrial genomic DNA sequences of the two species. Some highly conserved regions and a few variable regions were identified in the mitochondrial COI of the species. To test whether the sequence variations were definitive, primers targeting the variable regions of COI were designed and used to amplify the DNA with the real-time PCR system. The threshold-cycle (Ct) results confirmed that the Ct ranges of the real-time PCR were consistent with the expected species of origin. Efficiency, specificity and cross-reactivity assays showed statistically significant differences between the average Ct of B. japonicus DNA ($21.85 \pm 3.599$) and the average Ct of B. albus DNA ($33.49 \pm 1.183$) for confirming B. japonicus. The assays also showed statistically significant differences between the average Ct of B. albus DNA ($22.49 \pm 0.908$) and the average Ct of B. japonicus DNA ($33.93 \pm 0.479$) for confirming B. albus. The methodology was validated using ten commercial samples. The genomic DNA-based molecular technique using real-time PCR was a reliable method for the taxonomic classification of animal tissues.