• Title/Summary/Keyword: Predictive Validation

Mathematical Transformation Influencing Accuracy of Near Infrared Spectroscopy (NIRS) Calibrations for the Prediction of Chemical Composition and Fermentation Parameters in Corn Silage (수 처리 방법이 근적외선분광법을 이용한 옥수수 사일리지의 화학적 조성분 및 발효품질의 예측 정확성에 미치는 영향)

  • Park, Hyung-Soo;Kim, Ji-Hye;Choi, Ki-Choon;Kim, Hyeon-Seop
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • v.36 no.1
    • /
    • pp.50-57
    • /
    • 2016
  • This study determined the effect of mathematical transformation on near infrared spectroscopy (NIRS) calibrations for predicting chemical composition and fermentation parameters in corn silage. Corn silage samples (n=407) were collected from cattle farms and feed companies in Korea between 2014 and 2015. The samples were scanned in intact fresh condition at 1 nm intervals over the wavelength range of 680 to 2,500 nm, and the optical data were recorded as log 1/Reflectance (log 1/R). The spectral data were regressed against a range of chemical parameters using partial least squares (PLS) multivariate analysis in conjunction with several spectral math treatments to reduce the effect of extraneous noise. The optimum calibrations were selected based on the highest coefficient of determination in cross validation ($R^2_{cv}$) and the lowest standard error of cross validation (SECV). The results showed that the NIRS method could predict chemical constituents accurately ($R^2_{cv}$ ranging from 0.77 to 0.91). The best mathematical treatment for moisture and crude protein (CP) was a first-order derivative (1, 16, 16 and 1, 4, 4), whereas the best mathematical treatment for neutral detergent fiber (NDF) and acid detergent fiber (ADF) was 2, 16, 16. The calibration models for fermentation parameters had lower predictive accuracy than those for chemical constituents. However, pH and lactic acid were predicted with considerable accuracy ($R^2_{cv}$ of 0.74 to 0.77); the best mathematical treatments for them were 1, 8, 8 and 2, 16, 16, respectively. These results demonstrate that NIRS can be used as a routine analysis method to predict the chemical composition and fermentation quality of fresh corn silage, for evaluating feeding value and advising farmers.
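The selection rule described in this abstract (highest $R^2_{cv}$, lowest SECV) can be sketched as follows. This is a minimal illustration, not the authors' software: SECV is assumed here to be the root mean squared error of the cross-validated residuals (some NIRS packages apply a bias correction), and the reference values and predictions are made-up numbers.

```python
import math

def secv(observed, predicted):
    """Standard error of cross validation, assumed here to be the RMSE of CV residuals."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r2_cv(observed, predicted):
    """Coefficient of determination in cross validation: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def best_treatment(observed, cv_predictions):
    """Return the math-treatment label whose CV predictions give the lowest SECV.

    With a shared set of reference values and SECV defined as above,
    the lowest SECV is also the highest R2cv, so one criterion suffices.
    """
    return min(cv_predictions, key=lambda label: secv(observed, cv_predictions[label]))

# Hypothetical reference values and cross-validated predictions
# for two math treatments (illustrative numbers only).
observed = [30.0, 32.0, 35.0, 31.0, 34.0]
cv_predictions = {
    "1,16,16": [30.5, 31.5, 34.5, 31.5, 34.0],
    "2,16,16": [29.0, 33.5, 36.0, 30.0, 35.5],
}

for label, predicted in cv_predictions.items():
    print(label, round(r2_cv(observed, predicted), 3), round(secv(observed, predicted), 3))
print("selected:", best_treatment(observed, cv_predictions))
```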

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.1-32
    • /
    • 2018
  • Corporate defaults have a ripple effect on the local and national economy, in addition to affecting stakeholders such as the managers, employees, creditors, and investors of bankrupt companies. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a default prediction model rather than developing various corporate default models. As a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even after that, the analysis of past corporate defaults focused on specific variables, and when the government carried out restructuring immediately after the global financial crisis, it focused only on certain main variables such as 'debt ratio'. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse like the 'Lehman Brothers Case' of the global financial crisis. The key variables used in predicting corporate defaults vary over time. Deakin's (1972) study, set against the analyses of Beaver (1967, 1968) and Altman (1968), confirms that the major factors affecting corporate failure have changed. Grice's (2001) study also found that the importance of predictive variables changes, using Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most of them do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series analysis algorithm that reflects dynamic change. Centering on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively. 
In order to construct a bankruptcy model that remains consistent over time, we first train a time series deep learning model using the data before the financial crisis (2000~2006). The parameters of the existing models and of the deep learning time series algorithm are then tuned with validation data that include the financial crisis period (2007~2008). This yields a model that shows a pattern similar to the results on the training data and excellent predictive power. After that, each bankruptcy prediction model is rebuilt on the combined training and validation data (2000~2008), applying the optimal parameters found in the validation step. Finally, each corporate default prediction model, trained on these nine years, is evaluated and compared using the test data (2009). This demonstrates the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis, logit model), the study shows that the deep learning time series model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy used is the same as that of Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable group. The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time series machine learning algorithms, and deep learning time series algorithms are compared. Corporate data are limited by 'nonlinear variables', 'multi-collinearity' of variables, and 'lack of data'. 
The logit model is nonlinear, the Lasso regression model solves the multi-collinearity problem, and the deep learning time series algorithm, using the variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis and, eventually, toward intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling and also has greater predictive power. With the Fourth Industrial Revolution, the current government and overseas governments are working hard to integrate such systems into the everyday life of their nations and societies. Yet deep learning time series research for the financial industry is still insufficient. This is an initial study on deep learning time series analysis of corporate defaults, and it is hoped that it will serve as comparative reference material for non-specialists beginning research that combines financial data with deep learning time series algorithms.
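The chronological split described in this abstract (train on 2000~2006, tune on 2007~2008, refit on 2000~2008, test on 2009) can be sketched as below. The record layout and the `split_by_year` helper are assumptions for illustration; the abstract does not describe the actual data structures.

```python
def split_by_year(records, train_end=2006, valid_end=2008):
    """Split firm-year records chronologically so no future year leaks into training."""
    train = [r for r in records if r["year"] <= train_end]
    valid = [r for r in records if train_end < r["year"] <= valid_end]
    test = [r for r in records if r["year"] > valid_end]
    return train, valid, test

# Hypothetical firm-year records: one placeholder firm observed from 2000 to 2009.
records = [{"firm": "A", "year": year, "default": 0} for year in range(2000, 2010)]

train, valid, test = split_by_year(records)

# After tuning on the validation years, the models are refit on
# training + validation data (2000~2008) and scored once on 2009.
refit = train + valid

print(len(train), len(valid), len(test), len(refit))
```

Splitting by calendar year rather than at random keeps the crisis years out of the initial training set, which is the point of the design described above.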

Recommendation of Nitrogen Topdressing Rates at Panicle Initiation Stage of Rice Using Canopy Reflectance

  • Nguyen, Hung T.;Lee, Kyu-Jong;Lee, Byun-Woo
    • Journal of Crop Science and Biotechnology
    • /
    • v.11 no.2
    • /
    • pp.141-150
    • /
    • 2008
  • The response of grain yield (GY) and milled-rice protein content (PC) to crop growth status and nitrogen (N) rate at panicle initiation stage (PIS) is critical information for prescribing the topdress N rate at PIS (Npi) for target GY and PC. Three split-split-plot experiments including various N treatments and rice cultivars were conducted at the Experimental Farm, Seoul National University, Korea in 2003-2005. Shoot N density (SND, g N in shoot $m^{-2}$) and canopy reflectance were measured before N application at PIS, and GY, PC, and SND were measured at harvest. Data from the first two years (2003-2004) were used to calibrate, by multiple stepwise regression, predictive models for GY, PC, and SND accumulated from PIS to harvest, using SND at PIS and Npi as predictors. The calibrated models were then used to calculate the N requirement at PIS for each of nine plots in the 2005 experiment, based on the target PC of 6.8% and the SND at PIS estimated by the canopy reflectance method. SND at PIS in combination with Npi successfully predicted GY, PC, and SND from PIS to harvest in the calibration dataset, with coefficients of determination ($R^2$) of 0.87, 0.73, and 0.82 and relative errors in prediction (REP) of 5.5, 4.3, and 21.1%, respectively. The calibrated model equations performed slightly worse in calculating GY, PC, and SND in the validation dataset (data from 2005), but the REP, ranging from 3.3% for PC to 13.9% for SND accumulated from PIS to harvest, was acceptable. The N rate prescription treatment (PRT) for the target PC of 6.8% reduced the coefficient of variation in PC from 4.6% in the fixed rate treatment (FRT, 3.6 g N $m^{-2}$) to 2.4% in PRT, and the average PC of PRT was 6.78%, very close to the target PC of 6.8%. 
In addition, PRT increased GY by 42.1 g $m^{-2}$ while Npi increased by 0.63 g $m^{-2}$ compared to the FRT, resulting in a high agronomic N-use efficiency of 68.8 kg grain per additional kg N. The high agronomic N-use efficiency might have resulted from the higher response of grain yield to the applied N in the prescribed N rate treatment, because the N rate was prescribed based on the crop growth and N status of each plot.
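The prescription step, solving a calibrated PC model for the N rate that reaches a target PC given the plot's SND at PIS, can be sketched as below. The abstract does not report the regression equations, so this assumes a simple linear form PC = a + b·SND + c·Npi, and the coefficients are placeholders, not the study's values.

```python
def predicted_pc(snd_pis, n_pi, coef):
    """Assumed linear model: PC (%) = a + b * SND_PIS + c * Npi."""
    a, b, c = coef
    return a + b * snd_pis + c * n_pi

def prescribe_n(target_pc, snd_pis, coef):
    """Invert the assumed model for Npi; a negative requirement is clipped to zero."""
    a, b, c = coef
    n_required = (target_pc - a - b * snd_pis) / c
    return max(0.0, n_required)

# HYPOTHETICAL coefficients (a, b, c), chosen only to make the example run.
HYPOTHETICAL_COEF = (5.0, 0.2, 0.25)

# Two illustrative plots with different shoot N density at PIS (g N per m^2).
for snd in (4.0, 10.0):
    n_rate = prescribe_n(6.8, snd, HYPOTHETICAL_COEF)
    print(snd, round(n_rate, 2), round(predicted_pc(snd, n_rate, HYPOTHETICAL_COEF), 2))
```

Under these assumed coefficients, a plot that already carries enough N receives no topdressing, which is how a per-plot prescription can narrow the spread in PC relative to a fixed rate.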

3D Displays: Development and Validation of Prediction Function of Object Size Perception as a Function of Depth (3D 디스플레이: 깊이에 따른 대상의 크기지각 예측함수 개발 및 타당화)

  • Shin, Yoon-Ho;Li, Hyung-Chul O.;Kim, Shin-Woo
    • Journal of Broadcast Engineering
    • /
    • v.17 no.2
    • /
    • pp.400-410
    • /
    • 2012
  • In recent years, 3D displays have been used in many media, including 3D movies, TV, mobile phones, and PC games. Although 3D displays provide a more realistic viewing experience than 2D displays, they also carry issues such as visual fatigue and size distortion. Focusing on the latter, we developed a prediction function of object size perception as a function of object depth in a 3D display. In Experiment 1, subjects observed a 3D square of fixed size at varying depths and adjusted a 2D square to make it as large as the 3D square. Conversely, in Experiment 2, subjects observed a 2D square of fixed size and adjusted a 3D square at varying depths to make it as large as the 2D square. In both experiments, the perceived size of the 3D square changed linearly with the depth of the square, and the linear relationship between depth and size was identical in both experiments. The predictive regression function obtained in this research, which predicts object size perception from object depth, will be very useful in the creation of 3D media content.
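A linear prediction function of the kind reported here can be fitted by ordinary least squares. The sketch below is illustrative only: the depth and size values are invented, and neither the units nor the direction of the slope comes from the study.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: object depth relative to the screen and the
# matched (perceived) size reported by a subject, in arbitrary units.
depth = [-10.0, -5.0, 0.0, 5.0, 10.0]
perceived_size = [8.0, 9.0, 10.0, 11.0, 12.0]

slope, intercept = fit_line(depth, perceived_size)

def predict_size(d):
    """Predicted perceived size at depth d under the fitted line."""
    return slope * d + intercept

print(slope, intercept, predict_size(2.5))
```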

Prospective validation of a novel dosing scheme for intravenous busulfan in adult patients undergoing hematopoietic stem cell transplantation

  • Cho, Sang-Heon;Lee, Jung-Hee;Lim, Hyeong-Seok;Lee, Kyoo-Hyung;Kim, Dae-Young;Choe, Sangmin;Bae, Kyun-Seop;Lee, Je-Hwan
    • The Korean Journal of Physiology and Pharmacology
    • /
    • v.20 no.3
    • /
    • pp.245-251
    • /
    • 2016
  • The objective of this study was to externally validate a new dosing scheme for busulfan. Thirty-seven adult patients who received busulfan as conditioning therapy for hematopoietic stem cell transplantation (HCT) participated in this prospective study. Patients were randomized to receive intravenous busulfan either at the conventional dosage (3.2 mg/kg daily) or according to the new dosing scheme based on their actual body weight (ABW) ($23 \times \mathrm{ABW}^{0.5}$ mg daily), targeting an area under the concentration-time curve (AUC) of $5924\ \mu\mathrm{M} \cdot \mathrm{min}$. Pharmacokinetic profiles were collected using a limited sampling strategy in which 2 time points were randomly selected from 3.5, 5, 6, 7, or 22 hours after the start of busulfan administration. Using an established population pharmacokinetic model with NONMEM software, busulfan concentrations at the available blood sampling times were predicted from dosage history and demographic data. The predicted and measured concentrations were compared by a visual predictive check (VPC). Maximum a posteriori Bayesian estimates were obtained to calculate the predicted AUC ($AUC_{PRED}$). The accuracy and precision of the $AUC_{PRED}$ values were assessed by calculating the mean prediction error (MPE) and root mean squared prediction error (RMSE) relative to the target AUC of $5924\ \mu\mathrm{M} \cdot \mathrm{min}$. VPC showed that most data fell within the 95% prediction interval. MPE and RMSE of $AUC_{PRED}$ were -5.8% and 20.6%, respectively, in the conventional dosing group and -2.1% and 14.0%, respectively, in the new dosing scheme group. These findings demonstrated the validity of the new dosing scheme for daily intravenous busulfan used as conditioning therapy for HCT.
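The two dosing rules and the error metrics named in this abstract can be written out as a short sketch. The dose formulas come from the abstract; the percentage forms of MPE and RMSE (errors taken relative to the target AUC) are an assumption, since the abstract does not give the exact definitions, and the AUC values are made up. This is an illustration of the arithmetic only.

```python
import math

TARGET_AUC = 5924.0  # target AUC in uM*min, as stated in the abstract

def conventional_dose_mg(abw_kg):
    """Conventional scheme: 3.2 mg per kg actual body weight, daily."""
    return 3.2 * abw_kg

def new_scheme_dose_mg(abw_kg):
    """New scheme: 23 * ABW^0.5 mg daily."""
    return 23.0 * abw_kg ** 0.5

def mpe_percent(auc_pred, target=TARGET_AUC):
    """Mean prediction error as % of target (assumed definition); measures bias."""
    return 100.0 * sum((a - target) / target for a in auc_pred) / len(auc_pred)

def rmse_percent(auc_pred, target=TARGET_AUC):
    """Root mean squared prediction error as % of target (assumed definition)."""
    mean_sq = sum(((a - target) / target) ** 2 for a in auc_pred) / len(auc_pred)
    return 100.0 * math.sqrt(mean_sq)

# For a 64 kg patient the two schemes give different daily doses.
print(conventional_dose_mg(64.0), new_scheme_dose_mg(64.0))

# Hypothetical predicted AUCs, 10% below and 10% above the target.
example_auc = [0.9 * TARGET_AUC, 1.1 * TARGET_AUC]
print(mpe_percent(example_auc), rmse_percent(example_auc))
```

Because the new rule scales with the square root of weight, it prescribes proportionally less than the per-kg rule as body weight increases.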

Application of Machine Learning to Predict Weight Loss in Overweight, and Obese Patients on Korean Medicine Weight Management Program (한의 체중 조절 프로그램에 참여한 과체중, 비만 환자에서의 머신러닝 기법을 적용한 체중 감량 예측 연구)

  • Kim, Eunjoo;Park, Young-Bae;Choi, Kahye;Lim, Young-Woo;Ok, Ji-Myung;Noh, Eun-Young;Song, Tae Min;Kang, Jihoon;Lee, Hyangsook;Kim, Seo-Young
    • The Journal of Korean Medicine
    • /
    • v.41 no.2
    • /
    • pp.58-79
    • /
    • 2020
  • Objectives: The purpose of this study is to predict weight loss by applying machine learning to real-world clinical data from overweight and obese adults on a weight loss program in 4 Korean Medicine obesity clinics. Methods: From January 2017 to May 2019, we collected data from overweight and obese adults (BMI ≥ 23 kg/m²) who registered for a 3-month Gamitaeeumjowi-tang prescription program. Predictive analysis was conducted at the time of each of three prescriptions, and the expected reduced rate and reduced weight at the next prescription were predicted as binary classifications (classification benchmarks: highest quartile, median, lowest quartile). For the median, further analysis was conducted after applying a variable selection method. The data set sizes were 25,988 for the first analysis, 6,304 for the second, and 833 for the third. 5-fold cross validation was used to prevent overfitting. Results: Prediction accuracy increased from the 1st to the 2nd and 3rd analyses. After variable selection based on the median, the artificial neural network showed the highest accuracy in the 1st (54.69%), 2nd (73.52%), and 3rd (81.88%) prediction analyses based on reduced rate. Prediction performance was additionally confirmed through AUC: Random Forest showed the highest AUC in the 1st (0.640), 2nd (0.816), and 3rd (0.939) prediction analyses based on reduced weight. Conclusions: Predicting weight loss with machine learning became more accurate when initial weight loss information was used. The approach could potentially be used to screen for patients who need intensive intervention because their expected weight loss is low.
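Turning a continuous outcome into a binary target at the median and evaluating with 5-fold cross validation, as described here, can be sketched as follows. The labeling convention (1 means at or above the threshold), the unshuffled fold assignment, and the data are all assumptions for illustration.

```python
import statistics

def binarize(values, threshold):
    """Label 1 if the outcome is at or above the threshold, else 0 (assumed convention)."""
    return [1 if v >= threshold else 0 for v in values]

def k_fold_indices(n_samples, k=5):
    """Return k (train_idx, test_idx) pairs; each sample is tested exactly once."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = sorted(j for m, fold in enumerate(folds) if m != i for j in fold)
        splits.append((train_idx, test_idx))
    return splits

# Hypothetical weight-loss rates (%) for ten patients.
weight_loss_rate = [1.2, 3.4, 2.5, 4.8, 0.9, 3.1, 2.2, 5.0, 1.7, 2.9]

# Median benchmark; the quartile benchmarks could use statistics.quantiles(values, n=4).
labels = binarize(weight_loss_rate, statistics.median(weight_loss_rate))
splits = k_fold_indices(len(labels), k=5)

for train_idx, test_idx in splits:
    # A classifier would be fitted on train_idx and scored on test_idx here.
    print(test_idx, [labels[i] for i in test_idx])
```

In practice the samples would usually be shuffled or stratified by label before being assigned to folds.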

A Study on Factors of Management of Diabetes Mellitus using Data Mining (데이터 마이닝을 이용한 당뇨환자의 관리요인에 관한 연구)

  • Kim, Yoo-Mi;Chang, Dong-Min;Kim, Sung-Soo;Park, Il-Su;Kang, Sung-Hong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.10 no.5
    • /
    • pp.1100-1108
    • /
    • 2009
  • Objectives: The purpose of this study is to identify the factors related to the management of diabetes mellitus (DM) in Korea. Methods: The subjects, selected from the 2005 National Health and Nutrition Survey (NHANS) data, were 415 adults aged 20 and older who had been diagnosed with DM. This study used data mining algorithms and validated their predictive power by comparing the performance of logistic regression, decision tree, and neural network models. On the basis of validation, the decision tree performed best among the three techniques. Results: First, awareness of DM was positively associated with age, residential area, and job. The most important factor in DM awareness is age: the awareness rate among patients aged 52 and over is 76.1%. Within the $\geq 52$ age group, an important factor is family history, and among patients aged 52 or over with a family history of DM, an important factor is job. The awareness rate of patients who are aged 52 or over, have a family history of DM, and are professionals is 95.0%. Second, treatment of DM was also positively associated with awareness, region, and job. The most important factor in DM treatment is DM awareness: the treatment rate of patients who are aware of DM is 84.8%. Among patients who are aware of DM, an important factor is region. The awareness rate of patients who are aware of DM in rural areas is 10.4%. Conclusion: The results of the analysis suggest that DM management programs should consider the group characteristics of DM patients.

Detection of Clavibacter michiganensis subsp. michiganensis Assisted by Micro-Raman Spectroscopy under Laboratory Conditions

  • Perez, Moises Roberto Vallejo;Contreras, Hugo Ricardo Navarro;Herrera, Jesus A. Sosa;Avila, Jose Pablo Lara;Tobias, Hugo Magdaleno Ramirez;Martinez, Fernando Diaz-Barriga;Ramirez, Rogelio Flores;Vazquez, Angel Gabriel Rodriguez
    • The Plant Pathology Journal
    • /
    • v.34 no.5
    • /
    • pp.381-392
    • /
    • 2018
  • Clavibacter michiganensis subsp. michiganensis (Cmm) is a quarantine-worthy pest in México. New technologies need to be implemented and validated to reduce the time required for bacterial detection under laboratory conditions, and Raman spectroscopy is an ambitious technology that has all of the features needed to characterize and identify bacteria. Under controlled conditions, a contagion process was induced with Cmm and the disease epidemiology was monitored. The micro-Raman spectroscopy technique (532 nm laser) was evaluated for its performance in assisting Cmm detection through the characteristic Raman spectral fingerprint of Cmm. The experiment was conducted with tomato plants in a completely randomized block experimental design (13 plants $\times$ 4 rows). Cmm infection was confirmed by 16S rDNA, and plants showed symptoms from 48 to 72 h after inoculation. The incidence and severity in the plant population varied over time and kept an aggregated spatial pattern. The contagion process reached 79% just 24 days after the epidemic was induced. Micro-Raman spectroscopy proved its speed, efficiency, and usefulness as a non-destructive method for the preliminary detection of Cmm. Carotenoid-specific bands at Raman shifts of 1146 and 1510 $cm^{-1}$ were the distinguishing markers. In the chemometric analyses, PCA-LDA supervised classification applied to the Raman spectral data performed best, scoring 100% on all classifier metrics (sensitivity, specificity, accuracy, negative and positive predictive value), which allowed us to differentiate Cmm from other endophytic bacteria (Bacillus and Pantoea). The unsupervised KMeans algorithm also performed well (100, 96, 98, 91, and 100%, respectively).
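The five classifier metrics listed in this abstract follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not the study's data.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true positives among actual positives
        "specificity": tn / (tn + fp),                # true negatives among actual negatives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct calls among all samples
        "negative_predictive_value": tn / (tn + fn),  # correct among predicted negatives
        "positive_predictive_value": tp / (tp + fp),  # correct among predicted positives
    }

# Hypothetical counts for spectra classified as Cmm (positive) or other bacteria (negative).
metrics = classifier_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```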

Comparison of α1-Antitrypsin, α1-Acid Glycoprotein, Fibrinogen and NOx as Indicator of Subclinical Mastitis in Riverine Buffalo (Bubalus bubalis)

  • Guha, Anirban;Guha, Ruby;Gera, Sandeep
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.26 no.6
    • /
    • pp.788-794
    • /
    • 2013
  • Mastitis, classified as clinical or subclinical, is a disease complex of dairy cattle, with the subclinical form being the most important economically. Of late, laboratories have shown interest in developing biochemical markers to diagnose subclinical mastitis (SCM) in herds. Many workers have reported noteworthy alteration of acute phase proteins (APPs) and nitric oxide (measured as nitrate+nitrite = NOx) in milk due to intra-mammary inflammation. However, the literature on validation of these parameters as indicators of SCM, particularly in riverine milch buffalo (Bubalus bubalis) milk, is inadequate. Hence, the present study focused on comparing several APPs, viz. $\alpha_1$-antitrypsin, $\alpha_1$-acid glycoprotein, and fibrinogen, together with NOx, as indicators of SCM in buffalo milk. These components in milk were estimated using standardized analytical protocols. Somatic cell count (SCC) was done microscopically. Microbial culture was done on 5% ovine blood agar. Of the 776 buffaloes (3,096 quarters) sampled, only 347 buffaloes comprising 496 quarters were found positive for SCM, i.e. milk culture showed growth on blood agar with SCC $\geq 2 \times 10^5$ cells/ml of milk. The cultural examination revealed Gram-positive bacteria as the most prevalent etiological agent. $\alpha_1$-antitrypsin and NOx showed a highly significant (p<0.01) increase in SCM milk, whereas the increase of $\alpha_1$-acid glycoprotein in infected milk was significant (p<0.05). Fibrinogen was below the detection level in both healthy and SCM milk. The percent sensitivity, specificity, and accuracy, the predictive values, and the likelihood ratios were calculated taking bacterial culture examination and SCC $\geq 2 \times 10^5$ cells/ml of milk as the benchmark. The udder profile correlation coefficient was also used. 
Allowing for statistical and epidemiological analysis, it was concluded that $\alpha_1$-antitrypsin indicates SCM irrespective of etiology, whereas $\alpha_1$-acid glycoprotein better diagnosed SCM caused by Gram-positive bacteria. NOx did not prove to be a good indicator of SCM. Measuring both $\alpha_1$-antitrypsin and $\alpha_1$-acid glycoprotein in milk is recommended to diagnose SCM in buffalo irrespective of etiology.
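Likelihood ratios, one of the diagnostic measures mentioned in this abstract, can be derived from a marker's results against the benchmark (culture plus SCC threshold). The sketch below uses the standard definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the counts are hypothetical.

```python
def likelihood_ratios(tp, fp, tn, fn):
    """Positive and negative likelihood ratios of a marker against a benchmark test."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # A perfectly specific marker has no false positives, so LR+ is unbounded.
    lr_positive = float("inf") if specificity == 1.0 else sensitivity / (1.0 - specificity)
    # A marker with zero specificity makes LR- undefined; report infinity in that case.
    lr_negative = float("inf") if specificity == 0.0 else (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative

# Hypothetical quarter-level counts: marker result versus benchmark SCM status.
lr_pos, lr_neg = likelihood_ratios(tp=80, fp=10, tn=90, fn=20)
print(round(lr_pos, 2), round(lr_neg, 2))
```

A marker is more informative the further LR+ lies above 1 and the closer LR- lies to 0.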

Genetic characterisation of PPARG, CEBPA and RXRA, and their influence on meat quality traits in cattle

  • Goszczynski, Daniel Estanislao;Mazzucco, Juliana Papaleo;Ripoli, Maria Veronica;Villarreal, Edgardo Leopoldo;Rogberg-Munoz, Andres;Mezzadra, Carlos Alberto;Melucci, Lilia Magdalena;Giovambattista, Guillermo
    • Journal of Animal Science and Technology
    • /
    • v.58 no.4
    • /
    • pp.14.1-14.9
    • /
    • 2016
  • Background: Peroxisome proliferator-activated receptor gamma (PPARG), CCAAT/enhancer binding protein alpha (CEBPA) and retinoid X receptor alpha (RXRA) are nuclear transcription factors that play important roles in regulation of adipogenesis and fat deposition. The objectives of this study were to characterise the variability of these three candidate genes in a mixed sample panel composed of several cattle breeds with different meat quality, validate single nucleotide polymorphisms (SNPs) in a local crossbred population (Angus - Hereford - Limousin) and evaluate their effects on meat quality traits (backfat thickness, intramuscular fat content and fatty acid composition), supporting the association tests with bioinformatic predictive studies. Results: Globally, nine SNPs were detected in the PPARG and CEBPA genes within our mixed panel, including a novel SNP in the latter. Three of these nine, along with seven other SNPs selected from the Single Nucleotide Polymorphism database (SNPdb), including SNPs in the RXRA gene, were validated in the crossbred population (N = 260). After validation, five of these SNPs were evaluated for genotype effects on fatty acid content and composition. Significant effects were observed on backfat thickness and different fatty acid contents (P < 0.05). Some of these SNPs caused slight differences in mRNA structure stability and/or putative binding sites for proteins. Conclusions: PPARG and CEBPA showed low to moderate variability in our sample panel. Variations in these genes, along with RXRA, may explain part of the genetic variation in fat content and composition. Our results may contribute to knowledge about genetic variation in meat quality traits in cattle and should be evaluated in larger independent populations.