• Title/Summary/Keyword: Data Models

Semi-supervised Model for Fault Prediction using Tree Methods (트리 기법을 사용하는 세미감독형 결함 예측 모델)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.4 / pp.107-113 / 2020
  • A number of studies have been conducted on predicting software faults, but most of them have been supervised models that use labeled data for training. Very few studies have addressed unsupervised models that use only unlabeled data, or semi-supervised models that use abundant unlabeled data together with a small amount of labeled data. In this paper, we produced new semi-supervised models by applying tree algorithms within the self-training technique. In the model performance evaluation experiments, the newly created tree models performed better than the existing models, and CollectiveWoods, in particular, outperformed the other models. It also performed very stably even when very few labeled data were available.
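
The self-training idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: it uses a one-level decision tree (a stump) as the tree learner rather than CollectiveWoods, and the names `fit_stump`, `predict`, `self_train`, and the `threshold` parameter are hypothetical. Leaf purity on the training data stands in for prediction confidence, which is a crude assumption.

```python
from collections import Counter

def fit_stump(X, y):
    """Fit a depth-1 decision tree: the (feature, threshold) split with fewest training errors."""
    majority, count = Counter(y).most_common(1)[0]
    # Fallback: a constant predictor, kept when no threshold splits the rows.
    best = {"errors": len(y) - count, "feature": 0, "threshold": float("inf"),
            "left": (majority, count / len(y)), "right": (majority, count / len(y))}
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab, l_cnt = Counter(left).most_common(1)[0]
            r_lab, r_cnt = Counter(right).most_common(1)[0]
            errors = len(y) - l_cnt - r_cnt
            if errors < best["errors"]:
                # Each leaf stores (predicted label, purity used as confidence).
                best = {"errors": errors, "feature": j, "threshold": t,
                        "left": (l_lab, l_cnt / len(left)),
                        "right": (r_lab, r_cnt / len(right))}
    return best

def predict(stump, row):
    """Return (label, confidence) for one feature row."""
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled rows and refit the tree."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    stump = fit_stump(X_lab, y_lab)
    for _ in range(max_rounds):
        if not pool:
            break
        scored = [(row, *predict(stump, row)) for row in pool]
        confident = [(row, lab) for row, lab, conf in scored if conf >= threshold]
        if not confident:
            break
        for row, lab in confident:
            X_lab.append(row)
            y_lab.append(lab)
        pool = [row for row, _, conf in scored if conf < threshold]
        stump = fit_stump(X_lab, y_lab)
    return stump

# Hypothetical toy data: one software metric per module, label 1 = faulty.
model = self_train([[1.0], [2.0], [8.0], [9.0]], [0, 0, 1, 1],
                   [[1.5], [3.0], [7.0], [8.5]])
```

A real study would swap the stump for a full tree or forest learner and estimate confidence from class probabilities instead of leaf purity.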

Comparative Analysis of 3D Spatial Data Models (3차원 공간정보 데이터 모델 비교 분석)

  • Park, Se-Ho;Lee, Ji-Yeong
    • Spatial Information Research / v.17 no.3 / pp.277-285 / 2009
  • Each system needs a data model suited to its purpose in order to manage, analyze, and manipulate data efficiently. Because the data model determines the usable range of applications, suitable data models are being developed for each application. In GIS, diverse spatial data models are likewise being developed. The accuracy and currency of spatial data are important for efficient applications, and data modeling is equally important because it establishes the spatial data structure. Therefore, the purposes of this research are to 1) compare domestic spatial data models with overseas spatial data models in terms of their geometry model, topology model, and method of visualizing 3D spatial data; 2) compare the features of the data models by analyzing each data structure; and 3) compare and analyze the features of each spatial data model through quantitative analysis.

ADVANTAGES OF USING ARTIFICIAL NEURAL NETWORKS CALIBRATION TECHNIQUES TO NEAR-INFRARED AGRICULTURAL DATA

  • Buchmann, Nils-Bo;Cowe, Ian A.
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1032-1032 / 2001
  • Artificial Neural Network (ANN) calibration techniques have been used commercially for agricultural applications since the mid-nineties. Global models, based on transmission data from 850 to 1050 nm, are used routinely to measure protein and moisture in wheat and barley and also moisture in triticale, rye, and oats. These models are currently used commercially in approx. 15 countries throughout the world. Results concerning earlier European ANN models are being published elsewhere. Some of the findings from that study will be discussed here. ANN models have also been developed for coarsely ground samples of compound feed and feed ingredients, again measured in transmission mode from 850 to 1050 nm. The performance of models for pig and poultry feed will be discussed briefly. These models were developed from a very large data set (more than 20,000 records), and cover a very broad range of finished products. The prediction curves are linear over the entire range for protein, fat, moisture, fibre, and starch (measured only on poultry feed), and accuracy is in line with the performance of smaller models based on Partial Least Squares (PLS). A simple bias adjustment is sufficient for calibration transfer across instruments. Recently, we have investigated the possible use of ANN for a different type of NIR spectrometer, based on reflectance data from 1100 to 2500 nm. In one study, based on data for protein, fat, and moisture measured on unground compound feed samples, dedicated ANN models for specific product classes (cattle feed, pig feed, broiler feed, and layers feed) gave moderately better Standard Errors of Prediction (SEP) compared to modified PLS (MPLS). However, if the four product classes were combined into one general calibration model, the performance of the ANN model deteriorated only slightly compared to the class-specific models, while the SEP values for the MPLS predictions doubled. Brix value in molasses is a measure of sugar content. Even with a huge dataset, PLS models were not sufficiently accurate for commercial use. In contrast, an ANN model based on the same data improved the accuracy considerably and straightened out non-linearity in the prediction plot. The work of Mr. David Funk (GIPSA, U. S. Department of Agriculture), who has studied the influence of various types of spectral distortions on ANN and PLS models, thereby providing comparative information on the robustness of these models towards instrument differences, will be discussed. This study was based on data from different classes of North American wheat measured in transmission from 850 to 1050 nm. The distortions studied included the effects of absorbance offset, pathlength variation, presence of stray light, bandwidth, and wavelength stretch and offset (either individually or combined). It was shown that a global ANN model was much less sensitive to most perturbations than class-specific GIPSA PLS calibrations. It is concluded that ANN models based on large data sets offer substantial advantages over PLS models with respect to accuracy, range of materials that can be handled by a single calibration, stability, transferability, and sensitivity to perturbations.
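
The bias adjustment and SEP mentioned in this abstract can be illustrated with a short sketch. It assumes one common chemometric convention, in which bias is the mean residual and SEP is the bias-corrected standard deviation of the residuals; the abstract does not state which definition the authors used. The function names and sample values are hypothetical.

```python
import math

def bias(predicted, reference):
    """Mean residual (predicted minus reference) on a transfer sample set."""
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)

def sep(predicted, reference):
    """Standard Error of Prediction: bias-corrected standard deviation of residuals."""
    b = bias(predicted, reference)
    n = len(predicted)
    squared = sum((p - r - b) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / (n - 1))

def bias_adjust(predicted, b):
    """Transfer a calibration to another instrument by subtracting a constant bias."""
    return [p - b for p in predicted]

# Hypothetical protein values (%): lab reference vs. predictions on a second instrument.
reference = [10.0, 12.0, 14.0]
second_instrument = [10.5, 12.5, 14.5]
offset = bias(second_instrument, reference)
adjusted = bias_adjust(second_instrument, offset)
```

A constant offset like the one above leaves SEP unchanged, which is why a bias adjustment alone can suffice when the instruments differ only by a systematic shift.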

Prediction of concrete spall damage under blast: Neural approach with synthetic data

  • Dauji, Saha
    • Computers and Concrete / v.26 no.6 / pp.533-546 / 2020
  • The prediction of the spall response of reinforced concrete members such as columns and slabs has been attempted by earlier researchers with analytical solutions, as well as with empirical models developed from data generated in physical or numerical experiments, with different degrees of success. In this article, more versatile and accurate models than the empirical ones are developed based on the model-free approach of the artificial neural network (ANN). Synthetic data extracted from the results of numerical experiments in the literature have been used to train and test the ANN models. For two concrete members, namely slabs and columns, different sets of ANN models were developed, each of which proved to have definite advantages over the corresponding empirical model reported in the literature. In the case of slabs, for all three categories of spall, the ANN model results were superior to the empirical models as evaluated by various performance metrics, such as correlation, root mean square error, mean absolute error, maximum overestimation, and maximum underestimation. The ANN models for each category of column spall could handle three variables together (namely depth and the spacing of longitudinal and transverse reinforcement), in contrast to the empirical models that handled one variable at a time, and at the same time yielded comparable performance. The application of the ANN models developed in this study for spall prediction of concrete slabs and columns is discussed along with their limitations.
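
The performance metrics listed in this abstract can be computed as in the sketch below. The sign convention is an assumption: error is taken as predicted minus observed, so overestimation is the largest positive error and underestimation the largest negative one. The function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(predicted, observed):
    """Correlation, RMSE, MAE, and maximum over/underestimation of a model."""
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return {
        "correlation": cov / math.sqrt(var_p * var_o),  # Pearson's r
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "max_overestimation": max(max(errors), 0.0),
        "max_underestimation": max(-min(errors), 0.0),
    }

# Hypothetical observed vs. predicted values for four test cases.
metrics = evaluate([1.5, 2.0, 2.5, 4.0], [1.0, 2.0, 3.0, 4.0])
```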

A review and comparison of convolution neural network models under a unified framework

  • Park, Jimin;Jung, Yoonsuh
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.161-176 / 2022
  • There has been active research in image classification using deep learning convolutional neural network (CNN) models. The ImageNet large-scale visual recognition challenge (ILSVRC) (2010-2017) was one of the most important competitions that boosted the development of efficient deep learning algorithms. This paper introduces and compares six monumental models that achieved high prediction accuracy in ILSVRC. First, we review the models to illustrate their unique structures and characteristics. We then compare those models under a unified framework. For this reason, additional devices that are not crucial to the structure are excluded. Four popular data sets with different characteristics are then considered to measure the prediction accuracy. By investigating the characteristics of the data sets and the models being compared, we provide some insight into the architectural features of the models.

Rapid Estimation of the Aerodynamic Coefficients of a Missile via Co-Kriging (코크리깅을 활용한 신속한 유도무기 공력계수 추정)

  • Kang, Shinseong;Lee, Kyunghoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.48 no.1 / pp.13-21 / 2020
  • Surrogate models have been used for the rapid estimation of six-DOF aerodynamic coefficients in the context of the design and control of a missile. To this end, we may generate highly accurate surrogate models from a multitude of aerodynamic data obtained from wind tunnel tests (WTTs); however, this approach is time-consuming and expensive. Thus, we aim to swiftly predict aerodynamic coefficients via co-Kriging using a small amount of WTT data along with plenty of computational fluid dynamics (CFD) data. To demonstrate the advantage of co-Kriging models based on both WTT and CFD data, we first generated two surrogate models: co-Kriging models with CFD data and Kriging models without the CFD data. Afterwards, we carried out numerical validation and examined predictive trends to compare the two surrogate models. As a result, we found that the co-Kriging models produced more accurate aerodynamic coefficients than the Kriging models thanks to the assistance of CFD data.

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in insurance customer data, modify them where necessary, and carry out a variable selection process before analysis. Then, we develop churn prediction models using logistic regression, neural network, and SVM analysis. We also compare these three data mining models in terms of misclassification rate.
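
Comparing models by misclassification rate, as this abstract describes, can be sketched as follows. The predictions are invented for illustration and do not come from the study; `misclassification_rate` and `rank_models` are hypothetical helper names.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted class differs from the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def rank_models(y_true, predictions):
    """Return (model name, rate) pairs sorted from lowest to highest rate."""
    rates = {name: misclassification_rate(y_true, pred)
             for name, pred in predictions.items()}
    return sorted(rates.items(), key=lambda item: item[1])

# Hypothetical hold-out labels (1 = churned) and predictions from three models.
y_true = [1, 0, 0, 1, 0]
ranking = rank_models(y_true, {
    "logistic_regression": [1, 0, 1, 1, 0],
    "neural_network": [1, 0, 0, 1, 0],
    "svm": [0, 1, 0, 1, 0],
})
```

Churn data are often imbalanced, so in practice the rate is usually read alongside class-specific measures rather than on its own.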

DEVELOPMENT OF ARTIFICIAL NEURAL NETWORK MODELS SUPPORTING RESERVOIR OPERATION FOR THE CONTROL OF DOWNSTREAM WATER QUALITY

  • Chung, Se-Woong;Kim, Ju-Hwan
    • Water Engineering Research / v.3 no.2 / pp.143-153 / 2002
  • As natural river flows decrease dramatically during the drought season in Korea, the deterioration of river water quality accelerates. Thus, consideration of how downstream water quality responds to changes in reservoir release is essential for integrated watershed management with regard to water quantity and quality. In this study, water quality models based on the artificial neural network (ANN) method were developed using historical downstream water quality (NH$_3$-N) data obtained from a water treatment plant on the Geum River and reservoir release data from Daechung Dam. A nonlinear multiple regression model was developed and compared with the ANN models. In the models, the NH$_3$-N concentration at the next time step depends on dam outflow and on river water quality data such as pH, alkalinity, temperature, and NH$_3$-N at the previous time step. The model parameters were estimated using monthly data from Jan. 1993 to Dec. 1998, and another set of monthly data from Jan. 1999 to Dec. 2000 was used for verification. The predictive performance of the models was evaluated by comparing the statistical characteristics of the predicted data with those of the observed data. According to the results, the ANN models performed better than the regression model in the applied cases.

Validation Comparison of Credit Rating Models Using Box-Cox Transformation

  • Hong, Chong-Sun;Choi, Jeong-Min
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.789-800 / 2008
  • Current credit evaluation models based on financial data make use of smoothed estimated default ratios that are transformed from each financial variable. In this work, some problems of the credit evaluation models developed by financial experts are discussed, and we propose improved credit evaluation models based on the stepwise variable selection method and the Box-Cox transformation of data whose distribution is strongly skewed to the right. After comparing goodness-of-fit tests of these models, the validation of the credit evaluation models using statistical methods such as the stepwise variable selection method and the Box-Cox transformation function is explained.
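
The Box-Cox transformation referred to in this abstract has a standard one-parameter form, sketched below with a skewness check to show its effect on right-skewed data. The sample values and the choice of lambda = 0 are illustrative assumptions, not figures from the paper.

```python
import math

def box_cox(x, lam):
    """One-parameter Box-Cox transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed financial ratio, before and after a log (lambda = 0) transform.
raw = [1.0, 2.0, 4.0, 8.0, 64.0]
transformed = [box_cox(v, 0) for v in raw]
```

In practice lambda is usually estimated from the data, for example by maximum likelihood, rather than fixed in advance.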

Developing the Pedestrian Accident Models Using Tobit Model (토빗모형을 이용한 가로구간 보행자 사고모형 개발)

  • Lee, Seung Ju;Kim, Yun Hwan;Park, Byung Ho
    • International Journal of Highway Engineering / v.16 no.3 / pp.101-107 / 2014
  • PURPOSES : This study deals with pedestrian accidents in Cheongju. The goal is to develop a pedestrian accident model. METHODS : To analyze the accidents, count data models, truncated count data models, and Tobit regression models are used. The dependent variable is the number of accidents. The independent variables are traffic volume, intersection geometric structure, and transportation facilities. RESULTS : The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and the models were found to be statistically significant. Second, the main accident-related variables adopted in the model were traffic volume, pedestrian volume, number of entries/exits, number of crosswalks, and bus stops. CONCLUSIONS : The Tobit model is evaluated to be the optimal model for pedestrian accidents.
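
For readers unfamiliar with Tobit regression, the sketch below evaluates the standard log-likelihood of a Tobit model left-censored at zero. It is a generic textbook form under assumed normal errors, not the authors' specification; `tobit_loglik`, its parameters, and the data are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(beta, sigma, X, y, lower=0.0):
    """Log-likelihood of a Tobit model whose outcome is left-censored at `lower`."""
    total = 0.0
    for row, yi in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, row))  # linear predictor
        if yi <= lower:
            # Censored observation: probability that the latent value is below the limit.
            total += math.log(norm_cdf((lower - mu) / sigma))
        else:
            # Uncensored observation: normal density of the observed value.
            total += math.log(norm_pdf((yi - mu) / sigma) / sigma)
    return total

# Hypothetical road segments: intercept plus scaled traffic volume; y = accident count.
X = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]]
y = [0.0, 1.0, 3.0]
ll = tobit_loglik([-0.5, 3.5], 1.0, X, y)
```

Fitting the model means choosing `beta` and `sigma` to maximize this value, typically with a numerical optimizer.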