• Title/Summary/Keyword: Statistical Model Validation

Fault Diagnosis of Bearing Based on Convolutional Neural Network Using Multi-Domain Features

  • Shao, Xiaorui;Wang, Lijiang;Kim, Chang Soo;Ra, Ilkyeun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1610-1629 / 2021
  • Failures occur frequently in manufacturing machines because of complex and changeable manufacturing environments, increasing downtime and maintenance costs. This manuscript develops a novel deep learning-based method, the Multi-Domain Convolutional Neural Network (MDCNN), to diagnose such faults from vibration signals. The proposed MDCNN consists of time-domain, frequency-domain, and statistical-domain feature channels. The time-domain channel models the hidden patterns of signals in the time domain. The frequency-domain channel uses the Discrete Wavelet Transform (DWT) to obtain rich feature representations of signals in the frequency domain. The statistical-domain channel contains six statistical variables that reflect the signals' macro statistical features. First, the time-domain and frequency-domain channels are each processed by a CNN with various filters. Second, the CNN-extracted features from the time and frequency domains are merged into time-frequency features. Finally, the time-frequency features are fused with the six statistical variables to form the comprehensive features used to identify the fault. The proposed method can thereby make full use of all three feature domains for fault diagnosis while keeping high distinguishability thanks to the CNN. The authors designed extensive experiments with 10-fold cross-validation to validate the method's effectiveness on the CWRU bearing data set, reporting accuracy averaged over ten runs. The results confirm that the proposed MDCNN can detect faults intelligently, accurately, and promptly in complex manufacturing environments, with an accuracy of nearly 100%.
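
As a rough illustration of the three input channels described in this abstract, the sketch below prepares time-domain, frequency-domain, and statistical-domain inputs from one vibration segment. It is not the authors' implementation: the single-level Haar wavelet, the particular six statistics (mean, standard deviation, RMS, peak, skewness, kurtosis), and all function names are assumptions chosen for illustration, and the CNN stages are omitted.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT. Assumes an even-length input;
    a trailing odd sample is ignored."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def statistical_features(signal):
    """Six summary statistics (an assumed set, not taken from the paper)."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    # Skewness and kurtosis are undefined for a constant signal; use 0.0.
    skew = sum((x - mean) ** 3 for x in signal) / (n * std ** 3) if std else 0.0
    kurt = sum((x - mean) ** 4 for x in signal) / (n * std ** 4) if std else 0.0
    return [mean, std, rms, peak, skew, kurt]

def multi_domain_inputs(signal):
    """Bundle the three channels that a multi-domain model would consume."""
    approx, detail = haar_dwt(signal)
    return {
        "time": list(signal),
        "frequency": approx + detail,
        "statistical": statistical_features(signal),
    }
```

In a full model, the "time" and "frequency" entries would each pass through their own convolutional branch before being concatenated with the "statistical" vector.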

Prediction Model for Unpaid Customers Using Big Data (빅 데이터 기반의 체납 수용가 예측 모델)

  • Jeong, Jaean;Lee, Kyouhwan;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.827-833 / 2020
  • To reduce the unpaid rate of local governments, this paper identifies the internal data elements in Water-INFOS that affect arrears through interviews with meter readers in selected local governments. Candidate data affecting arrears were also derived from national statistical data. The influence of each independent variable on the dependent variable was measured by information gain, which examines the disorder of the dependent variable in the data set. The prediction rates of a decision tree and logistic regression were then compared using n-fold cross-validation. The results confirmed that the decision tree finds customer payment patterns more accurately than logistic regression. While developing the analysis model with machine learning, the authors derived the optimal values of two environmental variables, the minimum number of data and the maximum purity, which directly affect the complexity and accuracy of the decision tree, to improve the accuracy of the algorithm.
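
Information gain, as used in this abstract, is commonly computed as the drop in Shannon entropy of the dependent variable after splitting on a candidate feature. The sketch below shows that standard definition; the function names and the toy labels in the usage note are illustrative and do not come from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after
    grouping the samples by the value of one categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

For example, a feature that separates "paid" from "unpaid" customers perfectly in a balanced sample has a gain of 1 bit, while a feature unrelated to the label has a gain of 0.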

Development and Validation of Core Competency Scale For Graduate Students in the Field of Science and Engineering (이공계열 대학원생 핵심역량 진단도구 개발 및 타당화 연구: A연구중심대학 사례)

  • Bae, Sang Hoon;Cho, Eun Won;Han, Song Ie;Jeong, Yoo Ji;Kim, Kyeong Eon
    • Journal of Engineering Education Research / v.27 no.2 / pp.35-50 / 2024
  • The purpose of this study is to identify the core competencies of graduate students at research university A in the context of graduate education in science and engineering, and to develop and validate a diagnostic tool to measure them. First, 6 factors and 18 sub-competencies of core competencies were derived from a review of domestic and foreign studies, cases of excellent research-centered overseas universities, and interviews with members of University A. Second, a theoretical model was constructed by deriving behavioral indicators based on the core competencies and sub-competencies, and a preliminary survey of 188 graduate students of University A was conducted to verify the statistical validity of the theoretical model. Exploratory and confirmatory factor analyses showed that the core competencies of graduate students at research university A consist of 6 factors, 16 sub-competencies, and 77 items. Specifically, they include "Independent research capability (13 items)", "Social Entrepreneurship (10 items)", "Academic agility (15 items)", "Ingenious Challenges (15 items)", "Collegial Collaboration (9 items)", and "Mueunjae leadership (15 items)". This study contributes to the development of theories related to the core competencies of graduate students in science and engineering, and has practical significance as a basis for a data-driven, competency-based graduate education system.

Video-based Facial Emotion Recognition using Active Shape Models and Statistical Pattern Recognizers (Active Shape Model과 통계적 패턴인식기를 이용한 얼굴 영상 기반 감정인식)

  • Jang, Gil-Jin;Jo, Ahra;Park, Jeong-Sik;Seo, Yong-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.14 no.3 / pp.139-146 / 2014
  • This paper proposes an efficient method for automatically distinguishing various facial expressions. To recognize emotions from facial expressions, facial images are obtained by digital cameras and a number of feature points are extracted. The extracted feature points are then transformed into 49-dimensional feature vectors that are robust to scale and translational variations, and the facial emotions are recognized by statistical pattern classifiers such as Naive Bayes, MLP (multi-layer perceptron), and SVM (support vector machine). In experiments with 5-fold cross-validation, SVM was the best of the classifiers, achieving 50.8% for 6-emotion classification and 78.0% for 3-emotion classification.
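
The 5-fold cross-validation mentioned in this abstract can be sketched generically as follows. The paper does not describe its splitting procedure, so the shuffling, the seed, and the `fit`/`predict` callables are assumptions; any of the classifiers named above could be plugged in through those two callables.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.
    Each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validate(fit, predict, X, y, k=5):
    """Mean test accuracy over k folds.
    fit(X_train, y_train) returns a model; predict(model, x) returns a label."""
    accuracies = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k
```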

Development of The Irregular Radial Pulse Detection Algorithm Based on Statistical Learning Model (통계적 학습 모형에 기반한 불규칙 맥파 검출 알고리즘 개발)

  • Bae, Jang-Han;Jang, Jun-Su;Ku, Boncho
    • Journal of Biomedical Engineering Research / v.41 no.5 / pp.185-194 / 2020
  • Arrhythmia is usually diagnosed from the electrocardiogram (ECG) signal; however, ECG is difficult to measure and requires expert help to analyze. The radial pulse, on the other hand, can be measured easily in daily life, making it a suitable bio-signal for the recent contactless ("untact") paradigm and one that can be extended to pulse-pattern-based diagnosis in Korean medicine. In this study, we developed an irregular radial pulse detection algorithm based on a learning model and considered its applicability to arrhythmia screening. A total of 1432 pulse waves, including irregular pulse data, were used in the experiment. Three data sets were prepared with minimal preprocessing to avoid heuristic feature extraction. Elastic net logistic regression, random forest, and extreme gradient boosting were applied to each data set as classification algorithms, and irregular pulse detection performance was estimated using the area under the receiver operating characteristic curve with 10-fold cross-validation. Extreme gradient boosting outperformed the other methods, with classification accuracy reaching 99.7%. The results confirmed that the proposed algorithm could be used for arrhythmia screening. To build a fusion technology integrating Western and Korean medicine, future research will need to address arrhythmia subtype classification from the perspective of Korean medicine.
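
The area under the ROC curve used as the performance measure here equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that pairwise definition; the function name and the 0/1 label encoding are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.
    labels are 1 (positive, e.g. irregular pulse) or 0 (negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This quadratic-time form is only meant to make the definition concrete; a sort-based version would be preferable for large data sets.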

Recent Research Trends of Process Monitoring Technology: State-of-the Art (공정 모니터링 기술의 최근 연구 동향)

  • Yoo, ChangKyoo;Choi, Sang Wook;Lee, In-Beum
    • Korean Chemical Engineering Research / v.46 no.2 / pp.233-247 / 2008
  • Process monitoring technology detects faults and process changes that occur unpredictably in a process, making it possible to find the causes of the faults and eliminate them, which results in stable process operation and high-quality products. Data-based statistical process monitoring has the main merit that it can easily supervise a process using statistics and can be used to analyze process data, provided that high-quality data are available. However, because a real process has inherent characteristics such as nonlinearity, non-Gaussianity, multiple operation modes, sensor faults, and process changes, conventional multivariate statistical process monitoring methods yield inefficient results, degraded supervision performance, or often unreliable monitoring results. Because these disadvantages make it difficult for conventional methods to supervise the process properly, several advanced monitoring methods have been developed recently. This review introduces the theories and application results of several notable monitoring methods that address the weak points of the conventional methods: nonlinear monitoring with kernel principal component analysis (KPCA), an adaptive model for process change, a mixture model for multiple operation modes, and sensor fault detection and reconstruction.

New Calibration Methods with Asymmetric Data

  • Kim, Sung-Su
    • The Korean Journal of Applied Statistics / v.23 no.4 / pp.759-765 / 2010
  • In this paper, two new inverse regression methods are introduced: one distance-based and the other likelihood-based. In the classical and inverse methods, a model is fitted by minimizing the sum of squared prediction errors of the y's and the x's, respectively, whereas the new distance-based method minimizes the sum of both squared prediction errors simultaneously. In the likelihood-based method, we propose an inverse regression with an Arnold-Beaver Skew Normal (ABSN) error distribution. Using cross-validation with an asymmetric real data set, the two new and two existing methods are compared based on the relative prediction bias (RBP) criterion.
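
To make the distinction between the two existing calibration methods concrete, the sketch below fits both by ordinary least squares: the classical method regresses y on x and inverts the fitted line, while the inverse method regresses x on y and predicts directly. It covers only these two baseline methods, not the paper's new distance-based or ABSN-based methods, and the function names are illustrative.

```python
def ols(u, v):
    """Least-squares fit of v = intercept + slope * u; returns (intercept, slope)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sxx = sum((a - mean_u) ** 2 for a in u)
    slope = sxy / sxx
    return mean_v - slope * mean_u, slope

def classical_calibration(x, y, y_new):
    """Minimize squared errors in y, then solve the fitted line for x."""
    intercept, slope = ols(x, y)
    return (y_new - intercept) / slope

def inverse_calibration(x, y, y_new):
    """Minimize squared errors in x by regressing x on y directly."""
    intercept, slope = ols(y, x)
    return intercept + slope * y_new
```

On noise-free linear data the two estimates coincide; with noisy data they generally differ, which is what motivates comparing calibration methods by prediction bias.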

An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain

  • Park, Hyeoun-Ae
    • Journal of Korean Academy of Nursing / v.43 no.2 / pp.154-164 / 2013
  • Purpose: The purpose of this article is twofold: 1) to introduce logistic regression (LR), a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, and 2) to examine the use and reporting of LR in the nursing literature. Methods: Textbooks on LR and research articles employing LR as the main statistical analysis were reviewed. Twenty-three articles published between 2010 and 2011 in the Journal of Korean Academy of Nursing were analyzed for proper use and reporting of LR models. Results: Logistic regression was presented from basic concepts such as odds, odds ratio, logit transformation, and the logistic curve through assumptions, fitting, reporting, and interpretation to cautions. Substantial shortcomings were found in both the use of LR and the reporting of results. In many studies, the sample size was not sufficiently large, which calls into question the accuracy of the regression model. Additionally, only one study reported a validation analysis. Conclusion: Nursing researchers need to pay greater attention to guidelines concerning the use and reporting of LR models.
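
The basic quantities listed in this abstract (odds, odds ratio, logit transformation, logistic curve) can be written out in a few lines. The sketch below uses the standard textbook definitions; the function names are illustrative and are not taken from the article.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)

def logit(p):
    """Logit transformation: the natural log of the odds."""
    return math.log(odds(p))

def logistic(z):
    """Logistic curve, the inverse of the logit."""
    return 1 / (1 + math.exp(-z))

def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds in two groups."""
    return odds(p_exposed) / odds(p_unexposed)

def coefficient_to_odds_ratio(beta):
    """In a fitted LR model, exp(beta) is the odds ratio associated with
    a one-unit increase in that predictor, holding the others fixed."""
    return math.exp(beta)
```

For instance, a probability of 0.8 corresponds to odds of 4, and compared with a group at probability 0.5 (odds of 1) the odds ratio is 4.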

Application of Response Surface Methodology for Optimization of Lactic Acid Production Using Date Juice

  • Chauhan Kishor;Trivedi Ujjval;Patel K.C.
    • Journal of Microbiology and Biotechnology / v.16 no.9 / pp.1410-1415 / 2006
  • Media components, including date juice, sodium acetate, peptone, and $K_{2}HPO_4$, which were screened by Plackett-Burman fractional factorial design, were optimized for lactic acid production from date juice using the response surface method (RSM). Sodium acetate, peptone (p<0.0001), and $K_{2}HPO_4$ (p=0.0029) were highly significant in influencing lactic acid production. A close correlation between predicted and experimental values was observed. When the optimum values of the parameters obtained through RSM (25.0 g/l date sugar, 15.0 g/l sodium acetate, 19.1 g/l peptone, and 4.7 g/l $K_{2}HPO_4$) were applied, lactic acid production (22.7 g/l) increased by 50.33% compared with unoptimized media (15.1 g/l). The subsequent validation experiments confirmed the validity of the statistical model.
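
The reported improvement follows directly from the two production figures quoted in the abstract, taking the unoptimized medium as the baseline:

$$
\frac{22.7 - 15.1}{15.1} \times 100\% = \frac{7.6}{15.1} \times 100\% \approx 50.33\%
$$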

Corporate credit rating prediction using support vector machines

  • Lee, Yong-Chan
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.11a / pp.571-578 / 2005
  • Corporate credit rating analysis has drawn considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the corporate credit rating problem in an attempt to suggest a new model with better explanatory power and stability. To this end, the researcher uses a grid search with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of the SVM, the researcher compares its performance with that of multiple discriminant analysis (MDA), case-based reasoning (CBR), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that the SVM outperforms the other methods.
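
The grid-search procedure mentioned in this abstract scores every candidate parameter value by cross-validation and keeps the best one. The sketch below shows that loop under loud assumptions: to stay self-contained it swaps the SVM for a one-dimensional k-nearest-neighbour classifier, so the tuned parameter is the number of neighbours rather than a kernel parameter, and all names are hypothetical.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Stand-in classifier (not an SVM): majority label of the k nearest points."""
    nearest = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(X, y, k_neighbors, n_folds=5):
    """Accuracy pooled over all held-out samples in n-fold cross-validation."""
    folds = [list(range(i, len(X), n_folds)) for i in range(n_folds)]
    correct = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i] for i in test
        )
    return correct / len(X)

def grid_search(X, y, grid, n_folds=5):
    """Score every candidate value and return (best_value, all_scores)."""
    scores = {value: cv_accuracy(X, y, value, n_folds) for value in grid}
    best = max(scores, key=scores.get)
    return best, scores
```

With an actual SVM, the grid would range over combinations of its kernel parameters, and each grid point would be scored in the same way.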
