• Title/Summary/Keyword: Classification Variables

Classification of High Dimensionality Data through Feature Selection Using Markov Blanket

  • Lee, Junghye;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.14 no.2 / pp.210-219 / 2015
  • A classification task requires computation time and a number of observations that grow exponentially as the variable dimensionality increases. Thus, reducing the dimensionality of the data is essential when the number of observations is limited. Dimensionality reduction or feature selection often yields better classification performance than using the full set of features. In this paper, we study the possibility of using Markov blanket discovery algorithms as a new feature selection method. The Markov blanket of a target variable is the minimal set of variables that explains the target variable, defined through the conditional independence relations among the variables connected in a Bayesian network. We apply several Markov blanket discovery algorithms to high-dimensional categorical and continuous data sets and compare their classification performance with that of other feature selection methods using well-known classifiers.
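
The Markov blanket definition in the abstract above can be illustrated with a minimal sketch. It assumes the Bayesian network structure is already known, whereas the paper discovers the blanket from data with conditional independence tests, so this is not the authors' algorithm. The network `HYPOTHETICAL_DAG` and the function name `markov_blanket` are made up for illustration. In a known network, the blanket of a target consists of its parents, its children, and the other parents of its children.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the parents, children, and co-parents (spouses) of `target`.

    `parents` maps every node of a DAG to the set of its parent nodes.
    """
    blanket = set(parents.get(target, set()))  # parents of the target
    for node, node_parents in parents.items():
        if target in node_parents:
            blanket.add(node)        # a child of the target
            blanket |= node_parents  # the child's other parents (spouses)
    blanket.discard(target)
    return blanket


# Hypothetical network (an assumption, not taken from the paper):
# A -> T, T -> C, S -> C, A -> X, and N is isolated.
HYPOTHETICAL_DAG: Dict[str, Set[str]] = {
    "A": set(),
    "S": set(),
    "T": {"A"},
    "C": {"T", "S"},
    "X": {"A"},
    "N": set(),
}

selected_features = markov_blanket(HYPOTHETICAL_DAG, "T")
print(sorted(selected_features))  # ['A', 'C', 'S']
```

Here `X` and `N` are left out: given `A`, `C`, and `S`, they carry no further information about `T`, which is why the blanket can serve as a selected feature set.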

A Study on Determination of Range of Hazardous Area Caused by the Secondary Grade of Release of Vapor Substances Considering Material Characteristic and Operating Condition (물질특성 및 운전조건을 고려한 증기상 물질의 2차 누출에 따른 폭발위험장소 범위 선정에 관한 연구)

  • Seo, Minsu;Kim, Kisug;Hwang, Yongwoo;Chon, Youngwoo
    • Journal of the Korean Institute of Gas / v.22 no.4 / pp.13-26 / 2018
  • Currently, local regulations such as the KS Code do not clearly specify how to calculate the range of a hazardous area, so a dispersion modeling program must be used to determine it. The purpose of this study is to present a methodology for determining the range of a hazardous area that is simpler and more reasonable than modeling, using representative materials and process conditions. Based on domestic and overseas regulations currently in effect, variables affecting the distance to the LFL (Lower Flammable Limit) were selected. A total of 16 flammable substances were modeled across substance, process condition, and weather condition variables, and statistical analysis identified the variables that affect the distance. Using the selected variables, a three-step classification method was prepared to determine the range of locations subject to explosion hazard.

A Study on the Validity of the Questionnaire about Sasang Constitution Classification for Mongolians (몽고인(蒙古人)을 위한 사상체질분류검사지(四象體質分類檢査紙)의 타당화(妥當化) 연구(硏究))

  • Kim, Kyung-Su;Lee, Su-Kyung;Shin, Hyeun-Kyoo;Koh, Byung-Hee;Song, Il-Byung;Lee, Eui-Ju
    • Journal of Sasang Constitutional Medicine / v.19 no.1 / pp.98-115 / 2007
  • 1. Objectives: This study examines the validity of the Questionnaire for Sasang Constitution Classification for Mongolians. 2. Methods: Variables were chosen by backward elimination from the 438 cases whose physical constitutions had been definitively diagnosed. Discriminant analysis was then performed on the selected variables to obtain the constitution equation and the diagnostic accuracy, which are useful for constitution diagnosis. 3. Results and Conclusions: (1) In the validation of the questionnaire, the diagnostic accuracy was 100% for Taeyangin, 62.5% for Soyangin, 76.7% for Taeumin, and 66.1% for Soeumin, as a result of the discriminant analysis employing Cronbach's alpha coefficient. Overall, the diagnostic accuracy was 70.1%. (2) The overall diagnostic accuracy of 70.1% exceeds the maximum chance criterion of 41.4% and the proportional chance criterion of 34.4% by 28.7 and 35.7 percentage points, respectively. In conclusion, this questionnaire has discriminant power.
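
The two chance criteria mentioned in this abstract follow standard definitions in discriminant analysis: the maximum chance criterion is the share of the largest group, and the proportional chance criterion is the sum of squared group shares. The sketch below uses made-up group sizes (`hypothetical_group_sizes`), because the abstract does not report the per-constitution counts. Only the margin check at the end uses the abstract's own figures.

```python
from typing import Sequence


def maximum_chance_criterion(group_sizes: Sequence[int]) -> float:
    """Accuracy obtained by always predicting the largest group."""
    return max(group_sizes) / sum(group_sizes)


def proportional_chance_criterion(group_sizes: Sequence[int]) -> float:
    """Expected accuracy when guessing in proportion to group shares."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


# Hypothetical group sizes (assumption; NOT the study's data).
hypothetical_group_sizes = [10, 30, 40, 20]
print(maximum_chance_criterion(hypothetical_group_sizes))       # largest share
print(proportional_chance_criterion(hypothetical_group_sizes))  # sum of squared shares

# Margins over chance, using the figures reported in the abstract (percent).
overall_accuracy = 70.1
margin_over_max = round(overall_accuracy - 41.4, 1)
margin_over_proportional = round(overall_accuracy - 34.4, 1)
print(margin_over_max, margin_over_proportional)
```

The margins are differences between percentages, so they are percentage points.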

Neural-network-based Fault Detection and Diagnosis Method Using EIV(errors-in variables) (EIV를 이용한 신경회로망 기반 고장진단 방법)

  • Han, Hyung-Seob;Cho, Sang-Jin;Chong, Ui-Pil
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.21 no.11 / pp.1020-1028 / 2011
  • As rotating machines play an important role in industrial applications such as the aeronautical, naval, and automotive industries, many researchers have developed condition monitoring and fault diagnosis systems based on artificial neural networks. Because feeding raw signals into a neural network without preprocessing can degrade fault classification performance, it is very important to extract significant features from the captured signals and to supply the diagnosis system with features suited to the kind of signal. This paper therefore proposes a neural-network-based fault diagnosis system that uses AR coefficients, obtained by LPC (linear predictive coding) and EIV (errors-in-variables) analysis, as feature vectors. We extracted feature vectors from sound, vibration, and current signals of faulty machines and evaluated their suitability from the classification results and training error rates while varying the AR order and adding noise. From the experimental results, we conclude that, compared with LPC, feature vectors from EIV analysis stably yield classification results above 90% for AR orders below 10 and under added noise.
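
As a rough illustration of how AR coefficients can serve as a feature vector, the sketch below estimates them with the textbook autocorrelation method of LPC (Levinson-Durbin recursion). It does not reproduce the paper's EIV analysis or its neural network, and the input is a synthetic AR(2) signal rather than measured sound, vibration, or current data. The function names, the seed, and the true coefficients 0.6 and -0.3 are all assumptions chosen for the demo.

```python
import random
from typing import List, Tuple


def autocorrelation(signal: List[float], max_lag: int) -> List[float]:
    """Biased autocorrelation estimates r[0..max_lag] of the centered signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(signal: List[float], order: int) -> Tuple[List[float], float]:
    """Levinson-Durbin recursion; model x[t] ~ sum_k a[k] * x[t-k]."""
    r = autocorrelation(signal, order)
    coeffs = [0.0] * (order + 1)  # coeffs[k] multiplies x[t-k]; index 0 unused
    error = r[0]                  # prediction error power at order 0
    for m in range(1, order + 1):
        reflection = (r[m] - sum(coeffs[k] * r[m - k] for k in range(1, m))) / error
        updated = coeffs[:]
        updated[m] = reflection
        for k in range(1, m):
            updated[k] = coeffs[k] - reflection * coeffs[m - k]
        coeffs = updated
        error *= 1.0 - reflection * reflection
    return coeffs[1:], error


def synthetic_ar2(n: int, a1: float, a2: float, seed: int) -> List[float]:
    """Synthetic test signal x[t] = a1*x[t-1] + a2*x[t-2] + white noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


signal = synthetic_ar2(20000, 0.6, -0.3, seed=0)
features, residual_power = lpc_coefficients(signal, order=2)
print([round(c, 2) for c in features])  # should land near the true 0.6 and -0.3
```

In a diagnosis pipeline of the kind the abstract describes, a vector like `features` would be computed per signal segment and passed to the classifier.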

Predicting and Interpreting Quality of CMP Process for Semiconductor Wafers Using Machine Learning (머신러닝을 이용한 반도체 웨이퍼 평탄화 공정품질 예측 및 해석 모형 개발)

  • Ahn, Jeong-Eon;Jung, Jae-Yoon
    • The Journal of Bigdata / v.4 no.2 / pp.61-71 / 2019
  • The Chemical Mechanical Planarization (CMP) process, which planarizes a semiconductor wafer's surface by polishing, is difficult to manage reliably because it involves various chemicals and physical machinery. In the CMP process, the Material Removal Rate (MRR) is often used as a quality indicator, and predicting the MRR is important for managing the process stably. In this study, we introduce prediction models that apply machine learning techniques to time-series sensor data collected in the CMP process, as well as classification models used to interpret process quality conditions. In addition, by analyzing the classification results, we identify meaningful variables affecting process quality and explain the conditions of process variables that keep process quality high.

Malware classification using statistical techniques (통계적 기법을 이용한 악성 소프트웨어 분류)

  • Won, Sungmin;Kim, Hyunjoo;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.851-865 / 2017
  • Ransomware such as WannaCry is a global issue, and methods to defend against malware attacks are important. To minimize the damage from malware, we must be able to classify malware types efficiently. This study builds models that classify malware using various statistical techniques, including logistic regression, random forest, gradient boosting, and support vector machines. This study also helps us understand the key variables for classifying the type of malicious software.

The Effect of the Fashion Product Classification Method in Online Shopping Sites (인터넷 쇼핑몰의 패션 제품 분류 방식의 효과)

  • Han, Seo-Young;Cho, Yunjin;Lee, Yuri
    • Journal of the Korean Society of Clothing and Textiles / v.40 no.2 / pp.287-304 / 2016
  • This study examines the influence of product classification standards and structure on users' perceptions of and attitudes towards online shopping sites. The causal relationships among the variables are also examined. The analysis was based on an online survey with 247 responses. Four types of internet shopping sites were developed and used as stimuli. The results of the mean comparison analysis indicated that perceived variety, information overload, perceived shopping value, and attitude towards the site vary significantly with product classification standards and structure. There was also a marginally significant interaction between the classification standard and structure on perceived variety and information overload. The causal relationship analysis revealed that perceived variety positively influenced hedonic and utilitarian shopping value, whereas information overload had a negative effect on both. Both hedonic and utilitarian shopping value positively influenced attitudes towards the sites. This study demonstrates that the classification method influences customer perception and attitude. It offers insights into the product classification method as a strategic tool for online shopping.

A Study on the Extraction of Feature Variables for the Pattern Recognition of Welding Flaws (용접결함의 형상인식을 위한 특징변수 추출에 관한 연구)

  • Kim, Jae-Yeol;Roh, Byung-Ok;You, Sin;Kim, Chang-Hyun;Ko, Myung-Soo
    • Journal of the Korean Society for Precision Engineering / v.19 no.11 / pp.103-111 / 2002
  • In this study, natural flaws in welded parts are classified using a signal pattern classification method. A digital storage oscilloscope with an FFT function and an enveloped waveform generator is used, and the signal pattern recognition procedure consists of digital signal processing, feature extraction, feature selection, and classifier design. The classifiers constructed and discussed are a distance classifier based on Euclidean distance and an empirical Bayesian classifier. Feature extraction is performed using the class-mean scatter criteria. The signal pattern classification method is applied to the signal pattern recognition of natural flaws.
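
A distance classifier based on Euclidean distance is commonly implemented as a nearest-class-mean rule: each class is represented by the mean of its training feature vectors, and a new vector is assigned to the class whose mean is closest. The sketch below shows that rule under stated assumptions. The class labels `class_a` and `class_b` and the two-dimensional feature values are hypothetical, and the paper's actual features and classifier details are not given in the abstract.

```python
import math
from typing import Dict, List, Sequence


def class_means(samples: Dict[str, List[Sequence[float]]]) -> Dict[str, List[float]]:
    """Mean feature vector for each class label."""
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in samples.items()
    }


def classify(feature_vector: Sequence[float], means: Dict[str, List[float]]) -> str:
    """Assign the label whose class mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(feature_vector, means[label]))


# Hypothetical training features (assumption; not data from the paper).
training = {
    "class_a": [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2)],
    "class_b": [(4.0, 4.0), (4.2, 3.8), (3.8, 4.2)],
}
means = class_means(training)
print(classify((1.5, 1.1), means))  # class_a
print(classify((3.6, 4.1), means))  # class_b
```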

Typical Classification of Rural Area Considering Settlement Environment by Decision Tree Method (정주여건을 고려한 의사결정나무기법 활용 농촌지역 유형화)

  • Bae, Seung-Jong;Kim, Dae-Sik;Eun, Sang-Kyu
    • Journal of The Korean Society of Agricultural Engineers / v.58 no.6 / pp.79-92 / 2016
  • The objective of this study is to classify the types of rural areas (138 si·gun) considering settlement environment by using a decision tree method. The CHAID method was used as the decision tree algorithm, and seven dependent variables and five explanatory variables were selected. With the decision tree method, rural areas were finally classified into six groups through three separate processes. City areas, areas with a lower aging rate, and areas with a higher farmland area ratio were found to score relatively higher on the settlement environment index than other areas. In the future, this study can be used as a reference for planning rural development projects.

A shop recommendation learning with Tensorflow.js (Tensorflow.js를 활용한 상점 추천 학습)

  • Cho, Jaeyoung;Lee, Sangwon;Chung, Tai Myoung
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.267-270 / 2019
  • In this research, rating data for shops were analyzed. A model was designed for discrete multi-class classification of the data, and experiments were conducted to observe the trained model. By comparing benchmarks with different setting variables for the model, the hit ratio, which indicates how well the predictions match the expected labels, was measured. Based on the results from each benchmark, the model was redesigned once during the research, and the effect of each setting variable on the model was clarified. The results also point to future work on how the learning could be improved and what should be designed in further research.
