• Title/Summary/Keyword: Markov blanket

Classification of High Dimensionality Data through Feature Selection Using Markov Blanket

  • Lee, Junghye;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.14 no.2 / pp.210-219 / 2015
  • A classification task requires computation time and a number of observations that grow exponentially as the variable dimensionality increases. Thus, reducing the dimensionality of the data is essential when the number of observations is limited. Often, dimensionality reduction or feature selection leads to better classification performance than using the full set of features. In this paper, we study the possibility of using Markov blanket discovery algorithms as a new feature selection method. The Markov blanket of a target variable is the minimal set of variables that explains the target variable, defined through the conditional independence relations among all the variables connected in a Bayesian network. We apply several Markov blanket discovery algorithms to high-dimensional categorical and continuous data sets and compare their classification performance with that of other feature selection methods using well-known classifiers.
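As a minimal sketch of the Markov blanket concept in this abstract: when the Bayesian network structure is already known, the Markov blanket of a node consists of its parents, its children, and the other parents of its children. The function name `markov_blanket` and the toy network below are hypothetical illustrations, not the discovery algorithms evaluated in the paper, which have to estimate the blanket from data.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the parents, children, and co-parents of `target` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    """
    blanket = set(parents.get(target, set()))            # parents of the target
    children = {node for node, ps in parents.items() if target in ps}
    blanket |= children                                   # children of the target
    for child in children:
        blanket |= parents[child]                         # other parents of each child
    blanket.discard(target)                               # the target is not in its own blanket
    return blanket


# Hypothetical toy network: A -> T, T -> C, S -> C, C -> D, and X is isolated.
toy_dag = {
    "A": set(),
    "S": set(),
    "X": set(),
    "T": {"A"},
    "C": {"T", "S"},
    "D": {"C"},
}

print(sorted(markov_blanket(toy_dag, "T")))  # ['A', 'C', 'S']
```

In this toy network, D and X fall outside the blanket of T, so a feature selector based on the blanket would drop them when predicting T.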

Development of Correlation Based Feature Selection Method by Predicting the Markov Blanket for Gene Selection Analysis

  • Adi, Made;Yun, Zhen;Keong, Kwoh-Chee
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.183-187 / 2005
  • In this paper, we propose a heuristic method for selecting features using a Two-Phase Markov Blanket-based (TPMB) algorithm. The first phase of the TPMB algorithm, the filtering phase, filters out the obviously redundant features. A non-linear correlation method based on information theory is used as the metric for measuring the redundancy of a feature [1]. In the second phase, the approximating phase, the Markov Blanket (MB) of a system is estimated by using the concept of cross entropy. We perform experiments on microarray data and report results on two popular datasets, AML-ALL [3] and colon tumor [4]. The experimental results show that the TPMB algorithm can significantly reduce the number of features while maintaining the accuracy of the classifiers.

Investigating the Performance of Bayesian-based Feature Selection and Classification Approach to Social Media Sentiment Analysis (소셜미디어 감성분석을 위한 베이지안 속성 선택과 분류에 대한 연구)

  • Chang Min Kang;Kyun Sun Eo;Kun Chang Lee
    • Information Systems Review / v.24 no.1 / pp.1-19 / 2022
  • Social media-based communication has become a crucial part of our personal and professional lives. It is therefore no surprise that social media sentiment analysis has emerged as an important way for all kinds of companies to detect trends in potential customers' sentiment. However, social media sentiment analysis suffers from the huge number of sentiment features generated in the course of the analysis. To address this, this study proposes a novel method based on a Bayesian Network, in which MBFS (Markov Blanket-based Feature Selection) is used to reduce the number of sentiment features. To show the validity of the proposed model, we used online review data from Yelp, a well-known social media platform for evaluating and recommending restaurants, bars, and beauty salons. We used several benchmark feature selection methods, such as correlation-based feature selection, information gain, and gain ratio. Several machine learning classifiers were also used for the validation tasks, such as TAN, NBN, Sons & Spouses BN (Bayesian Network), and Augmented Markov Blanket. Furthermore, we conducted a Bayesian Network-based what-if analysis to see how the knowledge map between the target node and the related explanatory nodes could yield a meaningful glimpse into the sentiments underlying the target dataset.
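Two of the benchmark criteria named in this abstract, information gain and gain ratio, can be sketched for categorical data as follows. This is an illustrative sketch using the standard entropy-based definitions; the function names and the toy review data are assumptions and do not come from the study.

```python
from collections import Counter
from math import log2
from typing import Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature: Sequence[Hashable], target: Sequence[Hashable]) -> float:
    """H(target) - H(target | feature), estimated from paired samples."""
    n = len(target)
    groups: dict = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)   # target labels grouped by feature value
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional


def gain_ratio(feature: Sequence[Hashable], target: Sequence[Hashable]) -> float:
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, target) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one word that tracks the label and one that does not.
label: List[str] = ["pos", "pos", "neg", "neg"]
word_great = ["yes", "yes", "no", "no"]
word_table = ["yes", "no", "yes", "no"]

for name, feat in [("great", word_great), ("table", word_table)]:
    print(name, round(information_gain(feat, label), 3), round(gain_ratio(feat, label), 3))
```

Features would then be ranked by either score, and only the top-ranked ones passed on to the classifier.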

Bayesian Network-based Data Analysis for Diagnosing Retinal Disease (망막 질환 진단을 위한 베이지안 네트워크에 기초한 데이터 분석)

  • Kim, Hyun-Mi;Jung, Sung-Hwan
    • Journal of Korea Multimedia Society / v.16 no.3 / pp.269-280 / 2013
  • In this paper, we suggest the possibility of using an efficient classifier for the dependency analysis of retinal disease. First, among the various Bayesian networks, we analyzed the classification performance and prediction accuracy of GBN (General Bayesian Network), GBN with features reduced by the Markov Blanket, and TAN (Tree-Augmented Naive Bayesian Network). Then, for the first time, we applied TAN, which showed high performance, to the dependency analysis of clinical data on retinal disease. The analysis showed that the approach is applicable to the diagnosis and prognosis prediction of retinal disease.

Features Reduction and Baysian Networks Learning for Efficient Medical Data Mining (효율적인 의료데이터마이닝을 위한 특징축소와 레이지안망 학습)

  • 정용규;김인철
    • Proceedings of the Korea Inteligent Information System Society Conference / 2002.11a / pp.258-265 / 2002
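  • Bayesian networks are known to be more useful than conventional methods for representing knowledge and inferring conclusions under uncertainty. In this paper, we present representative Bayesian network classifiers and train different types of Bayesian network classifiers on the same clinical data. When Bayesian networks are applied, the search space for learning the network structure widens as the number of variables grows, which makes learning difficult. To reduce this search space efficiently, we propose reducing the feature set to the features that belong to the Markov blanket of the class node, and we examine experimentally whether this feature reduction method can improve the performance of Bayesian network classifiers. The experiments show that, among the classifiers, the naive Bayesian network classifier augmented with a Bayesian network learned from the reduced feature set achieves the best accuracy.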

Fast Conditional Independence-based Bayesian Classifier

  • Junior, Estevam R. Hruschka;Galvao, Sebastian D. C. de O.
    • Journal of Computing Science and Engineering / v.1 no.2 / pp.162-176 / 2007
  • Machine Learning (ML) has become very popular within Data Mining (KDD) and Artificial Intelligence (AI) research and their applications. In the ML and KDD contexts, two main approaches can be used for inducing a Bayesian Network (BN) from data, namely Conditional Independence (CI) and Heuristic Search (HS). When a BN is induced for classification purposes (Bayesian Classifier - BC), it is possible to impose specific constraints aimed at increasing computational efficiency. In this paper, a new CI-based approach to inducing BCs from data is proposed and two algorithms are presented. The approach uses the Markov Blanket concept to impose constraints and optimize the traditional PC learning algorithm. Experiments performed with ALARM, as well as six other UCI and three artificial domains, revealed that the proposed approach tends to execute fewer comparison tests than the traditional PC. The experiments also show that the proposed algorithms produce competitive classification rates when compared with both PC and Naive Bayes.
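The kind of conditional independence check that CI-based learners such as PC repeat many times can be illustrated with an empirical conditional mutual information estimate for discrete data. This is only a sketch under stated assumptions: the function names and the `threshold` value are hypothetical, and a real implementation would typically use a statistical test (for example a G-test or chi-square test) rather than a fixed cutoff. It is not the algorithm proposed in the paper.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def conditional_mutual_information(
    x: Sequence[Hashable], y: Sequence[Hashable], z: Sequence[Hashable]
) -> float:
    """Empirical I(X; Y | Z) in bits, computed from paired discrete samples."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    return sum(
        (c / n) * log2(c * n_z[zv] / (n_xz[(xv, zv)] * n_yz[(yv, zv)]))
        for (xv, yv, zv), c in n_xyz.items()
    )


def looks_independent(x, y, z, threshold: float = 0.01) -> bool:
    """Hypothetical decision rule: treat X and Y as independent given Z
    when the estimated conditional mutual information is below `threshold`."""
    return conditional_mutual_information(x, y, z) < threshold


# Toy data: X and Y are both copies of Z, so they are dependent marginally
# but independent once Z is known.
z_obs = [0, 0, 1, 1]
x_obs = [0, 0, 1, 1]
y_obs = [0, 0, 1, 1]
no_condition = [0, 0, 0, 0]  # constant column, i.e. conditioning on nothing

print(looks_independent(x_obs, y_obs, no_condition))  # False
print(looks_independent(x_obs, y_obs, z_obs))         # True
```

Restricting such checks to candidates for the Markov blanket of the class variable is one way the number of tests can be kept small.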

An Analysis on Prediction of Computer Entertainment Behavior Using Bayesian Inference (베이지안 추론을 이용한 컴퓨터 오락추구 행동 예측 분석)

  • Lee, HyeJoo;Jung, EuiHyun
    • The Journal of Korean Association of Computer Education / v.21 no.3 / pp.51-58 / 2018
  • To analyze the prediction of computer entertainment behavior, this study investigated the variables' interdependencies and their causal relations to computer entertainment behavior using Bayesian inference with the Korean Children and Youth Panel Survey data. For the study, a Markov blanket was extracted through a General Bayesian Network, and the degree of influence was investigated by changing the variables' probabilities. Results showed that computer entertainment behavior changed significantly when the values of the related variables were adjusted: school learning act, smoking, taunting, fandom, and school rule. The results suggest that Bayesian inference could be used in the educational field for predicting and adjusting adolescents' computer entertainment behavior.
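The idea of investigating influence "by changing the variables' probabilities" can be sketched with a very small network in which a binary target has two independent binary parents. Everything below is a hypothetical illustration: the variable names A and B, the prior probabilities, and the conditional probability table are invented for the example and are not taken from the survey data or the study's network.

```python
from itertools import product
from typing import Dict, Tuple


def p_target(p_a: float, p_b: float, cpt: Dict[Tuple[int, int], float]) -> float:
    """P(T = 1) for a network A -> T <- B with independent binary parents.

    `p_a` and `p_b` are P(A = 1) and P(B = 1); `cpt[(a, b)]` is P(T = 1 | A = a, B = b).
    """
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        weight = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        total += weight * cpt[(a, b)]        # marginalise over both parents
    return total


# Hypothetical conditional probability table for the target.
cpt = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9}

baseline = p_target(0.3, 0.5, cpt)   # parents at their assumed prior probabilities
what_if_a = p_target(1.0, 0.5, cpt)  # what if A is known to be 1?
what_if_b = p_target(0.3, 1.0, cpt)  # what if B is known to be 1?

print(round(baseline, 3), round(what_if_a, 3), round(what_if_b, 3))
```

Comparing each what-if value with the baseline gives a rough ranking of how sensitive the target is to each variable in its Markov blanket.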

Bayesian Network Analysis for the Dynamic Prediction of Financial Performance Using Corporate Social Responsibility Activities (베이지안 네트워크를 이용한 기업의 사회적 책임활동과 재무성과)

  • Sun, Eun-Jung
    • Management & Information Systems Review / v.34 no.5 / pp.71-92 / 2015
  • This study analyzes the impact of Corporate Social Responsibility (CSR) activities on financial performance using a Bayesian Network. The research tries to overcome the uniform assumption of a linear relationship between financial performance and CSR activities that underlies the multiple regression analysis widely used in previous studies. It is necessary to infer the causal relationships among the CSR activities that affect financial performance. Identifying these relationships would help firms improve their financial performance by informing decision makers about the different CSR activities that influence it. This research proposes a General Bayesian Network (GBN) and presents the Markov Blanket induced from the GBN. All the propositions presented in this study are empirically shown to be statistically significant using the results of research conducted by the Korean Economic Justice Institute (KEJI) under Citizen's Coalition for Economic Justice (CCEJ), which investigated approximately 200 companies in Korea based on the Korean Economic Justice Institute Index (KEJI index) from 2005 to 2011. The Bayesian Network makes it possible to effectively infer the properties affecting financial performance through probabilistic causal relationships. Moreover, I found causal relationships among the CSR activity variables: Environment protection is related to Customer protection, Employee satisfaction, and firm size; Soundness is related to Total CSR Evaluation Score and Debt-Assets Ratio. Through the what-if analysis, I suggest the sensitive factors among the explanatory variables.

An Empirical Study on the Churning Behavior through Bayesian Network Classifier and Business Process Modeling (베이지안 네트워크 분류와 비즈니스 프로세스 모델링을 통한 신용카드 회원 이탈에 관한 연구)

  • Lee, Kun-Chang;Lee, Keun-Young;Jo, Nam-Yong
    • Knowledge Management Research / v.10 no.4 / pp.1-15 / 2009
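  • In Korea, credit cards have become established as a leading means of payment, and both the number of credit card users and the number of cards issued have already reached saturation. This quantitative growth is attributable to the government's credit card promotion policy together with excessive competition among credit card companies. Credit card use has spread to most adult men and women, and competition among card companies is especially fierce to get holders of multiple credit cards to use the card that the company itself issued. As a result, it has become unavoidable for credit card companies to win over competitors' cardholders as their own members. Likewise, detecting signs of churn in advance and running retention campaigns so that current members do not move to competitors has become a major marketing activity of credit card companies. Previous studies on credit card member churn have classified the characteristics of churn using various data mining techniques. This study uses a Bayesian Network as a method for effectively discovering the factors that affect member churn. In particular, using a General Bayesian Network, which is one type of Bayesian network, we derive the Markov Blanket, the set of factors that influence member churn. We then perform a sensitivity analysis on the variables included in the Markov blanket to find the most influential factors, and apply them to a business process to demonstrate their practical significance.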

Optimal Mobility Management of PCNs Using Two Types of Cell Residence Time (이동 통신망에 있어서 새로운 셀 체류시간 모형화에 따른 최적 이동성 관리)

  • 홍정식;장인갑;이창훈
    • Journal of the Korean Operations Research and Management Science Society / v.27 no.3 / pp.59-74 / 2002
  • This study investigates two basic operations of mobility management in PCNs (Personal Communication Networks): the location update and the paging of the mobile terminal. Based on the realistic consideration that a user either moves through several cells consecutively or stays in one cell for a long time, we model the mobility pattern by introducing two types of CRT (Cell Residence Time). Mobility patterns of the mobile terminal are classified in various ways by using the ratio of the two types of CRT. Cost analysis is performed for distance-based and movement-based location update schemes combined with blanket polling paging and selective paging schemes. It is demonstrated that, under certain conditions on the mobility pattern and call arrival pattern, the 2-state CRT model produces a different optimal threshold and is therefore more effective than the IID (Independently-Identically-Distributed) CRT model. The analytical model for the new CRT model is compact and easily extendable to other location update schemes.