• Title/Abstract/Keywords: Bayesian Classification

253 search results (processing time: 0.027 s)

An Approach to Detect Spam E-mail with Abnormal Character Composition

  • 이호섭;조재익;정만현;문종섭
    • 정보보호학회논문지 / Vol. 18, No. 6A / pp.129-137 / 2008
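  • As Internet usage has grown, spam has come to account for an ever larger share of all e-mail. Across total Internet resources, mail sent mainly to spread advertisements or malicious code is growing relative to mail sent out of genuine need, and the computer, network, and human resources consumed in blocking it have become a serious burden. Spam filtering has therefore been studied actively, and current work focuses on text that carries no contextual meaning but that a human reader can still interpret. Such mail is difficult to classify with existing spam filters that analyze vocabulary or apply document classification techniques. To address this difficulty, this study proposes a method that indexes mail subjects as N-grams and filters spam using Bayesian and SVM classifiers.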
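The idea in this abstract, indexing subject lines as character N-grams and feeding them to a Bayesian classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the bigram size, the Laplace smoothing, the class name `NgramNaiveBayes`, and the toy subject lines are all assumptions made for illustration, and the SVM variant is not shown.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of a subject line."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with Laplace smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.gram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()              # label -> number of subjects
        self.vocab = set()

    def fit(self, subjects, labels):
        for subject, label in zip(subjects, labels):
            grams = char_ngrams(subject, self.n)
            self.gram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, subject):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            label_total = sum(self.gram_counts[label].values())
            for gram in char_ngrams(subject, self.n):
                count = self.gram_counts[label][gram]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training subjects (not data from the paper).
train_subjects = ["v1agra ch3ap 0ffer", "fr33 m0ney n0w", "w1n ca$h pr1ze",
                  "meeting agenda for monday", "quarterly report attached",
                  "lunch plans this week"]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
model = NgramNaiveBayes(n=2).fit(train_subjects, train_labels)
print(model.predict("ch3ap v1agra"))
print(model.predict("agenda for the report"))
```

Character N-grams are used here rather than words because obfuscated spellings such as "v1agra" share most of their character sequences with other obfuscated spam even when no dictionary word matches.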

Differentiation among stability regimes of alumina-water nanofluids using smart classifiers

  • Daryayehsalameh, Bahador;Ayari, Mohamed Arselene;Tounsi, Abdelouahed;Khandakar, Amith;Vaferi, Behzad
    • Advances in nano research / Vol. 12, No. 5 / pp.489-499 / 2022
  • Nanofluids have recently attracted substantial scientific interest as cooling media. However, maintaining their stability remains a challenge for successful use in industrial applications. Many factors, including temperature, nanoparticle and base-fluid characteristics, pH, ultrasonic power and frequency, agitation time, and surfactant type and concentration, determine the nanofluid stability regime. Indeed, it is often too complicated, or even impossible, to accurately find the conditions that yield a stabilized nanofluid. Furthermore, there are no empirical, semi-empirical, or even intelligent approaches for anticipating the stability of nanofluids. Therefore, this study introduces a straightforward and reliable intelligent classifier for discriminating among the stability regimes of alumina-water nanofluids based on the Zeta potential margins. To this end, various intelligent classifiers (i.e., deep learning and multilayer perceptron neural networks, decision tree, GoogleNet, and multi-output least squares support vector regression) were designed, and their classification accuracies were compared. This comparison confirmed that the multilayer perceptron neural network (MLPNN) with the SoftMax activation function, trained by the Bayesian regularization algorithm, is the best classifier for the considered task. This classifier correctly detects the stability regimes of more than 90% of 345 different nanofluid samples, achieving an overall classification accuracy of 90.1% and a misclassification rate of 9.9%. This research is the first attempt at anticipating the stability of alumina-water nanofluids from easily measured independent variables.

Pruning and Learning Fuzzy Rule-Based Classifier

  • Kim, Do-Wan;Park, Jin-Bae;Joo, Young-Hoon
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 ICCAS 2004 / pp.663-667 / 2004
  • This paper presents new pruning and learning methods for the fuzzy rule-based classifier. The structure of the proposed classifier is built from the fuzzy sets in the premise part of each rule and the Bayesian classifier in the consequent part. To simplify the model structure, the features that are unnecessary for each fuzzy rule are eliminated by an iterative pruning algorithm. The quality of a feature is measured by the proposed correctness method, which is defined as the ratio of the fuzzy values for the set of feature values in the decision region to that for all feature values. To improve classification performance, the parameters of the proposed classifier are finely adjusted by the gradient descent method so that misclassified feature vectors are correctly re-categorized. The cost function is the squared error between the classifier output for the correct class and the sum of the maximum output for the remaining classes and a positive scalar. The learning rules are then derived by forming the gradient. Finally, the fuzzy rule-based classifier is tested on two data sets and is found to demonstrate excellent performance.
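As a reading aid, the cost function described in this abstract might be written as follows. The notation is an assumption introduced here and is not taken from the paper: $y_k(\mathbf{x})$ is the classifier output for class $k$, $c$ is the correct class, $\varepsilon > 0$ is the positive scalar, $\theta$ is any tunable parameter, and $\eta$ is a learning rate. The factor $\tfrac{1}{2}$ is a common convention and is also assumed.

```latex
E(\mathbf{x}) = \frac{1}{2}\Bigl( y_c(\mathbf{x}) - \bigl( \max_{k \neq c} y_k(\mathbf{x}) + \varepsilon \bigr) \Bigr)^{2},
\qquad
\theta \leftarrow \theta - \eta \, \frac{\partial E}{\partial \theta}
```

Under this reading, minimizing $E$ for a misclassified vector pushes the correct-class output toward a target that sits a margin $\varepsilon$ above the strongest competing class.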

Emotion Classification Using EEG Spectrum Analysis and Bayesian Approach

  • 정성엽;윤현중
    • 산업경영시스템학회지 / Vol. 37, No. 1 / pp.1-8 / 2014
  • This paper proposes an emotion classifier for EEG signals based on Bayes' theorem and machine learning with a perceptron convergence algorithm. Emotions are represented on the valence and arousal dimensions. Fast Fourier transform spectrum analysis is used to extract features from the EEG signals. To verify the proposed method, we use an open database for emotion analysis using physiological signals (DEAP) and compare the method with C-SVC, which is one of the support vector machines. Emotions are defined as two-level and three-level classes in both the valence and arousal dimensions. For the two-level case, the accuracy of the valence and arousal estimation is 67% and 66%, respectively. For the three-level case, the accuracy is 53% and 51%, respectively. Compared with the best case of the C-SVC, the proposed classifier gave 4% and 8% more accurate estimations of valence and arousal for the two-level class. For the three-level class, the proposed method showed performance similar to the best case of the C-SVC.
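The feature-extraction step mentioned in this abstract (spectrum analysis of EEG signals) can be sketched as band-power computation. This is only an illustration under stated assumptions: the band edges in `BANDS`, the 128 Hz sampling rate, and the synthetic signal are chosen here and do not come from the paper, and a plain DFT stands in for the FFT so that the example has no dependencies.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Return (frequencies, power) for the non-negative DFT bins of a real signal.

    A plain O(N^2) DFT keeps the sketch dependency-free; an FFT yields
    the same values faster.
    """
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_powers(signal, fs, bands):
    """Sum spectral power inside each named [low, high) frequency band."""
    freqs, power = power_spectrum(signal, fs)
    return {name: sum(p for f, p in zip(freqs, power) if low <= f < high)
            for name, (low, high) in bands.items()}


# Conventional EEG band edges in Hz (an assumption, not taken from the paper).
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

fs = 128  # sampling rate in Hz, chosen for this toy example
# Synthetic one-second signal: a strong 10 Hz component plus a weak 20 Hz one.
signal = [math.sin(2 * math.pi * 10 * t / fs)
          + 0.2 * math.sin(2 * math.pi * 20 * t / fs)
          for t in range(fs)]
features = band_powers(signal, fs, BANDS)
print(max(features, key=features.get))
```

A vector of such band powers per channel is the kind of input a Bayesian or perceptron-based classifier could then consume.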

Towards Effective Analysis and Tracking of Mozilla and Eclipse Defects using Machine Learning Models based on Bugs Data

  • Hassan, Zohaib;Iqbal, Naeem;Zaman, Abnash
    • Soft Computing and Machine Intelligence / Vol. 1, No. 1 / pp.1-10 / 2021
  • Analysis and tracking of bug reports is a challenging field in software repository mining. It is one of the fundamental ways to explore the large amount of data acquired from defect tracking systems and to discover patterns and valuable knowledge about the process of bug triaging. Furthermore, bug data is publicly accessible from systems such as Bugzilla and JIRA. Moreover, with robust machine learning (ML) techniques, it is possible to process and analyze a massive amount of data to extract underlying patterns, knowledge, and insights. Therefore, it is an interesting area in which to propose innovative and robust solutions for analyzing and tracking bug reports originating from different open source projects, including Mozilla and Eclipse. This research study presents an ML-based classification model to analyze and track bug defects for enhancing software engineering management (SEM) processes. In this work, Artificial Neural Network (ANN) and Naive Bayesian (NB) classifiers are implemented using open-source bug datasets, such as Mozilla and Eclipse. Different evaluation measures are employed to analyze and evaluate the experimental results, and a comparative analysis of ANN and NB is given. The experimental results indicate that the ANN achieved higher accuracy than the NB. The proposed research study will enhance SEM processes and contribute to the body of knowledge of the data mining field.

Microblogging Sentiment Investor, Return and Volatility in the COVID-19 Era: Indonesian Stock Exchange

  • FARISKA, Putri;NUGRAHA, Nugraha;PUTERA, Ika;ROHANDI, Mochamad Malik Akbar
    • The Journal of Asian Finance, Economics and Business / Vol. 8, No. 3 / pp.61-67 / 2021
  • The COVID-19 pandemic caused the most extensive economic shocks the world has experienced in decades. Maintaining financial performance and economic stability is essential during the pandemic. Under these conditions, where movement is severely restricted, media consumption is considered to be increasing. Social media is one of the online media that the public, including individual investors in the capital market, uses as a source of information and as a place to express sentiment. Twitter is one of the microblogging platforms that individual investors use to share opinions and obtain information. This study aims to determine whether microblogging investor sentiment can predict the capital market during a pandemic. To analyze this sentiment, we classified posts from November 2019 to November 2020 as positive, negative, or neutral using a Python text mining algorithm and Naïve Bayesian text classification. The study covered 68 companies listed on the Indonesia Stock Exchange. Vector autoregression and impulse response analysis are applied to capture short- and long-term impacts along with causal relationships. We found that microblogging investor sentiment has a significant impact on stock returns and volatility, and vice versa. Also, the response to shocks is convergent, and microblogging investors in Indonesia are categorized as "news-watcher" investors.

Development of newly recruited privates on-the-job Training Achievements Group Classification Model

  • 곽기효;서용무
    • 한국국방경영분석학회지 / Vol. 33, No. 2 / pp.101-113 / 2007
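  • Under the "Act on National Defense Reform" announced by the Ministry of National Defense, the service period of active-duty soldiers is scheduled to be shortened in stages by 2014. Accordingly, the Army is applying "differentiated education" to trainees as part of a more efficient job training plan. To improve the effect of this differentiated education, it is very important to predict trainees' expected academic achievement in advance so that each achievement group follows a differentiated curriculum. This study therefore developed a model that predicts recruits' expected training achievement group using only the limited data available at the start of training. The target variable of the model is "achievement group", which takes two values: "general management personnel" and "intensive management personnel". The performance of eight models in total was compared: four pure models (a neural network, a decision tree, an SVM, and a Naive Bayesian model) and four hybrid models that combine each pure model with k-means clustering. The experiments showed that the hybrid of k-means clustering and the neural network had the best predictive power. Such a training achievement group prediction model is expected to be used effectively in various military education programs in the future.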

Bayesian Survival Analysis of High-Dimensional Microarray Data for Mantle Cell Lymphoma Patients

  • Moslemi, Azam;Mahjub, Hossein;Saidijam, Massoud;Poorolajal, Jalal;Soltanian, Ali Reza
    • Asian Pacific Journal of Cancer Prevention / Vol. 17, No. 1 / pp.95-100 / 2016
  • Background: Survival time of lymphoma patients can be estimated with the help of microarray technology. In this study, the survival time of Mantle Cell Lymphoma (MCL) patients was estimated with the iterative Bayesian Model Averaging (BMA) method, and based on the findings, patients were divided into high-risk and low-risk groups. Materials and Methods: Gene expression data of MCL patients were used to select a subset of genes for survival analysis with microarray data, using the iterative BMA method. To evaluate the performance of the method, patients were divided into high-risk and low-risk groups based on their scores. Predictive performance was investigated using the log-rank test. The Bioconductor package "iterativeBMAsurv" was applied with R statistical software for classification and survival analysis. Results: In this study, 25 genes associated with survival for MCL patients were identified across 132 selected models. The maximum likelihood estimate coefficients of the selected genes and the posterior probabilities of the selected models were obtained from training data. Using this method, patients could be separated into high-risk and low-risk groups with high significance (p<0.001). Conclusions: The iterative BMA algorithm has high precision and ability for survival analysis. This method is capable of identifying a few predictive variables associated with survival among the many variables in a set of microarray data. Therefore, it can be used as a low-cost diagnostic tool in clinical research.

Social Commerce Food Coupon Recommending System Based On Context Information Using Bayesian Network

  • 정현주;이상용
    • 디지털융복합연구 / Vol. 11, No. 3 / pp.389-395 / 2013
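  • Food and beverage coupons have recently been traded actively through social commerce built on SNS. Buying and using such coupons lets users obtain the products they want at a discount, but buyers must compare store location, valid period and hours, discount rate, and other factors for themselves when choosing a coupon. This paper therefore proposes a system that recommends suitable social commerce food and beverage coupons by considering context information such as the user's location, the time, and the purchase history. To this end, a Bayesian network-based coupon recommendation method is proposed that recognizes the user's context and continuously reflects the user's tendencies. In addition, to reflect personalized weights for the coupon selection criteria the user prefers, AHP is used to measure preference weights and perform classification. To verify the efficiency of the system, experiments were conducted 20 times over one month with 12 students, and a recommendation satisfaction of 80% was obtained.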
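The AHP weighting step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's system: it uses the row geometric-mean approximation of the AHP priority vector, and the three criteria, the pairwise comparison values, and the coupon scores are hypothetical. The Bayesian network part of the recommender is not shown.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j, with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def score_coupon(features, weights):
    """Weighted sum of criterion scores, each already scaled to [0, 1]."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical criteria: distance, discount rate, remaining validity.
pairwise = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)

# Made-up coupons with per-criterion scores in [0, 1].
coupons = {"cafe": [0.9, 0.3, 0.5], "pizza": [0.4, 0.8, 0.9]}
ranked = sorted(coupons,
                key=lambda name: score_coupon(coupons[name], weights),
                reverse=True)
print(weights, ranked)
```

For a perfectly consistent comparison matrix like the one above, the geometric-mean weights coincide with the principal-eigenvector weights that AHP normally uses.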

Investigating the Performance of Bayesian-based Feature Selection and Classification Approach to Social Media Sentiment Analysis

  • 강창민;어균선;이건창
    • 경영정보학연구 / Vol. 24, No. 1 / pp.1-19 / 2022
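  • Sentiment analysis, which analyzes the emotions hidden in the online reviews that users post on social media, is attracting much attention as social media spreads. To approach sentiment analysis differently from existing studies, this study proposes a sentiment analysis model based on Bayesian networks. The model uses MBFS (Markov Blanket-based Feature Selection) as its feature selection technique. Review data from the social media site Yelp were used to demonstrate the performance of MBFS empirically. Correlation-based feature selection, information gain feature selection, and gain ratio feature selection were used as benchmark feature selection techniques. Classification performance was then compared using four machine learning algorithms on top of these feature selection methods. Furthermore, a what-if analysis was conducted with a Bayesian network to examine the causal relationships among the features selected by MBFS. The machine learning classifiers chosen in this study are the Bayesian network-based TAN (Tree Augmented Naive Bayes), NB (Naive Bayes), S-Spouses (Sons & Spouses), and A-markov (Augmented Markov Blanket). The performance analysis showed that the proposed MBFS method outperformed the benchmark methods in terms of accuracy, precision, and F1 score.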
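The three evaluation measures named at the end of this abstract (accuracy, precision, F1 score) can be computed as in the sketch below. The function name `binary_metrics` and the label vectors are made up for illustration and are unrelated to the Yelp results reported in the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Comparing feature selection methods on these measures means holding the classifier fixed and recomputing the metrics for each selected feature subset.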