• Title/Summary/Keyword: naive Bayes

Object Detection and Classification Using Extended Descriptors for Video Surveillance Applications (비디오 감시 응용에서 확장된 기술자를 이용한 물체 검출과 분류)

  • Islam, Mohammad Khairul;Jahan, Farah;Min, Jae-Hong;Baek, Joong-Hwan
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.4 / pp.12-20 / 2011
  • In this paper, we propose an efficient object detection and classification algorithm for video surveillance applications. Previous research has mainly concentrated on either object detection or classification using a particular type of feature, e.g., Scale Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF). We propose an algorithm that performs object detection and classification jointly. We combine heterogeneous types of features, such as texture and color distribution from local patches, to increase object detection and classification rates. We perform object detection using spatial clustering on interest points, and use a Bag of Words model for image representation and a Naive Bayes classifier for classification. Experimental results show that our combined feature achieves a higher object classification rate than the individual local descriptors.

Project Failure Main Factors Analysis using Text Mining in Audit Evaluation (감리결과에 텍스트마이닝 기법을 적용한 프로젝트 실패 주요요인 분석)

  • Jang, Kyoungae;Jang, Seong Yong;Kim, Woo-Je
    • Journal of KIISE / v.42 no.4 / pp.468-474 / 2015
  • Because corporations need to respond quickly to rapid external changes, they should recognize the importance of projects, identify their failure factors, prevent risks in advance, and raise success rates. There are some previous studies on the success and failure factors of projects; however, most of them are limited in objectivity and quantitative analysis because they rely on data gathered through surveys, statistical sampling, and analysis. This study analyzes the failure factors of projects using data mining to find problems with projects in audit reports, which are objective project evaluation reports. To do this, we identified the texts in the paragraphs containing suggestions for improvement. We used the best-performing classification algorithms in this study, which were NaiveBayes, SMO, and J48. They were evaluated in terms of recall and precision using 10-fold cross-validation. The failure factors of projects were then analyzed in the identified texts so that they can be utilized in project implementation.

Automatic Identification of Database Workloads by using SVM Workload Classifier (SVM 워크로드 분류기를 통한 자동화된 데이터베이스 워크로드 식별)

  • Kim, So-Yeon;Roh, Hong-Chan;Park, Sang-Hyun
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.84-90 / 2010
  • A DBMS is used for a range of applications, from data warehousing to on-line transaction processing. As a result of this demand, DBMSs have continued to grow in size. This growth makes manual performance tuning of the DBMS a critical issue. DBMS tuning should adapt to the type of workload placed on the system. However, identifying workloads in mixed database applications can be quite difficult. Therefore, a method is needed for identifying workloads in a mixed database environment. In this paper, we propose an SVM workload classifier to automatically identify a DBMS workload. Database workloads are collected from the TPC-C and TPC-W benchmarks while changing the resource parameters. The parameters of the SVM workload classifier, C and the kernel parameter, were chosen experimentally. The experiments revealed that the accuracy of the proposed SVM workload classifier is about 9% higher than that of the decision tree, Naive Bayes, multilayer perceptron, and K-NN classifiers.

Discovery of User Preference in Recommendation System through Combining Collaborative Filtering and Content based Filtering (협력적 여과와 내용 기반 여과의 병합을 통한 추천 시스템에서의 사용자 선호도 발견)

  • Ko, Su-Jeong;Kim, Jin-Su;Kim, Tae-Yong;Choi, Jun-Hyeog;Lee, Jung-Hyun
    • Journal of KIISE:Computing Practices and Letters / v.7 no.6 / pp.684-695 / 2001
  • Recent recommender systems combine collaborative filtering and content-based filtering in order to solve the sparsity and first-rater problems of collaborative filtering. Collaborative filtering systems use a database of user preferences to predict additional topics. Content-based filtering systems provide recommendations by matching user interests with topic attributes. In this paper, we describe a method for discovering user preferences that combines the two recommendation techniques and allows the application of machine learning algorithms. The proposed collaborative filtering method clusters users with a genetic algorithm based on items categorized by a Naive Bayes classifier, and the content-based filtering method builds a user profile by extracting user interests using relevance feedback. We evaluate our method on a large database of user ratings for web documents, and it significantly outperforms previously proposed methods.

An Auto-blogging System based Context Model for Micro-blogging Service (마이크로 블로깅 서비스를 지원하기 위한 컨텍스트 모델 기반 자동 블로깅 시스템)

  • Park, Jae-Min;Lee, Sang-Yong
    • Journal of Digital Convergence / v.10 no.4 / pp.341-346 / 2012
  • A social network service enables human networks to be built on the web. To provide such a service effectively, it is important to record users' information simply and to establish networks with people based on that information. However, entering one's own information in a mobile environment is very troublesome for the user. In this paper, we propose a system that classifies users' behavior using context and automatically creates blogging sentences after inferring the destination. To this end, users' behavior is classified and the destination is inferred with a sequence matching method using Naive Bayes classification. Sentences suitable for the situation are then created by arranging the processed context according to the 5W1H structure. The system's degree of satisfaction was evaluated by comparing sentences created from actually collected data with users' intentions, and it achieved an accuracy rate of 88.73%.

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.395-406 / 2017
  • The purpose of this study is to identify the pattern of daily electricity demand through clustering and classification. The hourly data was collected by KPX (Korea Power Exchange) between 2008 and 2012. Because electricity demand data is time series data, the time trend was removed before identifying the pattern of daily electricity demand. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. The classification analysis was conducted to understand the relationship between the clusters and external factors: day of the week, holiday, and weather. Data was divided into training data and test data. Training data consisted of external factors and cluster numbers between 2008 and 2011. Test data was daily data of external factors in 2012. Decision tree, random forest, support vector machine, and Naive Bayes classifiers were used. As a result, Gaussian model-based clustering and random forest showed the best prediction performance when the number of clusters was 8.

Effective Korean sentiment classification method using word2vec and ensemble classifier (Word2vec과 앙상블 분류기를 사용한 효율적 한국어 감성 분류 방안)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Digital Contents Society / v.19 no.1 / pp.133-140 / 2018
  • Accurate sentiment classification is an important research topic in sentiment analysis. This study suggests an efficient method for Korean sentiment classification using word2vec and ensemble methods, both of which have been widely studied recently. For 200,000 Korean movie review texts, we generated a POS-based BOW feature, a word2vec feature, and an integrated feature combining the two representations. For sentiment classification, we used the single classifiers Logistic Regression, Decision Tree, Naive Bayes, and Support Vector Machine and the ensemble classifiers Adaptive Boost, Bagging, Gradient Boosting, and Random Forest. The integrated feature representation, composed of the BOW feature including adjectives and adverbs and the word2vec feature, showed the highest sentiment classification accuracy. Empirical results show that SVM, a single classifier, has the highest performance, while the ensemble classifiers show similar or slightly lower performance than the single classifier.

Prediction model of peptic ulcer diseases in middle-aged and elderly adults based on machine learning (머신러닝 기반 중노년층의 기능성 위장장애 예측 모델 구현)

  • Lee, Bum Ju
    • The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.289-294 / 2020
  • Peptic ulcer disease is a gastrointestinal disorder caused by Helicobacter pylori infection and the use of nonsteroidal anti-inflammatory drugs. While many studies have been conducted to find the risk factors of peptic ulcers, no studies have proposed peptic ulcer prediction models for Koreans. Therefore, the purpose of this study is to implement a peptic ulcer prediction model for middle-aged and elderly people using machine learning based on demographic, obesity, blood, and nutritional information. For model building, a wrapper-based variable selection method and the naive Bayes algorithm were used. The female prediction model achieved an area under the receiver operating characteristic curve (AUC) of 0.712, and the male model showed an AUC of 0.674, which is lower than that of the female model. These results can be used for the prediction and prevention of peptic ulcers in middle-aged and elderly people.

Feature Extraction of Web Document using Association Word Mining (연관 단어 마이닝을 사용한 웹문서의 특징 추출)

  • Ko, Su-Jeong;Choi, Jun-Hyeog;Lee, Jung-Hyun
    • Journal of KIISE:Databases / v.30 no.4 / pp.351-361 / 2003
  • Previous studies that extract document features through word association have the problems of updating profiles periodically, dealing with noun phrases, and calculating the probability for indices. We propose a more effective feature extraction method that uses association word mining. Using the Apriori algorithm, the association word mining method represents a document feature not as single words but as association-word vectors. The association words extracted from a document by the Apriori algorithm depend on confidence, support, and the number of words composing them. This paper proposes an effective method to determine the confidence, the support, and the number of words composing association words. Since the feature extraction method using association word mining does not use a profile, it need not update one, and it automatically generates noun phrases using confidence and support in the Apriori algorithm without calculating the probability for an index. We apply the proposed method to document classification using a Naive Bayes classifier and compare it with the information gain and TF-IDF methods. In addition, we compare the proposed method with document classification methods that use index association and word association based on a probability model, respectively.

Predicting Bug Severity by utilizing Topic Model and Bug Report Meta-Field (토픽 모델과 버그 리포트 메타 필드를 이용한 버그 심각도 예측 방법)

  • Yang, Geunseok;Lee, Byungjeong
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.616-621 / 2015
  • Recently developed software systems have many components, and their complexity is thus increasing. Last year, about 375 bug reports per day were submitted to the software repositories of the Eclipse and Mozilla open source projects. With so many bug reports submitted, developers' time and effort have increased unnecessarily. Since bug severity is determined manually by quality assurance staff, project managers, or other developers in the general bug fixing process, it is subject to their bias. They might also make mistakes in these manual decisions because of the large number of bug reports. Therefore, in this study, we propose an approach to bug severity prediction to solve these problems. First, we find similar topics within a new bug report and reduce the candidate reports of the topic by using the meta field of the bug report. Next, we train on the reduced set of reports by applying Naive Bayes Multinomial. Finally, we predict the severity of the new bug report. We compare our approach with other prediction algorithms using bug reports from open source projects. The results show that our approach predicts bug severity better than the other algorithms.