• Title/Summary/Keyword: Naive Bayesian (나이브 베이지안)

Performance Evaluation on the Learning Algorithm for Automatic Classification of Q&A Documents (고객 질의 문서 자동 분류를 위한 학습 알고리즘 성능 평가)

  • Choi Jung-Min;Lee Byoung-Soo
    • The KIPS Transactions: Part D / v.13D no.1 s.104 / pp.133-138 / 2006
  • Electronic commerce, which has moved beyond traditional commerce, is now driving change in how enterprises are managed. To establish and maintain good relationships with customers, electronic commerce offers various channels for understanding what customers want and suggesting it to them. Among these, the bulletin board and e-mail are inbound channels through which enterprises can listen to customers' opinions directly, which sets them apart from other channels. Enterprises can manage the bulletin board and e-mail effectively by understanding as many of their customers' ideas as possible and providing them with the best answers. This is an important factor in improving the reliability of the bulletin board and e-mail, and of electronic commerce as a whole. This paper therefore studies methods for automatically classifying the various kinds of documents that arise in electronic commerce, which can address existing problems of the bulletin board and e-mail and allow them to be operated effectively and managed systematically. It also investigates which algorithm is most suitable for the automatic classification of Q&A documents by experimentally comparing the classification performance of Naive Bayesian, TFIDF, Neural Network, k-NN
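To make the Naive Bayesian baseline named in this abstract concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the function names, the whitespace tokenization, and the toy "shipping"/"billing" categories are assumptions chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count class and word frequencies from (tokens, label) pairs."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs above zero.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data; a real Q&A corpus would be far larger.
TOY_DOCS = [
    ("where is my order delivery".split(), "shipping"),
    ("delivery delayed tracking number".split(), "shipping"),
    ("refund my payment card".split(), "billing"),
    ("card charged twice refund".split(), "billing"),
]
MODEL = train_naive_bayes(TOY_DOCS)
```

Calling `classify(MODEL, "tracking my delivery".split())` picks the class whose word statistics best explain the query. Working in log space avoids numeric underflow when documents are long.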

Spammer Detection using Features based on User Relationships in Twitter (관계 기반 특징을 이용한 트위터 스패머 탐지)

  • Lee, Chansik;Kim, Juntae
    • Journal of KIISE / v.41 no.10 / pp.785-791 / 2014
  • Twitter is one of the most popular social network services (SNS) in the world. Spammer accounts, which are easy to create through e-mail authentication, deliver harmful content to Twitter users. This paper presents a spammer detection method that uses features based on the relationships between Twitter users. The relationship-based features include the friend relationship, which represents user preferences, and the type relationship, which represents similarity between users. We compared the proposed method with a conventional spammer detection method on datasets with spammer ratios from 3% to 30%; the experimental results show that the proposed method outperformed the conventional method with both Naive Bayesian classification and decision tree learning.

Features Reduction using Logistic Regression for Spam Filtering (로지스틱 회귀 분석을 이용한 스펨 필터링의 특징 축소)

  • Jung, Yong-Gyu;Lee, Bum-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.2 / pp.13-18 / 2010
  • Today, the large volume of spam occupying mail servers and network storage causes problems such as overload, and users must spend time and resources deleting it. Automatic spam filtering is therefore essential, and many spam filters have been developed to address the problem. Unlike traditional methods such as Naive Bayesian, this work uses PCA to reduce the high-dimensional spam data set to a small number of principal axes, which lessens the computational burden, and then applies logistic regression to classify and filter the spam. Positive results were obtained in terms of both speed and performance.

Chaff Echo Detecting and Removing Method using Naive Bayesian Network (나이브 베이지안 네트워크를 이용한 채프에코 탐지 및 제거 방법)

  • Lee, Hansoo;Yu, Jungwon;Park, Jichul;Kim, Sungshin
    • Journal of Institute of Control, Robotics and Systems / v.19 no.10 / pp.901-906 / 2013
  • Chaff is a material dispersed into the atmosphere to prevent aircraft from being detected by radar. It is commonly composed of small aluminum pieces, metallized glass fiber, or other lightweight strips of reflective material. On radar images, chaff usually appears as narrow bands of highly reflective echoes. Because chaff echoes have characteristics similar to precipitation echoes, they disrupt the weather forecasting process and lower forecasting accuracy. This paper proposes a method for recognizing and removing chaff echoes using a Bayesian network. After the coordinates in the UF (Universal Format) radar data file are converted from spherical to Cartesian, the characteristics of the echoes are extracted by spatial and temporal clustering. A classification process is then performed on the clustering results for analysis. Finally, an inference system based on the Bayesian network is applied. Experiments with actual radar data from a real chaff echo case confirm that the Bayesian network can distinguish chaff echoes from non-chaff echoes.
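The spherical-to-Cartesian conversion mentioned in this abstract can be sketched as follows. The abstract does not state the convention used, so this sketch assumes azimuth measured clockwise from north, elevation measured upward from the horizon, and simple straight-line geometry that ignores Earth curvature and beam refraction. The function name and parameters are hypothetical.

```python
import math


def spherical_to_cartesian(range_m, azimuth_deg, elevation_deg):
    """Convert a radar gate (range, azimuth, elevation) to (x, y, z) in metres.

    Assumed convention: x points east, y points north, z points up;
    azimuth is clockwise from north, elevation is above the horizon.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    ground_range = range_m * math.cos(el)  # projection onto the horizontal plane
    x = ground_range * math.sin(az)        # east component
    y = ground_range * math.cos(az)        # north component
    z = range_m * math.sin(el)             # height above the radar
    return x, y, z
```

For example, a gate at 1000 m range, 90 degrees azimuth, and 0 degrees elevation maps to a point roughly 1000 m due east of the radar at the radar's own height. Operational radar processing usually also corrects for Earth curvature and refraction, which this sketch leaves out.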

Personalized Activity Recognizer and Logger in Smart Phone Environment (스마트폰 환경에서 개인화된 행위 인식기 및 로거)

  • Cho, Geumhwan;Han, Manhyung;Lee, Ho Sung;Lee, Sungyoung
    • Proceedings of the Korean Society of Computer Information Conference / 2012.07a / pp.65-68 / 2012
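  • This paper proposes a personalized activity recognizer and logger for the smartphone environment, within the actively studied field of activity recognition. As smartphones have become widespread, research that uses them for activity recognition has been growing. However, in existing approaches the smartphone collects activity data with its sensors while a server performs classification and processing, which has the drawback that real-time recognition is difficult and, because training is performed by the developer, personalized training is not possible. To overcome these drawbacks, this work aims to implement a lightweight, personalized activity recognizer and logger that uses a Naive Bayes classifier to collect user activity in real time and to classify and process the activity data within the smartphone environment. The proposed method not only enables activity recognition through the recognizer but, through the logger, can also be used in research areas such as user lifelogs and life patterns.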

Collaborative Tag-based Filtering for Recommender Systems (효과적인 추천 시스템을 위한 협업적 태그 기반의 여과 기법)

  • Yeon, Cheol;Ji, Ae-Ttie;Kim, Heung-Nam;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.14 no.2 / pp.157-177 / 2008
  • Even in a single day, an enormous amount of content, including digital videos, posts, photographs, and wikis, is generated on the web. Recommending what a user prefers among this content is becoming more difficult because the meaning of content is hard to grasp automatically. CF (Collaborative Filtering) is a useful method for recommending suitable content in this situation because its filtering process relies only on historical information about whether a target user has preferred an item before. Collaborative tagging is the process that allows many users to annotate content with descriptive tags. Recommendation using tags can partially alleviate limitations of CF such as the sparsity and cold-start problems. In this research, a CF method with user-created tags is proposed. Collaborative tagging is employed to grasp and filter users' preferences for items. Empirical demonstrations using a real dataset from del.icio.us show that our algorithm obtains improved performance compared with existing works.

Exploring the Feature Selection Method for Effective Opinion Mining: Emphasis on Particle Swarm Optimization Algorithms

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.25 no.11 / pp.41-50 / 2020
  • Sentiment analysis begins with the search for words that determine the sentiment inherent in data. Managers can understand market sentiment by analyzing the relevant sentiment words that consumers tend to use. In this study, we explore the performance of feature selection methods embedded with Particle Swarm Optimization Multi Objectives Evolutionary Algorithms. The feature selection methods were benchmarked with machine learning classifiers such as Decision Tree, Naive Bayesian Network, Support Vector Machine, Random Forest, Bagging, Random Subspace, and Rotation Forest. Our empirical opinion mining results show that the number of features was significantly reduced without hurting performance. Specifically, the Support Vector Machine showed the highest accuracy, and Random Subspace produced the best AUC results.

Human Gait-Phase Classification to Control a Lower Extremity Exoskeleton Robot (하지근력증강로봇 제어를 위한 착용자의 보행단계구분)

  • Kim, Hee-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.39B no.7 / pp.479-490 / 2014
  • A lower extremity exoskeleton is a robot device that attaches to the lower limbs of the human body to augment or assist the wearer's walking ability. To improve the wearer's walking ability, the robot senses the wearer's walking locomotion and classifies it into a gait-phase state, after which it drives the appropriate robot motions for each state using its actuators. This paper presents a method by which the robot senses the wearer's locomotion, along with a novel classification algorithm that classifies the sensed data as a gait-phase state. The robot determines its control mode using this gait-phase information. If erroneous information is delivered, the robot will fail to improve the walking ability or will cause the wearer some discomfort. The algorithm therefore needs to classify the gait phase correctly at all times. However, our device for sensing human locomotion is sensitive enough to detect small movements, so simple logic such as threshold-based classification has difficulty delivering correct information continually. To overcome this and provide correct information in a timely manner, a probabilistic gait-phase classification algorithm is proposed. Experimental results demonstrate that the proposed algorithm offers excellent accuracy.

A proper folder recommendation technique using frequent itemsets for efficient e-mail classification (효과적인 이메일 분류를 위한 빈발 항목집합 기반 최적 이메일 폴더 추천 기법)

  • Moon, Jong-Pil;Lee, Won-Suk;Chang, Joong-Hyuk
    • Journal of the Korea Society of Computer and Information / v.16 no.2 / pp.33-46 / 2011
  • Because e-mail has become an important means of communication and information sharing, much effort has gone into classifying e-mails efficiently by their contents. E-mails vary widely in length and style, and the words used in them are usually irregular. In addition, the criteria for classifying e-mail are subjective. As a result, conventional text classification techniques are difficult to adapt efficiently to e-mail classification. Commercial e-mail programs classify e-mail with a simple text filtering technique in the e-mail client. Previous studies on automatic e-mail classification have used the probability-based Naive Bayesian technique to improve classification accuracy, and most of them deal with e-mail in English. This paper proposes a personalized recommendation technique for e-mail in Korean that uses frequent-pattern data mining. The proposed technique consists of two phases: pre-processing the e-mails in an e-mail folder and generating a profile for that folder. The generated profile is used to classify an e-mail into the most appropriate folder according to the user's subjective criteria. An e-mail classification system that applies the proposed technique is also implemented.

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter (트위터 기반 이벤트 탐지에서의 기계학습을 통한 지명 노이즈제거)

  • Woo, Seungmin;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.4 no.10 / pp.447-454 / 2015
  • This paper proposes geographical name denoising through machine learning for event detection based on Twitter. Recently, the growing number of smartphone users has led to a growing number of SNS users. In particular, Twitter's short messages (limited to 140 characters) and follow function give it the power to convey and diffuse information quickly. These characteristics, together with its mobile-optimized design, make Twitter a fast channel that can play a role in reporting disasters or events. Related research has used individual Twitter users as sensors to detect events that occur in the real world. That research used geographical names as keywords, exploiting the fact that an event occurs in a specific place. However, it did not remove the noise caused by homographs of geographical names, which became an important factor lowering the accuracy of event detection. In this paper, we apply denoising using two methods: removal and forecasting. First, after a filtering step that uses a purpose-built database of noise terms, we determine whether a geographical name is present using Naive Bayesian classification. Finally, we obtained the probability values of the machine learning model from the experimental data. Based on the forecasting technique proposed in this paper, the reliability of the need for the denoising technique was found to be 89.6%.