• Title/Abstract/Keywords: Network Mining

Search Result 1,036, Processing Time 0.025 seconds

Improvement of Network Intrusion Detection Rate by Using LBG Algorithm Based Data Mining (LBG 알고리즘 기반 데이터마이닝을 이용한 네트워크 침입 탐지율 향상)

  • Park, Seong-Chul;Kim, Jun-Tae
    • Journal of Intelligence and Information Systems / v.15 no.4 / pp.23-36 / 2009
  • Network intrusion detection has been continuously improved through data mining techniques. Data mining approaches to intrusion detection fall into two kinds: supervised learning, which uses class labels, and unsupervised learning, which does not. In this paper we study how to improve network intrusion detection accuracy with the LBG clustering algorithm, an unsupervised learning method. The K-means method, which starts with random initial centroids and clusters on the basis of Euclidean distance, is vulnerable to noisy data and outliers. The nonuniform binary split algorithm uses binary decomposition without assigning initial values and is relatively fast. We apply to intrusion detection the EM (Expectation Maximization) based LBG algorithm, which combines the strengths of the two algorithms. Experimental results on the KDD Cup dataset show that the LBG algorithm can improve detection accuracy.
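The splitting idea behind LBG clustering can be sketched as follows. This is a minimal illustration of the classic Linde-Buzo-Gray procedure (start from one centroid, split every centroid in two, refine with nearest-centroid updates), not the EM-based variant evaluated in the paper. The function names, the additive perturbation `eps`, and the iteration cap are assumptions made for the example, and the target codebook size is assumed to be a power of two.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def nearest(point, codebook):
    """Index of the codebook vector closest to point (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda j: math.dist(point, codebook[j]))

def lbg(points, size, eps=0.01, max_iter=50):
    """Grow a codebook by repeated binary splitting (size assumed a power of two)."""
    codebook = [mean(points)]  # start from the global centroid, no random seeds
    while len(codebook) < size:
        # Split every centroid into two slightly perturbed copies.
        codebook = [[x + sign * eps for x in c]
                    for c in codebook for sign in (1, -1)]
        # Refine the doubled codebook with nearest-centroid (Lloyd) updates.
        for _ in range(max_iter):
            clusters = [[] for _ in codebook]
            for p in points:
                clusters[nearest(p, codebook)].append(p)
            # Keep the old centroid if a cluster ends up empty.
            updated = [mean(cl) if cl else c for cl, c in zip(clusters, codebook)]
            if updated == codebook:
                break
            codebook = updated
    return codebook
```

Because the initial centroid is computed from the data rather than drawn at random, repeated runs on the same input give the same codebook, which is the property the abstract contrasts with K-means initialization.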

Development of Forecasting Model for the Initial Sale of Apartment Using Data Mining: The Case of Unsold Apartment Complex in Wirye New Town (데이터 마이닝을 이용한 아파트 초기계약 예측모형 개발: 위례 신도시 미분양 아파트 단지를 사례로)

  • Kim, Ji Young;Lee, Sang-Kyeong
    • Journal of Digital Convergence / v.16 no.12 / pp.217-229 / 2018
  • This paper applies data mining techniques (decision tree, neural network, and logistic regression) to an unsold apartment complex in Wirye New Town and develops a model that forecasts the result of the initial sale contract for each housing unit. The raw data are divided into training data and test data. On the training data, predictive performance ranks neural network first, then decision tree, then logistic regression. On the test data, by contrast, logistic regression is the best model. This means that logistic regression adapts better to new data than the neural network, which is optimized for the training data. The determinants of initial sale are floor level, direction, unit location, proximity to the electricity and generator room, the subscriber's residential region, and the type of subscription. This suggests that using two models together is more effective in exploring the determinants of initial sales. This paper contributes to the development of the convergence field by expanding the scope of data mining.

Development of a Knowledge Discovery System using Hierarchical Self-Organizing Map and Fuzzy Rule Generation

  • Koo, Taehoon;Rhee, Jongtae
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.431-434 / 2001
  • Knowledge discovery in databases (KDD) is the process of extracting valid, novel, potentially useful, and understandable knowledge from real data. There are many academic and industrial activities involving new technologies and application areas. Data mining in particular is the core step of the KDD process and comprises many algorithms that perform clustering, pattern recognition, and rule induction. The main goals of these algorithms are prediction and description. Prediction means the assessment of unknown variables. Description is concerned with providing understandable results in a format compatible with human users. We introduce an efficient data mining algorithm that considers both predictive and descriptive capability. Reasonable patterns are derived from real-world data by a revised neural network model, and a proposed fuzzy rule extraction technique is applied to obtain understandable knowledge. The proposed neural network model is a hierarchical self-organizing system. The rule base is compatible with decision makers' perception because the generated fuzzy rule set reflects human information processing. Results from a real-world application are analyzed to evaluate the system's performance.

Finding Meaningful Pattern of Key Words in IIE Transactions Using Text Mining (텍스트마이닝을 활용한 산업공학 학술지의 논문 주제어간 연관관계 연구)

  • Cho, Su-Gon;Kim, Seoung-Bum
    • Journal of Korean Institute of Industrial Engineers / v.38 no.1 / pp.67-73 / 2012
  • Identifying meaningful patterns and trends in large volumes of text data is an important task in various research areas. In the present study we crawled the keywords from the abstracts in IIE Transactions, one of the representative journals in the field of Industrial Engineering, from 1969 to 2011. We applied low-dimensional embedding, clustering analysis, association rules, and social network analysis to find meaningful associative patterns among keywords that frequently appeared in the papers.

Comparing Machine Learning Classifiers for Movie WOM Opinion Mining

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.8 / pp.3169-3181 / 2015
  • Online word-of-mouth has become a powerful influence on marketing and sales in business. Opinion mining and sentiment analysis are frequently adopted in market research and business analytics to analyze word-of-mouth content. However, several challenging areas remain: 1) sentiment analysis of Korean word-of-mouth content in the film market, 2) the viability of machine learning models that use only linguistic features, and 3) the effect of the size of the feature set. This study took a sample of 10,000 movie reviews with extremely negative or positive ratings posted on a movie portal site and conducted sentiment analysis with four machine learning algorithms: naïve Bayes, decision tree, neural network, and support vector machine. We found that the neural network and the support vector machine produced better accuracy than naïve Bayes and the decision tree at every feature set size. In addition, their performance improved as the feature set size increased.

A Study on Data Mining Techniques in WSN Environment (WSN 환경에서의 데이터 마이닝 기법 연구)

  • Kim, Dong-Hyun;Kim, Min-Woo;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.37-38 / 2018
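  • With the recent development of Internet technology, Wireless Sensor Network (WSN) environments composed of many sensor nodes are increasing, and the amount of data generated by the numerous wirelessly connected nodes is growing vast accordingly. Because the characteristics and patterns of this data are irregular, existing static classification techniques have limitations. This paper therefore describes data mining techniques that use machine learning to efficiently process the vast amounts of data generated in such WSN environments. Data mining extracts the information needed for decision making by using patterns in the data and relationships among the data, and a variety of machine learning algorithms exist for it.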

Analysis of CRM Using Neural Networks in Telecommunication service Market (통신시장에서 신경망을 통한 고객관리 분석)

  • 장일동
    • Journal of the Korea Society of Computer and Information / v.6 no.3 / pp.29-34 / 2001
  • Competition is increasing in the telecommunication service market. Effective customer retention strategies are based on a clear understanding of customer defection. Data mining offers service providers great opportunities to get closer to customers. In this paper, we propose an efficient data mining algorithm using a neural network. In particular, a practical application of neural networks to CRM analysis in the telecommunication service market (telco churn management) is described. This paper builds a model of customer defection management and analyzes customer defection with data mining.

Identifying Core Robot Technologies by Analyzing Patent Co-classification Information

  • Jeon, Jeonghwan;Suh, Yongyoon;Koh, Jinhwan;Kim, Chulhyun;Lee, Sanghoon
    • Asian Journal of Innovation and Policy / v.8 no.1 / pp.73-96 / 2019
  • This study suggests a new approach for identifying core robot technologies based on technological cross-impact. Specifically, the approach applies data mining techniques and multi-criteria decision-making methods to the co-classification information of registered robot patents. First, a cross-impact matrix of confidence values is constructed by applying association rule mining (ARM) to the co-classification information of the patents. The analytic network process (ANP) is applied to the co-classification frequency matrix to derive the weight of each robot technology. Then, the technique for order performance by similarity to ideal solution (TOPSIS) is applied to the derived cross-impact matrix and weights to identify core robot technologies from the overall cross-impact perspective. The proposed approach is expected to help robot technology managers formulate strategy and policy for technology planning in the robot area.
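The first step described in this abstract, a cross-impact matrix built from association rule confidence, can be sketched as below. This is only an illustration under stated assumptions: each patent is represented as a set of classification codes, the codes in the usage note are hypothetical, and confidence is taken in its standard ARM sense, confidence(A → B) = count(A and B) / count(A). The paper's actual data preparation, thresholds, and the later ANP and TOPSIS steps are not reproduced here.

```python
from itertools import permutations

def cross_impact(patents):
    """Return {(a, b): confidence(a -> b)} for every ordered pair of classes.

    patents: list of sets, each holding the classification codes of one patent.
    """
    classes = sorted({code for patent in patents for code in patent})
    # Number of patents in which each class appears.
    support = {c: sum(1 for p in patents if c in p) for c in classes}
    matrix = {}
    for a, b in permutations(classes, 2):
        both = sum(1 for p in patents if a in p and b in p)
        # Share of patents classified in a that are also classified in b.
        matrix[(a, b)] = both / support[a]
    return matrix
```

For example, with four hypothetical patents classified as `{"A", "B"}`, `{"A"}`, `{"B", "C"}`, and `{"A", "B", "C"}`, every patent in class C is also in class B, so the entry for `("C", "B")` is 1.0, while the reverse entry `("B", "C")` is 2/3. The asymmetry is what lets the matrix express the direction of impact between technologies.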

Study Factors for Student Performance Applying Data Mining Regression Model Approach

  • Khan, Shakir
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.188-192 / 2021
  • In this paper, we apply data mining techniques and machine learning algorithms for prediction using R software. We applied a regression model to test several factors in the dataset that we assumed affect student performance. The model was built on an existing dataset that contains many factors and the final grades. The factors tested are the attention to higher education, absences, study time, parents' education level, parents' jobs, and the number of past failures. The results show that only study time and absences affect students' performance. Predicting student academic performance helps instructors understand how well or how poorly the students in their classes will perform, so that they can take proactive measures to improve student learning. This paper also focuses on how the prediction algorithm can be used to identify the most important attributes in a student's data.

Development of ML and IoT Enabled Disease Diagnosis Model for a Smart Healthcare System

  • Mehra, Navita;Mittal, Pooja
    • International Journal of Computer Science & Network Security / v.22 no.7 / pp.1-12 / 2022
  • Recent progress in Internet of Things (IoT) and Machine Learning (ML) based technologies has converted the traditional healthcare system into a smart healthcare system. The incorporation of IoT and ML has changed the way patients are treated and offers many opportunities in the healthcare domain. In this view, this research article presents a new IoT and ML-based model for the diagnosis of different diseases. In the proposed model, vital signs are collected via IoT-based smart medical devices, and different data mining techniques are used to analyze them and detect possible risks to people's health status. Recommendations are made based on the results generated by the different data mining techniques. For high-risk patients, an emergency alert is generated for healthcare service providers and family members. The model is implemented in an Anaconda Jupyter notebook using different Python libraries. The results show that, among all the data mining techniques, SVM achieved the highest accuracy, 0.897, on the same dataset for classification of Parkinson's disease.