• Title/Abstract/Keywords: sequential selection

Search results: 191 items (processing time: 0.024 s)

Sequential Pattern Mining for Intrusion Detection System with Feature Selection on Big Data

  • Fidalcastro, A;Baburaj, E
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 10
    • /
    • pp.5023-5038
    • /
    • 2017
  • Big data is an emerging technology that deals with a wide range of data sets whose sizes are beyond the ability of commonly used data-processing software tools. In a huge network, a large amount of generated network information must be processed, consisting of both normal and abnormal activity logs in a large volume of multi-dimensional data. An Intrusion Detection System (IDS) is required to monitor the network and to detect malicious nodes and activities in it. The massive amount of data makes it difficult to detect threats and attacks. Sequential pattern mining, which has become an emerging popular trend because it considers the quantities, profits and time orders of items, may be used to identify the patterns of malicious activities. Here we propose a sequential pattern mining algorithm with fuzzy logic feature selection and fuzzy weighted support for huge volumes of network logs, to be implemented in Apache Hadoop YARN, which solves the problem of speed and time constraints. Fuzzy logic feature selection selects important features from the feature set. Fuzzy weighted support assigns weights to the inputs and avoids multiple scans. In our simulation we use the attack log from an NS-2 MANET environment and compare the proposed algorithm with the state-of-the-art sequential pattern mining algorithm SPADE and with a Support Vector Machine in a Hadoop environment.

Comparison of Feature Selection Processes for Image Retrieval Applications

  • Choi, Young-Mee;Choo, Moon-Won
    • Journal of Korea Multimedia Society (한국멀티미디어학회논문지)
    • /
    • Vol. 14, No. 12
    • /
    • pp.1544-1548
    • /
    • 2011
  • The process of choosing a subset of the original features, called feature selection, is considered a crucial preprocessing step in image processing applications. Large pools of techniques have already been developed in the machine learning and data mining fields. In this paper, two approaches, classification without feature selection and classification with feature selection, are investigated to compare their predictive effectiveness. Color co-occurrence features are used to define image features. The standard Sequential Forward Selection algorithm is used for feature selection to identify relevant features and redundancy among relevant features. Four color spaces, RGB, YCbCr, HSV, and Gaussian space, are considered for computing color co-occurrence features. Gray-level image features are also considered for performance comparison. The experimental results are presented.
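
Sequential Forward Selection, as named in the abstract above, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `sequential_forward_selection`, the `score` callback, and the toy criterion `toy_score` are all assumptions introduced here. In practice `score` would be something like cross-validated classification accuracy on the chosen features.

```python
from typing import Callable, FrozenSet, List


def sequential_forward_selection(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    k: int,
) -> List[int]:
    """Greedily grow a subset: at each step add the feature that maximizes `score`."""
    selected: List[int] = []
    remaining = set(range(n_features))
    while remaining and len(selected) < k:
        # Evaluate every candidate in index order; ties go to the lowest index.
        best = max(sorted(remaining),
                   key=lambda f: score(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_score(subset: FrozenSet[int]) -> int:
    """Hypothetical criterion: features 0 and 1 are redundant copies of each
    other, feature 2 is independently useful, feature 3 is nearly noise."""
    value = 0
    if 0 in subset or 1 in subset:
        value += 10
    if 2 in subset:
        value += 8
    if 3 in subset:
        value += 1
    return value


print(sequential_forward_selection(4, toy_score, 2))  # [0, 2]
```

With this toy criterion the redundant feature 1 is skipped because adding it to a set that already holds feature 0 does not raise the score, which is the "redundancy among relevant features" effect the abstract mentions.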

A Bayes Sequential Selection of the Least Probable Event

  • Hwang, Hyung-Tae;Kim, Woo-Chul
    • Journal of the Korean Statistical Society
    • /
    • Vol. 11, No. 1
    • /
    • pp.25-35
    • /
    • 1982
  • A problem of selecting the least probable cell in a multinomial distribution is studied in a Bayesian framework. We consider two loss components: the cost of sampling and the difference in cell probabilities between the selected and the least probable cells. A Bayes sequential selection rule is derived with respect to a Dirichlet prior, and it is compared with the best fixed sample size selection rule. The continuation sets with respect to the vague prior are tabulated for certain cases.


Wine Quality Assessment Using a Decision Tree with the Features Recommended by the Sequential Forward Selection

  • Lee, Seunghan;Kang, Kyungtae;Noh, Dong Kun
    • Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)
    • /
    • Vol. 22, No. 2
    • /
    • pp.81-87
    • /
    • 2017
  • Nowadays wine is increasingly enjoyed by a wider range of consumers, and wine certification and quality assessment are key elements in supporting the wine industry to develop new technologies for both wine making and selling processes. There have been many attempts to construct a more methodical approach to the assessment of wines, but most of them rely on objective decision rather than subjective judgement. In this paper, we propose a data mining approach to predict human wine taste preferences that is based on easily available analytical tests at the certification step. We used sequential forward selection and a decision tree for this purpose. Experiments with the wine quality dataset from the UC Irvine Machine Learning Repository demonstrate accuracies of 76.7% and 78.7% for red and white wines, respectively.

A Novel Action Selection Mechanism for Intelligent Service Robots

  • Suh, Il-Hong;Kwon, Woo-Young;Lee, Sang-Hoon
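    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • Institute of Control, Robotics and Systems, ICCAS 2003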
    • /
    • pp.2027-2032
    • /
    • 2003
  • For action selection as well as learning, simple associations between stimulus and response have been employed in most of the literature. However, to accomplish a task successfully, an animat must be able to learn and express behavioral sequences. In this paper, we propose a novel action selection mechanism (ASM) to deal with sequential behaviors. For this, we define behavioral motivation as a primitive node for action selection, and then hierarchically construct a network of behavioral motivations. The vertical path of the network represents behavioral sequences. The tree for our proposed ASM can be newly generated and/or updated whenever a new sequential behavior is learned. To show the validity of our proposed ASM, three 2-D grid world simulations are illustrated.


Geometry-Based Sensor Selection for Large Wireless Sensor Networks

  • Kim, Yoon Hak
    • Journal of information and communication convergence engineering
    • /
    • Vol. 12, No. 1
    • /
    • pp.8-13
    • /
    • 2014
  • We consider the sensor selection problem in large sensor networks where the goal is to find the best set of sensors that maximizes application objectives. Since sensor selection typically involves a large number of sensors, a low complexity should be maintained for practical applications. We propose a geometry-based sensor selection algorithm that utilizes only the information of sensor locations. In particular, by observing that sensors clustered together tend to have redundant information, we theorize that the redundancy is inversely proportional to the distance between sensors and seek to minimize this redundancy by searching for a set of sensors with the maximum average distance. To further reduce the computational complexity, we perform an iterative sequential search without losing optimality. We apply the proposed algorithm to an acoustic sensor network for source localization, and demonstrate using simulations that the proposed algorithm yields significant improvements in the localization performance with respect to the randomly generated sets of sensors.
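
The idea of picking sensors with a large average distance using only their locations can be illustrated with a simple greedy search. This is a sketch under stated assumptions, not the algorithm of the paper: the names `greedy_spread_selection` and `average_pairwise_distance` are hypothetical, sensors are assumed to lie in a 2-D plane, and the greedy rule makes no optimality claim.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def average_pairwise_distance(points: Sequence[Point]) -> float:
    """Mean Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def greedy_spread_selection(sensors: Sequence[Point], k: int) -> List[int]:
    """Return indices of k sensors chosen greedily so that they stay far apart."""
    if not 2 <= k <= len(sensors):
        raise ValueError("k must be between 2 and the number of sensors")
    # Seed with the pair of sensors that are farthest from each other.
    i, j = max(combinations(range(len(sensors)), 2),
               key=lambda p: math.dist(sensors[p[0]], sensors[p[1]]))
    chosen = [i, j]
    while len(chosen) < k:
        candidates = [c for c in range(len(sensors)) if c not in chosen]
        # For a fixed set size, the candidate with the largest total distance
        # to the chosen sensors gives the largest average pairwise distance.
        best = max(candidates,
                   key=lambda c: sum(math.dist(sensors[c], sensors[s])
                                     for s in chosen))
        chosen.append(best)
    return chosen


sensors = [(0.0, 0.0), (0.1, 0.0), (2.0, 0.0), (0.0, 1.0)]
picked = greedy_spread_selection(sensors, 3)
print(picked)  # [2, 3, 0]
```

In this toy layout sensor 1 sits almost on top of sensor 0, so it is the one left out, matching the intuition that clustered sensors carry redundant information.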

Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

  • Lee, Jinlee;Park, Dooho;Lee, Changhoon
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 10
    • /
    • pp.5132-5148
    • /
    • 2017
  • Cyber attacks are evolving commensurate with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze the attack information. The large amount of data examined makes the large number of computations and low detection rates problematic. Feature selection is expected to improve the classification performance and provide faster and more cost-effective results. Despite the various feature selection studies conducted for intrusion detection systems, it is difficult to automate feature selection because it is based on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. Focusing on feature selection, the first phase of the proposed system aims at constructing a feature subset using a sequential forward floating search (SFFS) to downsize the dimension of the variables. The second phase constructs a classification model with the selected feature subset using a random forest classifier (RFC) and evaluates the classification accuracy. Experiments were conducted with the NSL-KDD dataset using SFFS-RF, and the results indicated that feature selection techniques are a necessary preprocessing step to improve the overall system performance in systems that handle large datasets. They also verified that SFFS-RF could be used for data classification. In conclusion, SFFS-RF could be the key to improving the classification model performance in machine learning.
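
The first phase described above, sequential forward floating search, differs from plain forward selection in that it may drop a previously chosen feature when doing so beats the best subset of that smaller size seen so far. The following is a minimal sketch of that general technique, not the SFFS-RF system of the paper: the names `sffs` and `toy_score` are hypothetical, and the toy criterion stands in for the random-forest accuracy that the second phase would supply.

```python
from typing import Callable, Dict, FrozenSet, List


def sffs(n_features: int,
         score: Callable[[FrozenSet[int]], float],
         k: int) -> List[int]:
    """Sequential forward floating search for a subset of size k."""
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    current: FrozenSet[int] = frozenset()
    best_score: Dict[int, float] = {}        # best score seen per subset size
    best_set: Dict[int, FrozenSet[int]] = {}  # subset achieving that score

    def record(subset: FrozenSet[int]) -> None:
        if score(subset) > best_score.get(len(subset), float("-inf")):
            best_score[len(subset)] = score(subset)
            best_set[len(subset)] = subset

    while len(current) < k:
        # Forward step: add the single most helpful feature.
        candidates = sorted(set(range(n_features)) - current)
        current = max((current | {f} for f in candidates), key=score)
        record(current)
        # Floating step: drop features while that beats the best smaller set.
        while len(current) > 2:
            reduced = max((current - {f} for f in sorted(current)), key=score)
            if score(reduced) > best_score.get(len(reduced), float("-inf")):
                current = reduced
                record(current)
            else:
                break
    return sorted(best_set[k])


def toy_score(subset: FrozenSet[int]) -> int:
    """Hypothetical criterion with a synergy and a redundancy."""
    weights = {0: 5, 1: 3, 2: 3, 3: 2}
    value = sum(weights[f] for f in subset)
    if {1, 2} <= subset:
        value += 5   # features 1 and 2 are useful together
    if {0, 1} <= subset:
        value -= 4   # features 0 and 1 are redundant
    return value


print(sffs(4, toy_score, 3))  # [1, 2, 3]
```

On this toy criterion a plain forward search would end at {0, 1, 2} with score 12, because feature 0 is picked first and never reconsidered; the floating step removes it and reaches {1, 2, 3} with score 13.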

Genetic Algorithm Based Feature Selection Method Development for Pattern Recognition

  • 박창현;김호덕;양현창;심귀보
    • Journal of Korean Institute of Intelligent Systems (한국지능시스템학회논문지)
    • /
    • Vol. 16, No. 4
    • /
    • pp.466-471
    • /
    • 2006
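  • One of the important preprocessing steps in pattern recognition problems is selecting or extracting features. PCA is commonly used for feature extraction, and methods such as SFS and SBS are frequently used for feature selection. This paper develops Genetic Algorithm Feature Selection (GAFS), which applies the genetic algorithm, an evolutionary computation method widely used for nonlinear optimization problems, to feature selection, and observes its performance through comparison with other feature selection algorithms.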
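
Applying a genetic algorithm to feature selection usually means encoding each candidate subset as a bit mask and evolving a population of masks. The sketch below shows that general pattern only; it is not the GAFS method of the paper. The function name `ga_feature_selection`, its parameters (tournament size 2, one-point crossover, bit-flip mutation, elitism), and the toy fitness `toy_fitness` are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple

Mask = Tuple[int, ...]  # 1 = feature kept, 0 = feature dropped


def ga_feature_selection(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, List[float]]:
    """Evolve feature masks; return the best mask and the best fitness per generation."""
    if n_features < 2 or pop_size < 2:
        raise ValueError("need at least 2 features and 2 individuals")
    rng = random.Random(seed)
    population: List[Mask] = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        next_pop: List[Mask] = [elite]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b
                          for b in child)
            next_pop.append(child)
        population = next_pop
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(mask: Mask) -> int:
    """Hypothetical fitness: reward features 0, 2 and 5, penalize subset size."""
    informative = (0, 2, 5)
    return 2 * sum(mask[i] for i in informative) - sum(mask)


best_mask, best_per_generation = ga_feature_selection(8, toy_fitness)
print(best_mask, best_per_generation[-1])
```

Because the elite individual is copied unchanged into each new generation, the recorded best fitness never decreases. In a real setting the fitness would typically be a classifier's validation accuracy on the masked features, which is far more expensive to evaluate than this toy function.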

Feature Selection for Multi-Class Genre Classification using Gaussian Mixture Model

  • 문선국;최택성;박영철;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)
    • /
    • Vol. 32, No. 10C
    • /
    • pp.965-974
    • /
    • 2007
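  • This paper proposes a feature vector selection algorithm for multiple classes in a content-based music genre classification system. When measuring separability, the proposed algorithm computes a GMM separation score based on a Gaussian mixture model (GMM), which improves the accuracy of estimating the probability distribution and the separability. It also improves the sequential forward selection method by choosing the next feature vector with respect to the classes that the previously selected feature vectors fail to separate well, which raises multi-class classification performance. To verify its performance, various parameters representing characteristics of the audio signal, such as timbre, rhythm and pitch, were extracted from audio signals; feature vectors were then selected with the proposed algorithm and with an existing algorithm, and classification performance was evaluated with a GMM classifier and a k-NN classifier. The proposed algorithm improved classification performance by about 3% to 8% over the existing algorithm, and in experiments with low-dimensional feature vectors in particular it improved classification accuracy by 5% to 10%.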
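
The modification to sequential forward selection described above, choosing the next feature by looking at the classes that are currently separated worst, can be sketched as follows. This is a loose illustration and not the algorithm of the paper: the GMM separation score is replaced by a hypothetical precomputed table `pair_sep` of per-feature, per-class-pair separabilities, and the separability of a feature subset is assumed, for simplicity, to be the sum over its features. The function name and the numbers are invented for the example.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def worst_pair_forward_selection(
    pair_sep: Dict[int, Dict[Pair, float]], k: int
) -> List[int]:
    """Forward selection that targets the currently worst-separated class pair."""
    pairs = sorted(next(iter(pair_sep.values())))
    selected: List[int] = []
    remaining = sorted(pair_sep)
    total = {p: 0.0 for p in pairs}  # accumulated separability per class pair
    while remaining and len(selected) < k:
        if not selected:
            # First feature: best overall separation across all pairs.
            best = max(remaining, key=lambda f: sum(pair_sep[f].values()))
        else:
            # Later features: help the pair that is separated worst so far.
            worst = min(pairs, key=lambda p: total[p])
            best = max(remaining, key=lambda f: pair_sep[f][worst])
        selected.append(best)
        remaining.remove(best)
        for p in pairs:
            total[p] += pair_sep[best][p]
    return selected


# Hypothetical separability table for three genres and three features.
CJ, CR, JR = ("classical", "jazz"), ("classical", "rock"), ("jazz", "rock")
pair_sep = {
    0: {CJ: 0.2, CR: 0.9, JR: 0.8},  # e.g. a timbre feature
    1: {CJ: 0.1, CR: 0.8, JR: 0.7},  # e.g. a rhythm feature
    2: {CJ: 0.9, CR: 0.1, JR: 0.1},  # e.g. a pitch feature
}
print(worst_pair_forward_selection(pair_sep, 2))  # [0, 2]
```

Ranking features purely by their total separability would pick features 0 and 1 here and leave classical versus jazz poorly separated; targeting the worst pair selects feature 2 instead.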