• Title/Summary/Keyword: feature set selection (특징 집합 선택)

Iris Feature Extraction Using a Genetic Algorithm (유전자 알고리즘을 이용한 홍채 특징 추출)

  • 원현석;손병준;이일병
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.826-828 / 2004
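  • An iris recognition system consists of image acquisition, preprocessing, feature extraction, and pattern matching stages. Among these, feature extraction is essential not only for reducing feature dimensionality but also for increasing classification accuracy. This paper proposes an iris recognition system that applies the multiresolution analysis of the wavelet transform to iris data to extract a fixed region, and then applies a genetic algorithm (GA) to that region so that only the most discriminative features are extracted and used. Fitness-proportionate selection and global elitism are used as the selection operators of the genetic algorithm, and a Support Vector Machine (SVM) with a Gaussian kernel is used as the fitness function. When the recognition rate was measured with an SVM classifier using the optimal feature set produced by the system, it was roughly 1.5% better than with wavelets alone.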
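
The selection scheme described in the abstract above (fitness-proportionate selection combined with elitism) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: `toy_fitness`, `USEFUL`, the single-point crossover, and all parameter values are assumptions made for the example, and the toy fitness stands in for the SVM recognition rate the paper uses.

```python
import random
from typing import Callable, List, Tuple

Chromosome = Tuple[int, ...]  # bit i == 1 means feature i is selected


def roulette(pop: List[Chromosome], fits: List[float], rng: random.Random) -> Chromosome:
    """Fitness-proportionate selection; assumes non-negative fitness values."""
    total = sum(fits)
    if total <= 0:
        return rng.choice(pop)
    pick = rng.uniform(0, total)
    acc = 0.0
    for chrom, fit in zip(pop, fits):
        acc += fit
        if acc >= pick:
            return chrom
    return pop[-1]


def evolve(n_features: int, fitness: Callable[[Chromosome], float],
           pop_size: int = 20, generations: int = 30,
           p_mut: float = 0.05, seed: int = 0) -> Tuple[Chromosome, List[float]]:
    """Run a small GA; returns the best chromosome and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features)) for _ in range(pop_size)]
    history: List[float] = []
    for _ in range(generations):
        fits = [fitness(c) for c in pop]
        history.append(max(fits))
        elite = pop[fits.index(max(fits))]
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = roulette(pop, fits, rng), roulette(pop, fits, rng)
            cut = rng.randrange(1, n_features)  # single-point crossover (assumed)
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ 1 if rng.random() < p_mut else bit for bit in child)
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


USEFUL = {0, 3, 5}  # hypothetical "discriminative" features for the toy fitness


def toy_fitness(chrom: Chromosome) -> float:
    """Stand-in for classifier accuracy: reward useful features, penalize extras."""
    hits = sum(chrom[i] for i in USEFUL)
    extras = sum(chrom) - hits
    return max(0.01, hits - 0.2 * extras)


best, history = evolve(n_features=8, fitness=toy_fitness)
```

Because the elite individual is copied into every new generation, the best fitness recorded in `history` never decreases; in a real system the toy fitness would be replaced by a cross-validated classifier score.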

An Active Learning-based Method for Composing Training Document Set in Bayesian Text Classification Systems (베이지언 문서분류시스템을 위한 능동적 학습 기반의 학습문서집합 구성방법)

  • 김제욱;김한준;이상구
    • Journal of KIISE:Software and Applications / v.29 no.12 / pp.966-978 / 2002
  • There are two important problems in improving text classification systems based on machine learning. The first, called the "selection problem", is how to select a minimum number of informative documents from a given document collection. The second, called the "composition problem", is how to reorganize the selected training documents so that they fit the adopted learning method. The former is addressed by "active learning" algorithms, and the latter by "boosting" algorithms. This paper proposes a new learning method, called AdaBUS, which proactively solves both problems in the context of Naive Bayes classification systems. The proposed method constructs a more accurate classification hypothesis by increasing the variance in the "weak" hypotheses that determine the final classification hypothesis. Consequently, the proposed algorithm yields a perturbation effect that makes the boosting algorithm work properly. Through an empirical experiment using the Reuters-21578 document collection, we show that the AdaBUS algorithm improves the Naive Bayes-based classification system more significantly than other conventional learning methods.

A Performance Evaluation of Indexing Methods for Content-based Retrieval of High Dimensional Multimedia Data (고차원 멀티미디어 데이터에 대한 내용기반 검색을 위한 인덱싱 방법들의 성능 평가)

  • Moon, Joo-Sun;Choi, Jeong-Hoon;Nang, Jong-Ho
    • Proceedings of the Korean Information Science Society Conference / 2008.06a / pp.345-346 / 2008
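  • Many indexing methods have been studied for effective content-based retrieval in multimedia databases, yet no experiment has analyzed the performance of the different retrieval methods on the same data set under the same evaluation criteria. In this paper, we implement representative existing indexing methods and analyze index-based retrieval over a common data set against several performance metrics, thereby objectively evaluating the characteristics and performance of each indexing method. The experimental results can be used in the future to select an indexing method that is effective for a particular data set.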

Enhancing Document Clustering Method using Synonym of Cluster Topic and Similarity (군집 주제의 유의어와 유사도를 이용한 문서군집 향상 방법)

  • Park, Sun;Kim, Chul-Won
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1538-1541 / 2011
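  • This paper proposes a method that improves document clustering performance by using synonyms of cluster topics and similarity. The proposed method selects the terms of each cluster topic using the semantic features of non-negative matrix factorization (NMF), so it represents the internal structure of the document clusters well, and it expands the cluster topic terms with WordNet synonyms, which addresses the limitations of representing documents as a bag of words. In addition, applying cosine similarity between the expanded cluster topic terms and the document set groups documents that fit each cluster topic well, which improves performance. Experimental results show that document clustering with the proposed method performs better than other document clustering methods.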
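
The idea of matching documents to synonym-expanded topic terms by cosine similarity can be illustrated with a minimal sketch. It is not the paper's method: the NMF step is omitted, the topic terms are given by hand, and `SYNONYMS` is a hypothetical lookup table standing in for WordNet.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(terms: List[str], synonyms: Dict[str, List[str]]) -> Counter:
    """Add the synonyms of each topic term to the topic's bag of words."""
    bag = Counter(terms)
    for term in terms:
        bag.update(synonyms.get(term, []))
    return bag


# Hypothetical synonym table; the paper uses WordNet for this step.
SYNONYMS = {"car": ["automobile"], "doctor": ["physician"]}

# Hand-picked topic terms; the paper derives them from NMF semantic features.
topics = {
    "vehicles": expand(["car", "engine"], SYNONYMS),
    "medicine": expand(["doctor", "hospital"], SYNONYMS),
}


def assign(doc: str) -> str:
    """Assign a document to the topic with the highest cosine similarity."""
    bag = Counter(doc.lower().split())
    return max(topics, key=lambda name: cosine(topics[name], bag))


label = assign("the automobile engine failed")
```

Without the expansion step, a document that mentions only "automobile" would share no term with the "vehicles" topic, which is the bag-of-words limitation the abstract refers to.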

A Method to Find Feature Set for Detecting Various Denial Service Attacks in Power Grid (전력망에서의 다양한 서비스 거부 공격 탐지 위한 특징 선택 방법)

  • Lee, DongHwi;Kim, Young-Dae;Park, Woo-Bin;Kim, Joon-Seok;Kang, Seung-Ho
    • KEPCO Journal on Electric Power and Energy / v.2 no.2 / pp.311-316 / 2016
  • The accuracy and efficiency of a network intrusion detection system based on a machine learning method such as an artificial neural network depend heavily on the selected features. Nevertheless, choosing the optimal combination of features, one that guarantees accuracy and efficiency, from the many features generally used to detect network intrusion requires extensive computing resources. In this paper, we deal with an optimal feature selection problem for distinguishing 6 denial-of-service attacks and normal usage in the NSL-KDD data. We propose an optimal feature selection algorithm based on multi-start local search, a representative meta-heuristic for solving optimization problems. To evaluate the performance of the proposed algorithm, we compare it with the case in which all 41 features of the NSL-KDD data are used. In addition, we compare 3 well-known machine learning methods (multi-layer perceptron, Bayes classifier, and support vector machine) to find the one that performs best in combination with the proposed feature selection method.
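
A multi-start local search over feature subsets, the family of algorithm named in the abstract above, might look like the sketch below. The details are assumptions made for illustration only: the one-feature-flip neighborhood, the random starting subsets, and especially `toy_score`, which replaces the classifier-based evaluation a real system would use. The abstract does not specify these choices.

```python
import random
from typing import Callable, FrozenSet, Tuple

Subset = FrozenSet[int]  # indices of the selected features


def local_search(start: Subset, n: int,
                 score: Callable[[Subset], float]) -> Tuple[Subset, float]:
    """Hill climbing: flip one feature at a time while the score improves."""
    current, best = start, score(start)
    improved = True
    while improved:
        improved = False
        for f in range(n):
            neighbor = current ^ {f}  # add f if absent, remove it if present
            s = score(neighbor)
            if s > best:
                current, best, improved = neighbor, s, True
    return current, best


def multi_start(n: int, score: Callable[[Subset], float],
                starts: int = 10, seed: int = 0) -> Tuple[Subset, float]:
    """Repeat local search from random subsets and keep the best local optimum."""
    rng = random.Random(seed)
    best_subset: Subset = frozenset()
    best_score = float("-inf")
    for _ in range(starts):
        start = frozenset(f for f in range(n) if rng.random() < 0.5)
        subset, s = local_search(start, n, score)
        if s > best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score


RELEVANT = frozenset({1, 4, 7})  # hypothetical informative features


def toy_score(subset: Subset) -> float:
    """Stand-in objective: reward relevant features, penalize the rest."""
    return len(subset & RELEVANT) - 0.1 * len(subset - RELEVANT)


best, value = multi_start(n=10, score=toy_score)
```

With this separable toy objective every local optimum is the global one, so the restarts are redundant here; they matter when the score comes from a trained classifier and the search landscape has many local optima.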

Protein-Protein Interaction Reliability Enhancement System based on Feature Selection and Classification Technique (특징 추출과 분석 기법에 기반한 단백질 상호작용 데이터 신뢰도 향상 시스템)

  • Lee, Min-Su;Park, Seung-Soo;Lee, Sang-Ho;Yong, Hwan-Seung;Kang, Sung-Hee
    • The KIPS Transactions:PartB / v.13B no.7 s.110 / pp.679-688 / 2006
  • Protein-protein interaction data obtained from high-throughput experiments include a high rate of false positives. In this paper, we introduce a new protein-protein interaction reliability verification system. The proposed system integrates various biological features related to protein-protein interactions, and then selects the most relevant and informative features among them using a feature selection method. To assess the reliability of each protein-protein interaction, the system uses a classification technique to construct a classifier that distinguishes true interacting protein pairs from noisy protein-protein interaction data based on the selected biological evidence. Since the performance of feature selection methods and classification techniques depends heavily on the characteristics of the data, we performed a rigorous comparative analysis of various feature selection methods and classification techniques to obtain optimal performance from our system. Experimental results show that the combination of a feature selection method and a classification algorithm provides a very powerful tool for distinguishing true interacting protein pairs in noisy protein-protein interaction datasets. We also investigated how the feature selection methods and classification techniques affect the performance of the proposed protein interaction verification system.

Extraction and classification of characteristic information of malicious code for an intelligent detection model (지능적 탐지 모델을 위한 악의적인 코드의 특징 정보 추출 및 분류)

  • Hwang, Yoon-Cheol
    • Journal of Industrial Convergence / v.20 no.5 / pp.61-68 / 2022
  • In recent years, malicious code has been produced using advancing information and communication technology, and existing detection systems are insufficient to detect it. Detecting and responding to such intelligent malicious code accurately and efficiently requires an intelligent detection model, and to maximize detection performance it is important to train the model on the main characteristic information set of the malicious code. In this paper, we propose a technique for designing an intelligent detection model and for generating the data required for model training as a set of key feature information through transformation, dimensionality reduction, and feature selection steps. Based on this, the main characteristic information was classified by malicious code. In addition, from the classified characteristic information we derived common characteristic information that can be used to analyze and detect modified or newly emerging malicious code. Since the proposed detection model detects malicious code by learning from a limited amount of characteristic information, detection and response are fast, so damage can be greatly reduced. Although the performance evaluation results differ slightly depending on the learning algorithm, the evaluation showed that most malicious code can be detected.

Survey for Objective Performance Evaluation of Skyline Query Methods (스카이라인 질의 기법의 객관적 성능 평가를 위한 연구 조사)

  • Choi, Jong-Hyeok;Nasridinov, Aziz
    • Proceedings of The KACE / 2018.01a / pp.49-52 / 2018
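  • A skyline query uses comparison operations between data points to find the minimal set of non-dominated points as the skyline; the non-dominated points selected for the skyline represent the points they dominate. This property has led to the use of skylines in many fields such as finance, networking, and web services. However, the overall performance of skyline queries degrades sharply as the data volume or the number of dimensions grows, so various techniques have been studied and proposed to address this. To use skyline queries in practice, one must choose, through objective performance evaluation, the technique that performs best in the given situation. Existing studies, however, run only fragmentary experiments on the problems each technique targets, so a new skyline performance evaluation method is needed to evaluate them objectively. As a step toward solving this problem, this paper surveys and analyzes existing studies in order to establish criteria for selecting the quality factors used in objective performance evaluation of skyline query techniques.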
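
The dominance relation behind a skyline query can be made concrete with a naive sketch. It assumes that smaller values are better in every dimension, and the `hotels` data (price, distance) is invented for the example. The quadratic pairwise comparison also shows why performance degrades as the data grows.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse in every dimension and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (price, distance) pairs; lower is better for both.
hotels = [(50.0, 8.0), (60.0, 3.0), (80.0, 1.0), (70.0, 5.0), (90.0, 9.0)]
result = skyline(hotels)
# result == [(50.0, 8.0), (60.0, 3.0), (80.0, 1.0)]
```

Here (70.0, 5.0) is dominated by (60.0, 3.0) and (90.0, 9.0) by (50.0, 8.0), so the three remaining points stand in for them in the result.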

An Effective Face Authentication Method for Resource - Constrained Devices (제한된 자원을 갖는 장치에서 효과적인 얼굴 인증 방법)

  • Lee Kyunghee;Byun Hyeran
    • Journal of KIISE:Software and Applications / v.31 no.9 / pp.1233-1245 / 2004
  • Although biometric authentication is a good tool in terms of security and convenience, typical biometric authentication algorithms may not be executable on resource-constrained devices such as smart cards. Thus, to run biometric processing on resource-constrained devices, it is desirable to develop a lightweight authentication algorithm that requires only a small amount of memory and computation. Among biological features, the face is one of the most acceptable biometrics, because humans use it in their visual interactions and acquiring face images is non-intrusive. We present a new face authentication algorithm in this paper. Our contribution is two-fold. The first is a face authentication algorithm with a low memory requirement, which uses support vector machines (SVM) with a feature set extracted by genetic algorithms (GA). The second is a method to further reduce, if needed, the amount of memory required for authentication, at the expense of verification rate, by changing a controllable system parameter for the feature set size. Given a pre-defined amount of memory, this capability is quite effective for mounting our algorithm on memory-constrained devices. Experimental results on various databases show that our face authentication algorithm, with an SVM whose input vectors consist of discriminating features extracted by GA, performs much better in terms of accuracy and memory requirement than the algorithm without the GA feature selection process. The experiments also show that the number of features to be selected is controllable by a system parameter.

Performance Improvement of Web Document Classification through Incorporation of Feature Selection and Weighting (특징선택과 특징가중의 융합을 통한 웹문서분류 성능의 개선)

  • Lee, Ah-Ram;Kim, Han-Joon;Man, Xuan
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.4 / pp.141-148 / 2013
  • Automated classification systems that use machine learning develop classification models through a learning process, and then classify unknown data into a predefined set of categories according to the model. The performance of machine learning-based classification systems relies greatly on the quality of the features that make up the classification models. For textual data, we can use word terms and structure information to generate the feature set. In particular, to extract features from Web documents, we need to analyze tag and hyperlink information. Recent studies on Web document classification focus on feature engineering rather than on the machine learning algorithms themselves. This paper therefore proposes a novel method that incorporates feature selection and weighting and can improve classification models effectively. In extensive experiments using the Web-KB document collections, the proposed method outperforms conventional ones.