• Title/Summary/Keyword: information classification


Development Research of An Efficient Malware Classification System Using Hybrid Features And Machine Learning (하이브리드 특징 및 기계학습을 활용한 효율적인 악성코드 분류 시스템 개발 연구)

  • Yu, Jung-Been;Oh, Sang-Jin;Park, Leo-Hyun;Kwon, Tae-Kyoung
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1161-1167 / 2018
  • To cope with the dramatic increase in malware variants, malware classification research is diversifying. Recent work recognizes the individual limits of existing static and dynamic analysis techniques and combines them into "hybrid analysis", which mixes different methods into one. Furthermore, machine learning is being applied to identify hard-to-classify malware variants more accurately. However, the trade-off between accuracy and scalability that arises when all of these methods are used together has not yet been resolved, and it remains an important issue in malware research. To address these problems in existing malware classification research, this study focuses on developing a new malware classification system.

Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

  • Lee, Jinlee;Park, Dooho;Lee, Changhoon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.5132-5148 / 2017
  • Cyber attacks are evolving commensurate with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze the attack information. The large amount of data examined leads to heavy computation and low detection rates. Feature selection is expected to improve the classification performance and provide faster and more cost-effective results. Despite the various feature selection studies conducted for intrusion detection systems, it is difficult to automate feature selection because it is based on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. Focusing on feature selection, the first phase of the proposed system constructs a feature subset using a sequential forward floating search (SFFS) to reduce the dimensionality of the variables. The second phase constructs a classification model with the selected feature subset using a random forest classifier (RFC) and evaluates the classification accuracy. Experiments were conducted with the NSL-KDD dataset using SFFS-RF, and the results indicated that feature selection is a necessary preprocessing step for improving overall performance in systems that handle large datasets. They also verified that SFFS-RF could be used for data classification. In conclusion, SFFS-RF could be the key to improving the classification model performance in machine learning.
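
The first phase described in this abstract, a sequential forward floating search, can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the function names `sffs` and `toy_score`, the feature labels, and the integer weights are all hypothetical. In the paper's setting the scoring function would be the accuracy of a random forest classifier trained on the candidate subset; here a small hand-made scorer stands in so the example runs with the standard library alone.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

Subset = FrozenSet[str]


def sffs(features: Sequence[str], score: Callable[[Subset], float], k: int) -> Subset:
    """Pick k features by sequential forward floating search (SFFS)."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    selected: Subset = frozenset()
    best: Dict[int, Tuple[float, Subset]] = {}  # size -> best (score, subset) seen
    while len(selected) < k:
        # Forward step: add the single feature that gives the highest score.
        candidates = [selected | {f} for f in features if f not in selected]
        selected = max(candidates, key=score)
        s = score(selected)
        size = len(selected)
        if size not in best or s > best[size][0]:
            best[size] = (s, selected)
        else:
            selected = best[size][1]  # fall back to the better subset already known
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            reduced = max((selected - {f} for f in sorted(selected)), key=score)
            rs = score(reduced)
            if rs > best[len(reduced)][0]:
                best[len(reduced)] = (rs, reduced)
                selected = reduced
            else:
                break
    return best[k][1]


# Hypothetical stand-in for "random forest accuracy on this feature subset":
# each feature has a utility, and redundant pairs are penalized.
WEIGHTS = {"a": 50, "b": 45, "c": 45, "d": 10}
REDUNDANT_PAIRS = [frozenset({"a", "b"}), frozenset({"a", "c"})]
PENALTY = 30


def toy_score(subset: Subset) -> int:
    gain = sum(WEIGHTS[f] for f in subset)
    loss = sum(PENALTY for pair in REDUNDANT_PAIRS if pair <= subset)
    return gain - loss


chosen = sffs(["a", "b", "c", "d"], toy_score, k=3)
print(sorted(chosen))  # ['b', 'c', 'd']
```

With this toy scorer, plain forward selection would keep feature `a` (the best single feature) and end at `{a, b, c}`; the floating step lets the search drop `a` once `{b, c}` turns out to be the better pair.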

Proposing and Validating a Classification Method based on Knowledge Structure to Identify High-Quality Presentation Slides (고품질 슬라이드 선별을 위한 지식구조 기반 분류 기법)

  • Jung, Wonchul;Kim, Seongchan;Yi, Mun Y.
    • KIISE Transactions on Computing Practices / v.20 no.12 / pp.676-681 / 2014
  • To discern and classify high-quality slides, our research proposes a classification method that utilizes a knowledge structure containing information on the presentation slides. After analyzing whether our knowledge structure captures the content's quality information, we developed a classification method based on the knowledge structure produced from the analysis results. With the proposed method, we compared the results classified by presentation slide quality. Through this comparison, we verified that the slides in the high-quality group could be classified and that high-quality slides could be retrieved. The results show that, by utilizing the cognitive model of a knowledge structure, our method can increase the effectiveness of classification when search or recommendation is conducted mainly with high-quality slides.

Algorithm for Fault Detection and Classification Using Wavelet Singular Value Decomposition for Wide-Area Protection

  • Lee, Jae-Won;Kim, Won-Ki;Oh, Yun-Sik;Seo, Hun-Chul;Jang, Won-Hyeok;Kim, Yoon Sang;Park, Chul-Won;Kim, Chul-Hwan
    • Journal of Electrical Engineering and Technology / v.10 no.3 / pp.729-739 / 2015
  • A fault detection and classification algorithm for wide-area protection in Korean transmission systems is proposed. The modeling of 345-kV and 765-kV Korean power system transmission networks using the Electro Magnetic Transient Program - Restructured Version (EMTP-RV) is presented, and the algorithm for fault detection and classification in transmission lines is developed. The proposed algorithm uses the Wavelet Transform (WT) and Singular Value Decomposition (SVD). The Singular value of Approximation coefficient (SA) and part Sum of Detail coefficient (SD) are introduced. The characteristics of the SA and SD under fault conditions are analyzed and used in the algorithm for fault detection and classification. The proposed algorithm is validated by various simulation results.

The database construction of a classification system using an optimal cluster analysis model (최적 클러스터 분석 모델을 이용한 분류시스템의 데이터베이스 구축)

  • 이현숙
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.4 / pp.1045-1050 / 1998
  • Classification techniques are often an important component of intelligent systems and are used for both data preprocessing and decision making. In the design of a classification system, the labeled samples must be given to provide a priori information for the classification. Moreover, the number of classes to be categorized must be known a priori information, called OFCAM. In OFCAM, an unsupervised by OFCAM, the database of a classification system, called PCSDB, is constructed. Then, PCSDB can be effectively used in the decision process of the system.


Information Extraction and Sentence Classification applied to Clinical Trial MEDLINE Abstracts

  • Hara, Kazuo;Matsumoto, Yuji
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.85-90 / 2005
  • In this paper, we first report experimental results on applying information extraction (IE) methodology to the task of summarizing clinical trial design information, focusing on ‘Compared Treatment’, ‘Endpoint’ and ‘Patient Population’, from clinical trial MEDLINE abstracts. From these results, we have come to see this problem as one that can be decomposed into a sentence classification subtask and an IE subtask. By classifying sentences from clinical trial abstracts and only performing IE on sentences that are most likely to contain relevant information, we hypothesize that the accuracy of information extracted from the abstracts can be increased. As preparation for testing this hypothesis in the next stage, we conducted an experiment applying state-of-the-art sentence classification techniques to the clinical trial abstracts and evaluated their potential in the original task of summarizing clinical trial design information.


A Study on Analysis of the Template Component for the Development of BIM Template (BIM 템플릿 개발을 위한 템플릿 구성요소 분석에 관한 연구)

  • Lee, Sang Heon;Kim, Mi Kyoung;Choi, Hyun Ah;Jun, Han Jong
    • KIEAE Journal / v.11 no.2 / pp.123-130 / 2011
  • BIM-based design methodology requires more information than traditional design methodology in order to ensure efficiency throughout the project. BIM-based design requires not only all building data in the form of 3D shapes, but also all other relevant data regarding building components. Information is typically grouped in a standard classification system, such as by standardized material names. A domestic BIM-based standard classification system has yet to be developed and deployed in the industry. Each designer specifies their own building information classification system, which causes inconsistency in the industry. BIM-based designs therefore cause confusion, as designers follow no common guidelines for standardized material classification. The lack of such information in the BIM template will continue to cause confusion about a project's building information data. This study is preliminary research toward developing a BIM template. First, overseas BIM templates were analyzed with regard to BIM standards and documentation. The elements and characteristics needed for the development of a BIM template were then examined, and a hierarchy of the elements required for a BIM template was suggested. The results of this research will be used to develop a "BIM template prototype" to support the generation of building information data regarding neighborhood facilities.

Two-Dimensional Qualitative Asset Analysis Method based on Business Process-Oriented Asset Evaluation

  • Eom, Jung-Ho;Park, Seon-Ho;Kim, Tae-Kyung;Chung, Tai-Myoung
    • Journal of Information Processing Systems / v.1 no.1 s.1 / pp.79-85 / 2005
  • In this paper, we present a substantial asset analysis methodology that applies two-dimensional asset classification and a qualitative evaluation method according to the business process. Most existing risk analysis methodologies and tools classify assets by type and evaluate them physically by a quantitative method. We focused our research on qualitative evaluation with 2-dimensional asset classification. It converts a quantitative asset value based on purchase cost, recovery and exchange cost, etc. into a qualitative evaluation that considers specific factors related to the business process. In the first phase, we classified the IT assets into tangible and intangible assets, including human and information data assets, and evaluated their value. Then, we converted the quantitative asset value to the qualitative asset value using a conversion standard table. In the second phase, we reclassified the assets using 2-dimensional classification factors reflecting the business process, and applied weights to the first evaluation results. This method considers the organization's characteristics, IT asset structure scheme and business process. Therefore, we can evaluate the concrete and substantial value of assets as it corresponds to the organization's business process, even when they are of the same asset type.

Adversarial Detection with Gaussian Process Regression-based Detector

  • Lee, Sangheon;Kim, Noo-ri;Cho, Youngwha;Choi, Jae-Young;Kim, Suntae;Kim, Jeong-Ah;Lee, Jee-Hyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4285-4299 / 2019
  • An adversarial attack is a technique that causes classification models to malfunction by adding noise that humans cannot distinguish, which poses a threat to deep learning models. In this paper, we propose an efficient method to detect adversarial images using Gaussian process regression. Existing deep learning-based adversarial detection methods require numerous adversarial images for their training. The proposed method overcomes this problem by performing classification based on the statistical features of adversarial images and clean images, which are extracted by Gaussian process regression from a small number of images. This technique can determine whether the input image is an adversarial image by applying Gaussian process regression to the intermediate output value of the classification model. Experimental results show that the proposed method achieves higher detection performance than the other deep learning-based adversarial detection methods for powerful attacks. In particular, the Gaussian process regression-based detector shows better detection performance than the baseline models for most attacks when fewer adversarial examples are available.

Accuracy Evaluation of Supervised Classification by Using Morphological Attribute Profiles and Additional Band of Hyperspectral Imagery (초분광 영상의 Morphological Attribute Profiles와 추가 밴드를 이용한 감독분류의 정확도 평가)

  • Park, Hong Lyun;Choi, Jae Wan
    • Journal of Korean Society for Geospatial Information Science / v.25 no.1 / pp.9-17 / 2017
  • Hyperspectral imagery is used in land cover classification together with principal component analysis and the minimum noise fraction transform to reduce data dimensionality and noise. Recently, studies on supervised classification using various features that carry spectral information and spatial characteristics have been carried out. In this study, principal component bands and the normalized difference vegetation index (NDVI) were utilized in supervised classification for land cover classification. To utilize additional information not included in the principal component bands of the hyperspectral imagery, we tried to increase the classification accuracy by using the NDVI. In addition, the extended attribute profiles (EAP) generated using the morphological filter were used as the input data. The random forest algorithm, one of the representative supervised classification methods, was used. The classification accuracy according to the application of various features based on EAP was compared. Two areas were selected for the experiments, and the quantitative evaluation was performed using reference data. The proposed algorithm showed the highest classification accuracies, 85.72% and 91.14%, compared with existing algorithms. Further research will need to develop a supervised classification algorithm and additional input datasets to improve the accuracy of land cover classification using hyperspectral imagery.
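
The additional band mentioned in this abstract follows the standard NDVI definition, (NIR - Red) / (NIR + Red). A minimal sketch of computing it per pixel and appending it to existing per-pixel features might look like the following. The function names, the list-of-lists feature layout, and the choice to return 0.0 when both reflectances are zero are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumption: treat an all-zero pixel as neutral
    return (nir - red) / denom


def append_ndvi_band(
    pixel_features: Sequence[Sequence[float]],
    nir_band: Sequence[float],
    red_band: Sequence[float],
) -> List[List[float]]:
    """Return per-pixel feature vectors extended with an NDVI value.

    pixel_features holds one feature vector per pixel (for example,
    principal component band values); nir_band and red_band hold the
    matching near-infrared and red reflectances.
    """
    if not (len(pixel_features) == len(nir_band) == len(red_band)):
        raise ValueError("all inputs must describe the same number of pixels")
    return [
        list(features) + [ndvi(nir, red)]
        for features, nir, red in zip(pixel_features, nir_band, red_band)
    ]


# Two pixels with two hypothetical principal component values each.
stacked = append_ndvi_band([[1.0, 2.0], [3.0, 4.0]], [0.75, 0.0], [0.25, 0.0])
print(stacked)  # [[1.0, 2.0, 0.5], [3.0, 4.0, 0.0]]
```

The extended vectors could then be passed to any supervised classifier, such as the random forest named in the abstract.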