• Title/Summary/Keyword: binary classification (이진 분류)

Search Result 607, Processing Time 0.027 seconds

Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA

  • Jeon, Dong-Ha;Lee, Soo-Jin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.11
    • /
    • pp.123-130
    • /
    • 2022
  • Recently, studies on the detection and classification of Android malware based on API call sequences have been actively carried out. However, API call sequence based malware classification has serious limitations: the vast amount of data and the high dimensionality of the features make both malware analysis and learning-model construction excessively costly in time and resources. In this study, we evaluated several classification models, including LightGBM, Random Forest, and k-Nearest Neighbors, after significantly reducing the feature dimension with PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains a vast amount of API call information. The experimental results show that PCA significantly reduces the feature dimension while preserving the characteristics of the original data, and that it yields efficient malware classification. Both binary and multi-class classification achieve higher accuracy than previous studies, even though the features were reduced to less than 1% of their original size.
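
As a rough illustration of the dimensionality-reduction step this abstract describes, the sketch below computes the first principal component of a tiny toy matrix using only the Python standard library. The data `X`, the helper names, and the use of power iteration are assumptions made for this example; the abstract does not say how the authors computed PCA, and a real pipeline would more likely call a library implementation and keep several components before training a classifier.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_component(cov, iters=100):
    """Leading eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return v, eigenvalue

# Hypothetical toy data: two perfectly correlated features (second = 2 * first).
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
Xc = center(X)
cov = covariance(Xc)
pc1, var1 = top_component(cov)

# Project each sample onto the first component: 2 features become 1.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in Xc]

# Share of the total variance retained by the single kept component.
explained = var1 / sum(cov[i][i] for i in range(len(cov)))
```

Because the two toy features are perfectly correlated, one component retains essentially all of the variance here. On real API-call features the retained share would have to be checked before deciding how many components to keep.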

Re-examination of the vascular plants on Hongdo Island, Korea (홍도 관속식물상 재검토)

  • JANG, Young-Jong;PARK, Jong-Soo;LEE, Jin-Sil;LEE, Ji-Yeon;CHOI, Byoung-Hee
    • Korean Journal of Plant Taxonomy
    • /
    • v.51 no.3
    • /
    • pp.205-249
    • /
    • 2021
  • This study surveyed the flora of Hongdo Island in Sinan-gun, Jeollanam-do, South Korea. Specimens collected in previous Hongdo flora studies were reexamined using a relevant biodiversity database, and field surveys were carried out 22 times from April 2003 to October 2020. Based on the specimens collected in both the previous studies and this study, the vascular plants identified on Hongdo comprise 472 taxa in 102 families, 296 genera, 425 species, 6 subspecies, and 41 varieties. Among them, 111 taxa are newly recorded in this study, and 6 taxa are described in detail in terms of their morphological characteristics and habitat. In addition, 29 taxa were reviewed or re-identified with corresponding taxonomic annotations. In Korea, Hongdo represents the northern distributional limit of four taxa: Goodyera biflora, Damnacanthus major, Calanthe aristulifera, and Hemerocallis hongdoensis. Moreover, Hosta yingeri and Saussurea polylepis are endemic to Hongdo and nearby islands in Korea. Distribution maps of these species were prepared. Seven taxa are designated as protected species by the Ministry of Environment: 2 at level I (Sedirea japonica and Neofinetia falcata) and 5 at level II (Cymbidium macrorhizon, Woodwardia japonica, Dendrobium moniliforme, Calanthe aristulifera, and Bulbophyllum inconspicuum). Eleven taxa are red list plants as designated by the National Institute of Biological Resources, and 40 taxa are naturalized plants.

Robust Feature Parameter for Implementation of Speech Recognizer Using Support Vector Machines (SVM음성인식기 구현을 위한 강인한 특징 파라메터)

  • 김창근;박정원;허강인
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.41 no.3
    • /
    • pp.195-200
    • /
    • 2004
  • In this paper we propose an effective speech recognizer based on two recognition experiments. In general, an SVM is a classification method that separates two classes by finding an arbitrary nonlinear boundary in vector space, and it achieves high classification performance even with little training data. We compare the recognition performance of HMM and SVM as a function of the amount of training data, and investigate the recognition performance of each feature parameter while transforming the MFCC feature space using Independent Component Analysis (ICA) and Principal Component Analysis (PCA). The experiments show that SVM outperforms HMM when training data are scarce, and that the feature parameter obtained by ICA gives the highest recognition performance because of its superior linear classification capability.

Phylogenetic Analysis of Nuclear Ribosomal DNA Intergenic Spacer (IGS) I Region of Phellinus linteus (Nuclear Ribosomal DNA Intergenic Spacer(IGS) I 영역의 분석에 의한 목질진흙버섯의 계통분류학적 위치)

  • Rew, Young-Hyun;Lee, Jin-Hyung;Kim, Jong-Guk
    • The Korean Journal of Mycology
    • /
    • v.32 no.2
    • /
    • pp.148-151
    • /
    • 2004
  • This study was carried out to elucidate the phylogenetic relationship of a yellow lump, Phellinus linteus, by comparing its nuclear ribosomal intergenic spacer (IGS) I region with that of other genera of basidiomycetes retrieved from GenBank. The IGS I region of Phellinus linteus was 730 bp long; sequence homology was conserved in the 5' region, in particular at positions 1-280 bp, and decreased toward the 3' end. The ITS region has been widely studied in phylogenies of basidiomycetes, but the IGS region is not yet well understood. Our study indicates that the IGS region can be a good tool in phylogenetic studies of basidiomycetes.

Numerical Analysis of Turbulent Swirling Cold-Flow in a Cyclonic Coal Gasifier (선회분류층형 석탄가스화기내의 비반응 난류 선회유동장 해석)

  • 이진욱;나혜령;윤용승
    • Journal of Energy Engineering
    • /
    • v.6 no.2
    • /
    • pp.137-144
    • /
    • 1997
  • Turbulent swirling cold-flow in a cyclonic gasifier has been analyzed by numerical analysis. Comparison of two dimensional and three dimensional analyses has shown that concept of equivalent slit is appropriate for the two dimensionalization of three dimensional phenomena. Flow characteristics have been scrutinized by varying swirl number which is a crucial parameter in determining the flow pattern of the cyclonic gasifier. Reactive flow field has been estimated by using theoretical swirl number and equivalent slit width for reactive flow. Results show that proper flow field for the reactive coal gasification can be formed by controlling the exit area and azimuthal location of coal burners.

An Effective Concept Drift Detection Method on Streaming Data Using Probability Estimates (스트리밍 데이터에서 확률 예측치를 이용한 효과적인 개념 변화 탐지 방법)

  • Kim, Young-In;Park, Cheong Hee
    • Journal of KIISE
    • /
    • v.43 no.6
    • /
    • pp.718-723
    • /
    • 2016
  • In streaming data analysis, detecting concept drift accurately is important for maintaining the performance of a classification model. Error rates are usually used for concept drift detection. However, when prediction results are described only with binary values of 0 or 1, useful information about the behavior pattern of a classifier can be lost. In this paper, we propose an effective concept drift detection method that describes the performance pattern of a classifier using probability estimates for class prediction and detects a significant change in classifier behavior. Experimental results on synthetic and real streaming data show the efficiency of the proposed method in detecting the occurrence of concept drift.
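
The point about information lost in 0/1 error rates can be made concrete with a small sketch. It compares two hypothetical windows of a binary classifier's estimated probability for the true class: the error rate is identical in both, while the mean probability drops sharply. The window contents, the function names, and the fixed `tolerance` test are assumptions for illustration only; this is not the detection statistic proposed in the paper.

```python
def error_rate(p_true, threshold=0.5):
    """0/1 error rate: share of samples whose true-class probability is below the threshold."""
    return sum(1 for p in p_true if p < threshold) / len(p_true)

def mean_confidence(p_true):
    """Average probability the classifier assigned to the true class."""
    return sum(p_true) / len(p_true)

def drift_flag(reference, recent, tolerance=0.15):
    """Placeholder rule: flag drift when mean confidence falls by more than `tolerance`."""
    return mean_confidence(reference) - mean_confidence(recent) > tolerance

# Hypothetical true-class probabilities in an older and a newer window.
reference = [0.90, 0.95, 0.85, 0.90]
recent = [0.60, 0.55, 0.65, 0.60]

same_error = error_rate(reference) == error_rate(recent)  # both windows have no errors
flagged = drift_flag(reference, recent)                    # confidence fell from 0.9 to 0.6
```

Every prediction in both windows is still correct, so an error-rate monitor sees no change, whereas the probability estimates already show the classifier becoming less certain.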

An EMG Signal Processing Algorithm for SMUAP Pattern Classification (SMUAP의 패턴분류를 위한 근 신호처리 알고리듬)

  • Lee, Jin;Jo, Il-Jun;Byun, Youn-Shik;Hong, Woan-Hue;Kim, Sung-Hwan
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.26 no.7
    • /
    • pp.106-111
    • /
    • 1989
  • A new EMG signal processing algorithm for SMUAP pattern classification is proposed. It checks the combination and regularity of ISI using a spike counter as a decision-making routine, performs SMUAP waveform alignment in the frequency domain, and selects spikes through FIR filtering. With EMG signals recorded for 5 seconds at 10-50% MVC force level using a concentric needle electrode, five to nine SMUAP units were classified, with an identification rate greater than 55 percent. On an IBM PC/AT, processing typically took 2 minutes.

Software Quality Prediction based on Defect Severity (결함 심각도에 기반한 소프트웨어 품질 예측)

  • Hong, Euy-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.5
    • /
    • pp.73-81
    • /
    • 2015
  • Most software fault prediction studies have focused on binary classification models that predict whether an input entity has faults or not. However, the ability to predict entity fault-proneness in various severity categories is more useful because not all faults have the same severity. In this paper, we propose fault prediction models at different fault severity levels using traditional size and complexity metrics. They are ternary classification models trained with four machine learning algorithms. Empirical analysis is performed using two NASA public data sets, with accuracy as the performance measure. The evaluation results show that the backpropagation neural network model outperforms the other models on both data sets, with accuracy of about 81% and 88%, respectively.

A Study of Pattern Classification using Load Profile Data (Load Profile 데이터를 이용한 패턴분류 연구)

  • Yu In Hyeob;Lee Jin Ki;Kim Sun Ic;Ko Jong Min
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.07b
    • /
    • pp.841-843
    • /
    • 2005
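  • Deregulation has recently been introduced into the electric power industry, and the business environment is changing rapidly. Many changes are expected, but the introduction of competition among suppliers in particular has emerged as a major issue for industry participants. These changes strongly affect not only the technical development of power systems but also business strategy, and providing services to customers is becoming the core of that strategy. To offer better service, suppliers therefore need to collect and analyze customer information. Such analysis covers many areas, but identifying demand characteristics is the most fundamental. Demand characteristics are represented by the load profile data collected by remote meter reading systems. In this paper, to analyze and evaluate the load characteristics of electricity customers, we develop a method for classifying customers into groups by demand characteristics and examine the features of the resulting groups. This load analysis information is essential input for pricing design, demand and energy forecasting, transmission and distribution planning, energy efficiency improvement, and load management. It is also expected to become a key foundation for value-added power services developed in the future.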

Spam Filter by Using χ² Statistics and Support Vector Machines (카이제곱 통계량과 지지벡터기계를 이용한 스팸메일 필터)

  • Lee, Song-Wook
    • The KIPS Transactions:PartB
    • /
    • v.17B no.3
    • /
    • pp.249-254
    • /
    • 2010
  • We propose an automatic spam filter for e-mail data using Support Vector Machines (SVM). We use the lexical form of a word and its part-of-speech (POS) tags as features and select features by chi-square statistics. In the experiments, we represent each feature by TF (term frequency), TF-IDF, and binary weights. After the SVM is trained with the selected features, it classifies each e-mail as spam or not. In the experiments, the selected features improve the performance of our system, and we obtained an overall accuracy of 98.9% on the TREC05-p1 spam corpus.
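
To make the feature-selection and weighting steps in this abstract concrete, here is a minimal sketch of the chi-square statistic for one term on a 2x2 spam/non-spam contingency table, together with the three weighting schemes the abstract names. The counts, the token list, and the particular TF-IDF variant (raw count times the natural log of N/df) are assumptions chosen for illustration; the abstract does not state which formulas the author used.

```python
import math

def chi_square(a, b, c, d):
    """Chi-square statistic for a term/class 2x2 table.

    a: spam mails containing the term     b: non-spam mails containing the term
    c: spam mails without the term        d: non-spam mails without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom

def feature_weights(term, tokens, doc_freq, n_docs):
    """TF, TF-IDF, and binary weights of `term` in one tokenized mail."""
    tf = tokens.count(term)
    idf = math.log(n_docs / doc_freq)  # one common IDF variant (assumed)
    return {"tf": tf, "tfidf": tf * idf, "binary": 1 if tf > 0 else 0}

# Hypothetical counts: a term concentrated in spam scores high,
# while a term spread evenly across both classes scores zero.
informative = chi_square(40, 10, 10, 40)
uninformative = chi_square(25, 25, 25, 25)

# Hypothetical mail in which the term occurs twice; it appears in 10 of 100 mails.
w = feature_weights("free", ["free", "offer", "free"], doc_freq=10, n_docs=100)
```

Terms would then be ranked by their chi-square score and only the highest-scoring ones kept as SVM input features, each represented with one of the three weights.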