Search | Korea Science

Performance Comparison of Machine Learning Algorithms for Malware Detection (악성코드 탐지를 위한 기계학습 알고리즘의 성능 비교)

Lee, Hyun-Jong;Heo, Jae Hyeok;Hwang, Doosung
- Proceedings of the Korean Society of Computer Information Conference / 2018.01a / pp.143-146 / 2018
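Signature-based malware detection relies on the unique hash values of malicious files or on patterned attack rules, so it is weak at detecting variant malware. Malware detection based on machine learning is regarded as a way to overcome this weakness. This paper extracts n-gram and API features through static analysis, builds feature vectors from them, and compares the generalization performance of XGBoost, the k-nearest neighbor algorithm, support vector machines, neural networks, and deep learning algorithms. In the experiments, XGBoost achieved the best generalization performance at 99%, and the k-nearest neighbor algorithm required the least training time. In terms of generalization performance and time complexity, XGBoost outperformed the other algorithms in the comparison.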
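As a rough illustration of the n-gram feature vectors mentioned in this abstract, the sketch below counts byte n-grams in a binary blob and projects them onto a fixed vocabulary. The abstract does not say whether the n-grams are taken over bytes, opcodes, or something else, so byte n-grams, the function names, the sample bytes, and the vocabulary are all assumptions made for illustration.

```python
from collections import Counter


def byte_ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count every contiguous n-byte window in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_feature_vector(counts: Counter, vocabulary: list[bytes]) -> list[int]:
    """Project n-gram counts onto a fixed, ordered vocabulary."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical input bytes and vocabulary, not taken from any real sample.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x4D, 0x5A])
vocabulary = [b"\x4d\x5a", b"\x90\x00", b"\xff\xff"]

counts = byte_ngram_counts(sample, n=2)
vector = to_feature_vector(counts, vocabulary)
print(vector)
```

A vector like this could then be passed to any of the classifiers the abstract compares; which vocabulary size and n the authors used is not stated.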

Comments Classification System using Support Vector Machines and Topic Signature (지지 벡터 기계와 토픽 시그너처를 이용한 댓글 분류 시스템 언어에 독립적인 댓글 분류 시스템)

Bae, Min-Young;En, Ji-Hyun;Jang, Du-Sung;Cha, Jeong-Won
- 한국HCI학회:학술대회논문집 / 2009.02a / pp.263-266 / 2009
Comments are short and use word spacing and commas less often than general documents. We convert 7-grams into 3-grams and select key features using topic signatures. Topic signatures are widely used for feature selection in document classification and summarization. We use a support vector machine (SVM) as the classifier. Experimental results show that the proposed method outperforms previous methods. The proposed system can also be applied to other languages.

A Swearword Filter System for Online Game Chatting (온라인게임 채팅에서의 비속어 차단시스템)

Lee, Song-Wook
- Journal of the Korea Institute of Information and Communication Engineering / v.15 no.7 / pp.1531-1536 / 2011
We propose an automatic swearword filter system for online game chat based on support vector machines (SVMs). We collected chat sentences from online games and tagged each one as either a normal sentence or a sentence containing swearwords. We use syllable n-grams and the lexical/part-of-speech (POS) tags of words as features, and select useful features with the chi-square statistic. Each selected feature is given a binary weight and used to train the SVM, which classifies each chat sentence as containing a swearword or not. In our experiments, the system achieved an overall F1 score of 90.4%.
https://doi.org/10.6109/jkiice.2011.15.7.1531
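The feature selection step in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it treats each character as one syllable (which holds for Hangul text), represents a sentence as the set of n-grams it contains (binary weights), and scores one feature with the standard chi-square statistic for a 2x2 table. The function names and the toy sentences with the placeholder token "xx" are hypothetical.

```python
def syllable_ngrams(sentence: str, n: int = 2) -> set[str]:
    """Binary features: the set of character n-grams present in the sentence."""
    return {sentence[i:i + n] for i in range(len(sentence) - n + 1)}


def chi_square(feature: str, docs: list[set[str]], labels: list[int]) -> float:
    """Chi-square statistic of the 2x2 table: feature presence vs. class label."""
    a = sum(1 for d, y in zip(docs, labels) if feature in d and y == 1)
    b = sum(1 for d, y in zip(docs, labels) if feature in d and y == 0)
    c = sum(1 for d, y in zip(docs, labels) if feature not in d and y == 1)
    e = sum(1 for d, y in zip(docs, labels) if feature not in d and y == 0)
    total = a + b + c + e
    denom = (a + b) * (c + e) * (a + c) * (b + e)
    return total * (a * e - b * c) ** 2 / denom if denom else 0.0


# Toy data: label 1 marks a sentence containing the placeholder token "xx".
sentences = ["good game", "bad word xx", "nice play", "xx you"]
labels = [0, 1, 0, 1]
docs = [syllable_ngrams(s) for s in sentences]

score = chi_square("xx", docs, labels)
print(score)
```

Features would then be ranked by this score and the top ones kept as binary inputs to the classifier; the cutoff used in the paper is not given in the abstract.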

A Syllable Kernel based Sentiment Classification for Movie Reviews (음절 커널 기반 영화평 감성 분류)

Kim, Sang-Do;Park, Seong-Bae;Park, Se-Young;Lee, Sang-Jo;Kim, Kweon-Yang
- Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.202-207 / 2010
In this paper, we present an automatic sentiment classification method for online movie reviews that do not contain explicit sentiment rating scores. To classify sentiment polarity as positive or negative, we use a support vector machine classifier based on a syllable kernel, which is an extension of the string kernel. Experimental results show that the proposed syllable kernel model can be used effectively for sentiment classification of online movie reviews, which usually contain many errors such as spacing or spelling mistakes.
https://doi.org/10.5391/JKIIS.2010.20.2.202

Visual Object Tracking Using Superpixel-Based Graph Cuts (슈퍼픽셀 기반의 그래프 컷을 이용한 객체 추적)

Lee, Dae-Youn;Kim, Chang-Su
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.06a / pp.64-65 / 2013
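This paper proposes a method that improves object tracking accuracy by applying a graph cut algorithm at the superpixel level. First, an image segmentation technique divides the input image into superpixels, and a feature vector based on a color histogram is generated for each superpixel. Next, support vector machines are applied to the feature vectors to estimate the object probability of each superpixel. With the object probability as the data term and the difference between the feature vectors of neighboring superpixels as the smooth term, the graph cuts algorithm classifies the superpixels into object and background, and the object window that contains as many object superpixels as possible is found. Experimental results show that the proposed technique achieves better object tracking performance than existing techniques.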
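The labeling step in this abstract can be written, in a generic graph-cut form, as minimizing an energy over superpixel labels. The abstract does not give the exact terms, so everything below is an assumption for illustration: the label $l_i \in \{0, 1\}$ (background or object), the object probability $p_i$ estimated by the SVM, the feature vector $f_i$, the weight $\lambda$, the scale $\sigma$, the neighborhood set $\mathcal{N}$, and the negative-log and exponential forms of the two terms.

$$
E(l) = \sum_{i} D_i(l_i) + \lambda \sum_{(i,j) \in \mathcal{N}} V_{ij}\,[\,l_i \neq l_j\,]
$$

$$
D_i(1) = -\log p_i, \qquad D_i(0) = -\log(1 - p_i), \qquad V_{ij} = \exp\!\left(-\frac{\lVert f_i - f_j \rVert^2}{\sigma^2}\right)
$$

Under these assumed forms, a superpixel with a high object probability is cheap to label as object, and separating two neighbors is cheap only when their feature vectors differ strongly.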

An Experimental Study on the Relation Extraction from Biomedical Abstracts using Machine Learning (기계 학습을 이용한 바이오 분야 학술 문헌에서의 관계 추출에 대한 실험적 연구)

Choi, Sung-Pil
- Journal of the Korean Society for Library and Information Science / v.50 no.2 / pp.309-336 / 2016
This paper introduces a relation extraction system that identifies and classifies semantic relations between biomedical entities in scientific texts using machine learning methods such as support vector machines (SVMs). The system includes many functions for extracting various linguistic features from sentences that contain a pair of biomedical entities and for applying those features to the training of relation extraction models to maximize their performance. Three globally representative collections from biomedical domains were used in the experiments, which demonstrate the system's superiority across various biomedical domains. The intensive experimental study conducted in this paper is therefore likely to provide a meaningful foundation for research on bio-text analysis based on machine learning.
https://doi.org/10.4275/KSLIS.2016.50.2.309

Effect of an elastic intermediate support on the vibration characteristics of fluid conveying pipes (배관계 진동특성에 미치는 탄성 중간지지대의 영향)

전오성;정진태;이용봉;황철호
- Transactions of the Korean Society of Mechanical Engineers / v.15 no.6 / pp.1799-1806 / 1991
The effect of an elastic intermediate support on the vibration characteristics of a fluid-conveying pipe system, modeled as simply-simply supported and fixed-fixed supported pipes, has been investigated. The approach is based on solving the closed-form equation of 4th-order polynomials. The changes in natural frequency and critical velocity are also investigated as the fluid density, the fluid velocity, and the position and stiffness of the elastic intermediate support are varied. The results show that the vibration characteristics of the pipe system can be controlled by changing the position and/or stiffness of the elastic intermediate support.
https://doi.org/10.22634/KSME.1991.15.6.1799

Improving the Performance of SVM Text Categorization with Inter-document Similarities (문헌간 유사도를 이용한 SVM 분류기의 문헌분류성능 향상에 관한 연구)

Lee, Jae-Yun
- Journal of the Korean Society for Information Management / v.22 no.3 s.57 / pp.261-287 / 2005
The purpose of this paper is to explore ways to improve the performance of an SVM (support vector machine) text classifier using inter-document similarities. SVMs are powerful machine learning systems and are considered the state-of-the-art technique for automatic document classification. This paper suggests an SVM-based text categorization approach in which features are represented with document vectors. In this approach, document vectors are used as features instead of index terms, and vector similarities are used as feature values instead of term weights. Experiments show that an SVM classifier with document vector features can improve document classification performance. For run-time efficiency, two methods are developed: one selects document vector features, and the other uses category centroid vector features instead. Experiments on these two methods show that a small vector feature set can outperform conventional methods based on index term features.
https://doi.org/10.3743/KOSIM.2005.22.3.261
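The idea of using document vectors as features and vector similarities as feature values can be sketched as below. The abstract does not name the similarity measure, so cosine similarity over sparse term-weight dictionaries is an assumption here, as are the function names and the toy reference documents.

```python
import math


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_features(doc: dict[str, float],
                        references: list[dict[str, float]]) -> list[float]:
    """One feature per reference vector: its similarity to the document."""
    return [cosine(doc, ref) for ref in references]


# Hypothetical reference vectors: selected documents or category centroids.
references = [
    {"svm": 1.0, "kernel": 1.0},
    {"library": 1.0, "catalog": 1.0},
]
doc = {"svm": 1.0, "kernel": 1.0}

features = similarity_features(doc, references)
print(features)
```

Using category centroids as the reference vectors keeps the feature count equal to the number of categories, which matches the run-time motivation stated in the abstract.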

An Electric Load Forecasting Scheme for University Campus Buildings Using Artificial Neural Network and Support Vector Regression (인공 신경망과 지지 벡터 회귀분석을 이용한 대학 캠퍼스 건물의 전력 사용량 예측 기법)

Moon, Jihoon;Jun, Sanghoon;Park, Jinwoong;Choi, Young-Hwan;Hwang, Eenjun
- KIPS Transactions on Computer and Communication Systems / v.5 no.10 / pp.293-302 / 2016
Since electricity is produced and consumed simultaneously, predicting the electric load and securing affordable electric power are necessary for a reliable power supply. In particular, a university campus is one of the highest power-consuming institutions and tends to show wide variation in electric load depending on time and environment. For these reasons, an accurate electric load forecasting method that can predict power consumption in real time is required for efficient power supply and management. Although various factors that influence power consumption in educational institutions have been identified by analyzing power consumption patterns and usage cases, further study is required for the quantitative prediction of electric load. In this paper, we build an electric load forecasting model by implementing and evaluating various machine learning algorithms. To do so, we consider three building clusters on a campus and collect their power consumption every 15 minutes for more than one year. In preprocessing, features are represented so as to reflect the periodic characteristics of the data, and principal component analysis is applied to the features. To train the electric load forecasting model, we employ both an artificial neural network and a support vector machine. We evaluate the prediction performance of each forecasting model by 5-fold cross-validation and compare the predictions with the actual electric load.
https://doi.org/10.3745/KTCCS.2016.5.10.293
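One common way to represent the periodic characteristics of 15-minute readings is to map the time of day and day of week onto sine and cosine pairs, so that 23:45 and 00:00 end up close together. The abstract does not say which encoding the authors used, so the sketch below is an assumed choice, and the function name and constants are hypothetical.

```python
import math
from datetime import datetime

SLOTS_PER_DAY = 24 * 4  # one slot per 15-minute reading


def periodic_time_features(ts: datetime) -> tuple[float, float, float, float]:
    """Encode time of day and day of week as points on two unit circles."""
    slot = ts.hour * 4 + ts.minute // 15
    day_angle = 2 * math.pi * slot / SLOTS_PER_DAY
    week_angle = 2 * math.pi * ts.weekday() / 7
    return (
        math.sin(day_angle),
        math.cos(day_angle),
        math.sin(week_angle),
        math.cos(week_angle),
    )


# Midnight on a Monday maps to angle zero on both circles.
monday_midnight = periodic_time_features(datetime(2016, 1, 4, 0, 0))
print(monday_midnight)
```

Features like these could be fed to principal component analysis and then to the forecasting models, as the abstract describes.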

Sentiment Classification of Movie Reviews using Levenshtein Distance (Levenshtein 거리를 이용한 영화평 감성 분류)

Ahn, Kwang-Mo;Kim, Yun-Suk;Kim, Young-Hoon;Seo, Young-Hoon
- Journal of Digital Contents Society / v.14 no.4 / pp.581-587 / 2013
In this paper, we propose a sentiment classification method that uses Levenshtein distance. We generate a BOW (bag of words) by applying Levenshtein distance to sentiment features and use it as the training set. The machine learning algorithms we use are SVMs (support vector machines) and NB (naive Bayes). As the data set, we gathered 2,385 movie reviews from an online movie community (Daum movie service). From the collected reviews, we manually picked out sentiment words and compiled 778 of them. In the experiment, we train the classifiers on the BOW generated by applying Levenshtein distance to the sentiment words, and then evaluate classifier performance by 10-fold cross-validation. In the evaluation, multinomial naive Bayes achieved an accuracy of 85.46% when the Levenshtein distance was 3. The experimental results show that classification performance is less affected by spelling errors in documents.
https://doi.org/10.9728/dcs.2013.14.4.581
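For readers unfamiliar with the measure, the sketch below computes Levenshtein (edit) distance with the standard dynamic-programming recurrence and uses it to map a misspelled token onto the closest word in a sentiment lexicon. The distance function is the textbook algorithm; the matching helper, its name, and the two-word lexicon are hypothetical and are not taken from the paper.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def match_sentiment_word(token: str, lexicon: list[str],
                         max_distance: int) -> Optional[str]:
    """Return the closest lexicon word within max_distance, or None."""
    best = min(lexicon, key=lambda word: levenshtein(token, word), default=None)
    if best is not None and levenshtein(token, best) <= max_distance:
        return best
    return None


lexicon = ["excellent", "boring"]  # hypothetical sentiment words
matched = match_sentiment_word("excelent", lexicon, max_distance=3)
print(matched)
```

Mapping near-misses onto lexicon entries in this way is one plausible reading of how a distance threshold could make a bag-of-words model tolerate spelling errors.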
