• Title/Summary/Keyword: Classification accuracy

Classification of Cyber Attack Group using Scikit Learn and Cyber Threat Datasets (싸이킷런과 사이버위협 데이터셋을 이용한 사이버 공격 그룹의 분류)

  • Kim, Kyungshin;Lee, Hojun;Kim, Sunghee;Kim, Byungik;Na, Wonshik;Kim, Donguk;Lee, Jeongwhan
    • Journal of Convergence for Information Technology / v.8 no.6 / pp.165-171 / 2018
  • The most threatening attack that has become a hot topic in recent IT security is the APT attack. So far, there is no way to respond to APT attacks except by using artificial intelligence techniques. Here, we implemented a machine learning algorithm for analyzing cyber threat data with Scikit Learn, a big data machine learning framework, using a dataset of collected cyber attack cases. The results showed an attack classification accuracy close to 70%. This result can be developed into an algorithm for security control systems in the future.

Performance Evaluation of a Machine Learning Model Based on Data Feature Using Network Data Normalization Technique (네트워크 데이터 정형화 기법을 통한 데이터 특성 기반 기계학습 모델 성능평가)

  • Lee, Wooho;Noh, BongNam;Jeong, Kimoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.785-794 / 2019
  • Recently, deep learning technology, one of the technologies of the fourth industrial revolution, has been used to identify hidden meaning in network data that is difficult to detect in the security field and to predict attacks. The properties and quality of the data sources must be analyzed before selecting the deep learning algorithm to be used for intrusion detection, because contamination of the training data affects detection. Therefore, the characteristics of the data should be identified and features selected. In this paper, the characteristics of malware were analyzed using a network dataset, and the effect of each feature on performance was analyzed when a deep learning model was applied. A traffic classification experiment compared features according to network characteristics, and an accuracy of 96.52% was achieved with the selected features.

A Study on Malware Identification System Using Static Analysis Based Machine Learning Technique (정적 분석 기반 기계학습 기법을 활용한 악성코드 식별 시스템 연구)

  • Kim, Su-jeong;Ha, Ji-hee;Oh, Soo-hyun;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.775-784 / 2019
  • Malware attacks are continuously increasing in various environments such as mobile, IoT, Windows, and Mac due to the emergence of new and variant malware, and signature-based countermeasures are limited in their ability to detect malware. In addition, analysis performance is deteriorating due to obfuscation, packing, and anti-VM techniques. In this paper, we propose a system that detects malware with machine learning by using a similarity hashing-based pattern detection technique and static analysis after classifying files according to packing. This enables more efficient detection because it utilizes both pattern-based detection, which works well for known malware, and machine learning-based detection, which is advantageous for detecting new and variant malware. The system achieved a detection accuracy of 95.79% or more on the benign and malware sample files provided by the AI-based malware detection track of the Information Security R&D Data Challenge 2018 competition. In the future, it is expected that detection performance can be improved by applying a feature vector and a detection method suited to the characteristics of packed files.

Performance Comparison of Machine Learning Algorithms for TAB Digit Recognition (타브 숫자 인식을 위한 기계 학습 알고리즘의 성능 비교)

  • Heo, Jaehyeok;Lee, Hyunjung;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.8 no.1 / pp.19-26 / 2019
  • In this paper, the classification performance of learning algorithms is compared for TAB digit recognition. The TAB digits that are segmented from TAB musical notes contain TAB lines and musical symbols. A labeling method and a non-linear filter are designed and applied to extract only the fret digits. A shift operation in four directions is applied to generate more data. The selected models are a Bayesian classifier, a support vector machine, prototype-based learning, a multi-layer perceptron, and a convolutional neural network. The results show that the mean accuracy of the Bayesian classifier is about 85.0%, while that of the others exceeds 99.0%. In addition, the convolutional neural network outperforms the others in terms of generalization and the data preprocessing step.
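The four-direction shift used to generate extra training data can be illustrated with a minimal sketch. The abstract does not give the shift size, padding value, or image format, so the one-pixel step, the zero fill, the list-of-lists image, and the function names `shift` and `augment_four_directions` are all assumptions made for illustration.

```python
def shift(image, dy, dx, fill=0):
    """Shift a 2D list-of-lists image by (dy, dx); exposed cells get `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def augment_four_directions(image, step=1):
    """Return the original image followed by copies shifted up, down, left, right."""
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step)]
    return [image] + [shift(image, dy, dx) for dy, dx in offsets]


# Hypothetical 3x3 "digit": a vertical stroke in the middle column.
digit = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
samples = augment_four_directions(digit)
print(len(samples))  # one original plus four shifted copies
```

Each input image yields five samples under these assumptions, and all of them keep the label of the original digit.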

Therapeutic Robot Action Design for ASD Children Using Speech Data (음성 정보를 이용한 자폐아 치료용 로봇의 동작 설계)

  • Lee, Jin-Gyu;Lee, Bo-Hee
    • Journal of IKEEE / v.22 no.4 / pp.1123-1130 / 2018
  • A cat robot for Autism Spectrum Disorder (ASD) treatment was designed and field-tested. The designed robot expressed emotion through touch-based interaction and produced reasonable emotional expressions based on an Artificial Neural Network (ANN). However, these operations were difficult to use in various healing activities. In this paper, we describe a motion design that can be used in a variety of contexts and reacts flexibly to various situations. As a necessary element, a speech recognition system using a speech data collection method and an ANN is proposed, and the classification results are analyzed after the experiment. In the future, this ANN will be improved by collecting more varied voice data to raise its accuracy, and its effectiveness will be checked through field tests.

Crack Detection of Concrete Structure Using Deep Learning and Image Processing Method in Geotechnical Engineering (딥러닝과 영상처리기법을 이용한 콘크리트 지반 구조물 균열 탐지)

  • Kim, Ah-Ram;Kim, Donghyeon;Byun, Yo-Seph;Lee, Seong-Won
    • Journal of the Korean Geotechnical Society / v.34 no.12 / pp.145-154 / 2018
  • Damage investigation and inspection of concrete facilities such as bridges, tunnels, and retaining walls is usually performed visually by an inspector using surveying tools in the field. These methods depend heavily on the subjectivity of the inspector, which may reduce the objectivity and reliability of the record. Therefore, new image processing techniques are necessary to automatically detect cracks and objectively analyze their characteristics. In this study, deep learning and image processing techniques were developed to detect cracks and analyze their characteristics in images of concrete facilities. A two-stage image processing pipeline was proposed to obtain the crack segmentation and the crack characteristics. The performance of the method was tested using various labeled crack images, and the results showed over 90% accuracy on crack classification and segmentation. Finally, the crack characteristics (length and thickness) of crack images taken in the field were analyzed, and the performance of the developed technique was verified by comparing the results with actual measured values and evaluating the errors.

A Practical Implementation of Deep Learning Method for Supporting the Classification of Breast Lesions in Ultrasound Images

  • Han, Seokmin;Lee, Suchul;Lee, Jun-Rak
    • International journal of advanced smart convergence / v.8 no.1 / pp.24-34 / 2019
  • In this research, a practical deep learning framework for differentiating lesions and nodules in breast ultrasound images is proposed. A total of 7408 ultrasound breast images from 5151 patient cases were collected. All cases were biopsy-proven, and lesions were semi-automatically segmented. To compensate for the shift caused in the segmentation, the boundaries of each lesion were drawn using a Fully Convolutional Network (FCN) segmentation method based on a point specified by the radiologist. The dataset consists of 4254 benign and 3154 malignant lesions. Of the 7408 images, 6579 were used for training and 829 for testing. The training images were augmented by varying the margin between the boundary of each lesion and the boundary of the image itself. The images were processed through histogram equalization, image cropping, and margin augmentation. The networks trained with and without augmentation all had an AUC over 0.95. The network exhibited about 90% accuracy, 0.86 sensitivity, and 0.95 specificity. Although the proposed framework still requires a radiologist to point to the location of the target ROI, it showed promising results. It supports human radiologists in their work and helps to create a fluent diagnostic workflow that meets the fundamental purpose of CADx.
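Accuracy, sensitivity, and specificity, as reported above, are all derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not taken from the paper's 829-image test set; the function name `binary_metrics` is likewise an illustrative assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: malignant cases predicted malignant / benign
    tn/fp: benign cases predicted benign / malignant
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # fraction of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # true positive rate (recall on malignant)
        "specificity": tn / (tn + fp),      # true negative rate (recall on benign)
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are each computed within a single true class, so they do not depend on the benign-to-malignant ratio of the test set, whereas accuracy does.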

The Application of the Next-generation Medium Satellite C-band Radar Images in Environmental Field Works

  • Han, Hyeon-gyeong;Lee, Moungjin
    • Korean Journal of Remote Sensing / v.35 no.4 / pp.617-623 / 2019
  • Numerous water disasters have recently occurred all over the world, including South Korea, due to global climate change. As water-related disasters occur over wide areas and their sites are difficult for people to access, it is necessary to monitor them using satellites. The Ministry of Environment and K-water plan to launch the next-generation medium satellite No. 5 (water resource/water disaster satellite), equipped with C-band synthetic aperture radar (SAR), in 2025. C-band SAR has the advantage of being able to observe water resources twice a day at high resolution, both day and night, regardless of weather conditions. Currently, RADARSAT-2 and Sentinel-1, which are equipped with C-band SAR, fulfill the purpose of their launch and are used in various environmental fields such as forest structure detection and coastline change monitoring, as well as for their primary purposes, including the detection of flooding, drought, and soil moisture change, utilizing the advantages of SAR. This study therefore aimed to analyze the characteristics of the next-generation medium satellite No. 5 and its application in environmental fields. Our findings showed that it can be used to improve the precision of existing environmental spatial information, such as the classification accuracy of land cover maps, in environmental field work. It also enables us to observe forests and water resources in North Korea, which are difficult to access geographically. It is ultimately expected that this will enable monitoring of the whole Korean Peninsula in various environmental fields and help in relevant responses and policy support.

Prediction of replacement period of shield TBM disc cutter using SVM (SVM 기법을 이용한 쉴드 TBM 디스크 커터 교환 주기 예측)

  • La, You-Sung;Kim, Myung-In;Kim, Bumjoo
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.5 / pp.641-656 / 2019
  • In this study, a machine learning method is proposed for predicting the optimal replacement period of shield TBM (Tunnel Boring Machine) disc cutters. To do this, a large dataset of ground conditions, disc cutter replacement records, and TBM excavation-related data, collected from a shield TBM tunnel site in Korea, was built and used to construct a disc cutter replacement period prediction model with a machine learning algorithm, SVM (Support Vector Machine), and to assess the performance of the model. The results showed that the RBF (Radial Basis Function) SVM performed best among the three SVM classification functions (80% accuracy and 10% error rate on average). Comparing ground types, the more disc cutter replacement data existed, the better the prediction results were. From these results, it is expected that machine learning methods will become widely used in practice in the near future as more data is accumulated and the machine learning models continue to be fine-tuned.

Prediction for Periodontal Disease using Gene Expression Profile Data based on Machine Learning (기계학습 기반 유전자 발현 데이터를 이용한 치주질환 예측)

  • Rhee, Je-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.8 / pp.903-909 / 2019
  • Periodontal disease is observed in many adults. However, its molecular mechanism and how to treat the disease at the molecular level are not clearly known. Here, we investigated the molecular differences between periodontal disease and normal controls using gene expression data. In particular, we checked whether periodontal disease and normal tissues could be classified by machine learning algorithms using gene expression data. Moreover, we identified the differentially expressed genes and their functions. As a result, we found that the periodontal disease and normal control samples were clearly clustered. In addition, by applying several classification algorithms, such as decision trees, random forests, and support vector machines, the two sample groups were classified well, with high accuracy, sensitivity, and specificity, even though the dataset was imbalanced. Finally, we found that genes related to inflammation and immune response usually had distinct patterns between the two classes.
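The reason for reporting sensitivity and specificity alongside accuracy on an imbalanced dataset can be seen from a small sketch. The class ratio below (90 disease samples, 10 controls) and the always-positive baseline are hypothetical and are not taken from the paper; `evaluate` is an illustrative helper name.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and balanced accuracy for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical imbalanced labels: 1 = disease, 0 = normal control.
y_true = [1] * 90 + [0] * 10
# A trivial baseline that labels every sample as disease.
y_pred = [1] * 100

result = evaluate(y_true, y_pred)
print(result)
```

In this hypothetical case the trivial baseline scores high accuracy while its specificity is zero, which is why a classifier on imbalanced data is more convincing when all three figures are high, as the abstract reports.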