• Title/Summary/Keyword: 학습 기반 필터링 기법

Search Result 66, Processing Time 0.026 seconds

Distributed Processing System Design and Implementation for Feature Extraction from Large-Scale Malicious Code (대용량 악성코드의 특징 추출 가속화를 위한 분산 처리 시스템 설계 및 구현)

  • Lee, Hyunjong;Euh, Seongyul;Hwang, Doosung
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.2
    • /
    • pp.35-40
    • /
    • 2019
  • Traditional Malware Detection is susceptible for detecting malware which is modified by polymorphism or obfuscation technology. By learning patterns that are embedded in malware code, machine learning algorithms can detect similar behaviors and replace the current detection methods. Data must collected continuously in order to learn malicious code patterns that change over time. However, the process of storing and processing a large amount of malware files is accompanied by high space and time complexity. In this paper, an HDFS-based distributed processing system is designed to reduce space complexity and accelerate feature extraction time. Using a distributed processing system, we extract two API features based on filtering basis, 2-gram feature and APICFG feature and the generalization performance of ensemble learning models is compared. In experiments, the time complexity of the feature extraction was improved about 3.75 times faster than the processing time of a single computer, and the space complexity was about 5 times more efficient. The 2-gram feature was the best when comparing the classification performance by feature, but the learning time was long due to high dimensionality.

Residual Convolutional Recurrent Neural Network-Based Sound Event Classification Applicable to Broadcast Captioning Services (자막방송을 위한 잔차 합성곱 순환 신경망 기반 음향 사건 분류)

  • Kim, Nam Kyun;Kim, Hong Kook;Ahn, Chung Hyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2021.06a
    • /
    • pp.26-27
    • /
    • 2021
  • 본 논문에서는 자막방송 제공을 위해 방송콘텐츠를 이해하는 방법으로 잔차 합성곱 순환신경망 기반 음향 사건 분류 기법을 제안한다. 제안된 기법은 잔차 합성곱 신경망과 순환 신경망을 연결한 구조를 갖는다. 신경망의 입력 특징으로는 멜-필터벵크 특징을 활용하고, 잔차 합성곱 신경망은 하나의 스템 블록과 5개의 잔차 합성곱 신경망으로 구성된다. 잔차 합성곱 신경망은 잔차 학습으로 구성된 합성곱 신경망과 기존의 합성곱 신경망 대비 특징맵의 표현 능력 향상을 위해 합성곱 블록 주의 모듈로 구성한다. 추출된 특징맵은 순환 신경망에 연결되고, 최종적으로 음향 사건 종류와 시간정보를 추출하는 완전연결층으로 연결되는 구조를 활용한다. 제안된 모델 훈련을 위해 라벨링되지 않는 데이터 활용이 가능한 평균 교사 모델을 기반으로 훈련하였다. 제안된 모델의 성능평가를 위해 DCASE 2020 챌린지 Task 4 데이터 셋을 활용하였으며, 성능 평가 결과 46.8%의 이벤트 단위의 F1-score를 얻을 수 있었다.

  • PDF

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter (트위터 기반 이벤트 탐지에서의 기계학습을 통한 지명 노이즈제거)

  • Woo, Seungmin;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.10
    • /
    • pp.447-454
    • /
    • 2015
  • This paper proposes geographical name denoising by machine learning of event detection based on twitter. Recently, the increasing number of smart phone users are leading the growing user of SNS. Especially, the functions of short message (less than 140 words) and follow service make twitter has the power of conveying and diffusing the information more quickly. These characteristics and mobile optimised feature make twitter has fast information conveying speed, which can play a role of conveying disasters or events. Related research used the individuals of twitter user as the sensor of event detection to detect events that occur in reality. This research employed geographical name as the keyword by using the characteristic that an event occurs in a specific place. However, it ignored the denoising of relationship between geographical name and homograph, it became an important factor to lower the accuracy of event detection. In this paper, we used removing and forecasting, these two method to applied denoising technique. First after processing the filtering step by using noise related database building, we have determined the existence of geographical name by using the Naive Bayesian classification. Finally by using the experimental data, we earned the probability value of machine learning. On the basis of forecast technique which is proposed in this paper, the reliability of the need for denoising technique has turned out to be 89.6%.

A Study on Orthogonal Image Detection Precision Improvement Using Data of Dead Pine Trees Extracted by Period Based on U-Net model (U-Net 모델에 기반한 기간별 추출 소나무 고사목 데이터를 이용한 정사영상 탐지 정밀도 향상 연구)

  • Kim, Sung Hun;Kwon, Ki Wook;Kim, Jun Hyun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.40 no.4
    • /
    • pp.251-260
    • /
    • 2022
  • Although the number of trees affected by pine wilt disease is decreasing, the affected area is expanding across the country. Recently, with the development of deep learning technology, it is being rapidly applied to the detection study of pine wilt nematodes and dead trees. The purpose of this study is to efficiently acquire deep learning training data and acquire accurate true values to further improve the detection ability of U-Net models through learning. To achieve this purpose, by using a filtering method applying a step-by-step deep learning algorithm the ambiguous analysis basis of the deep learning model is minimized, enabling efficient analysis and judgment. As a result of the analysis the U-Net model using the true values analyzed by period in the detection and performance improvement of dead pine trees of wilt nematode using the U-Net algorithm had a recall rate of -0.5%p than the U-Net model using the previously provided true values, precision was 7.6%p and F-1 score was 4.1%p. In the future, it is judged that there is a possibility to increase the precision of wilt detection by applying various filtering techniques, and it is judged that the drone surveillance method using drone orthographic images and artificial intelligence can be used in the pine wilt nematode disaster prevention project.

Estimating Gastrointestinal Transition Location Using CNN-based Gastrointestinal Landmark Classifier (CNN 기반 위장관 랜드마크 분류기를 이용한 위장관 교차점 추정)

  • Jang, Hyeon Woong;Lim, Chang Nam;Park, Ye-Suel;Lee, Gwang Jae;Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.3
    • /
    • pp.101-108
    • /
    • 2020
  • Since the performance of deep learning techniques has recently been proven in the field of image processing, there are many attempts to perform classification, analysis, and detection of images using such techniques in various fields. Among them, the expectation of medical image analysis software, which can serve as a medical diagnostic assistant, is increasing. In this study, we are attention to the capsule endoscope image, which has a large data set and takes a long time to judge. The purpose of this paper is to distinguish the gastrointestinal landmarks and to estimate the gastrointestinal transition location that are common to all patients in the judging of capsule endoscopy and take a lot of time. To do this, we designed CNN-based Classifier that can identify gastrointestinal landmarks, and used it to estimate the gastrointestinal transition location by filtering the results. Then, we estimate gastrointestinal transition location about seven of eight patients entered the suspected gastrointestinal transition area. In the case of change from the stomach to the small intestine(pylorus), and change from the small intestine to the large intestine(ileocecal valve), we can check all eight patients were found to be in the suspected gastrointestinal transition area. we can found suspected gastrointestinal transition area in the range of 100 frames, and if the reader plays images at 10 frames per second, the gastrointestinal transition could be found in 10 seconds.

Implementation of DTW-kNN-based Decision Support System for Discriminating Emerging Technologies (DTW-kNN 기반의 유망 기술 식별을 위한 의사결정 지원 시스템 구현 방안)

  • Jeong, Do-Heon;Park, Ju-Yeon
    • Journal of Industrial Convergence
    • /
    • v.20 no.8
    • /
    • pp.77-84
    • /
    • 2022
  • This study aims to present a method for implementing a decision support system that can be used for selecting emerging technologies by applying a machine learning-based automatic classification technique. To conduct the research, the architecture of the entire system was built and detailed research steps were conducted. First, emerging technology candidate items were selected and trend data was automatically generated using a big data system. After defining the conceptual model and pattern classification structure of technological development, an efficient machine learning method was presented through an automatic classification experiment. Finally, the analysis results of the system were interpreted and methods for utilization were derived. In a DTW-kNN-based classification experiment that combines the Dynamic Time Warping(DTW) method and the k-Nearest Neighbors(kNN) classification model proposed in this study, the identification performance was up to 87.7%, and particularly in the 'eventual' section where the trend highly fluctuates, the maximum performance difference was 39.4% points compared to the Euclidean Distance(ED) algorithm. In addition, through the analysis results presented by the system, it was confirmed that this decision support system can be effectively utilized in the process of automatically classifying and filtering by type with a large amount of trend data.

Recognition of Resident Registration Card using ART2-based RBF Network and face Verification (ART2 기반 RBF 네트워크와 얼굴 인증을 이용한 주민등록증 인식)

  • Kim Kwang-Baek;Kim Young-Ju
    • Journal of Intelligence and Information Systems
    • /
    • v.12 no.1
    • /
    • pp.1-15
    • /
    • 2006
  • In Korea, a resident registration card has various personal information such as a present address, a resident registration number, a face picture and a fingerprint. A plastic-type resident card currently used is easy to forge or alter and tricks of forgery grow to be high-degree as time goes on. So, whether a resident card is forged or not is difficult to judge by only an examination with the naked eye. This paper proposed an automatic recognition method of a resident card which recognizes a resident registration number by using a refined ART2-based RBF network newly proposed and authenticates a face picture by a template image matching method. The proposed method, first, extracts areas including a resident registration number and the date of issue from a resident card image by applying Sobel masking, median filtering and horizontal smearing operations to the image in turn. To improve the extraction of individual codes from extracted areas, the original image is binarized by using a high-frequency passing filter and CDM masking is applied to the binaried image fur making image information of individual codes better. Lastly, individual codes, which are targets of recognition, are extracted by applying 4-directional contour tracking algorithm to extracted areas in the binarized image. And this paper proposed a refined ART2-based RBF network to recognize individual codes, which applies ART2 as the loaming structure of the middle layer and dynamicaly adjusts a teaming rate in the teaming of the middle and the output layers by using a fuzzy control method to improve the performance of teaming. Also, for the precise judgement of forgey of a resident card, the proposed method supports a face authentication by using a face template database and a template image matching method. For performance evaluation of the proposed method, this paper maked metamorphoses of an original image of resident card such as a forgey of face picture, an addition of noise, variations of contrast variations of intensity and image blurring, and applied these images with original images to experiments. The results of experiment showed that the proposed method is excellent in the recognition of individual codes and the face authentication fur the automatic recognition of a resident card.

  • PDF

SVM-Based EEG Signal for Hand Gesture Classification (서포트 벡터 머신 기반 손동작 뇌전도 구분에 대한 연구)

  • Hong, Seok-min;Min, Chang-gi;Oh, Ha-Ryoung;Seong, Yeong-Rak;Park, Jun-Seok
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.29 no.7
    • /
    • pp.508-514
    • /
    • 2018
  • An electroencephalogram (EEG) evaluates the electrical activity generated by brain cell interactions that occur during brain activity, and an EEG can evaluate the brain activity caused by hand movement. In this study, a 16-channel EEG was used to measure the EEG generated before and after hand movement. The measured data can be classified as a supervised learning model, a support vector machine (SVM). To shorten the learning time of the SVM, a feature extraction and vector dimension reduction by filtering is proposed that minimizes motion-related information loss and compresses EEG information. The classification results showed an average of 72.7% accuracy between the sitting position and the hand movement at the electrodes of the frontal lobe.

Comparison of CNN and GAN-based Deep Learning Models for Ground Roll Suppression (그라운드-롤 제거를 위한 CNN과 GAN 기반 딥러닝 모델 비교 분석)

  • Sangin Cho;Sukjoon Pyun
    • Geophysics and Geophysical Exploration
    • /
    • v.26 no.2
    • /
    • pp.37-51
    • /
    • 2023
  • The ground roll is the most common coherent noise in land seismic data and has an amplitude much larger than the reflection event we usually want to obtain. Therefore, ground roll suppression is a crucial step in seismic data processing. Several techniques, such as f-k filtering and curvelet transform, have been developed to suppress the ground roll. However, the existing methods still require improvements in suppression performance and efficiency. Various studies on the suppression of ground roll in seismic data have recently been conducted using deep learning methods developed for image processing. In this paper, we introduce three models (DnCNN (De-noiseCNN), pix2pix, and CycleGAN), based on convolutional neural network (CNN) or conditional generative adversarial network (cGAN), for ground roll suppression and explain them in detail through numerical examples. Common shot gathers from the same field were divided into training and test datasets to compare the algorithms. We trained the models using the training data and evaluated their performances using the test data. When training these models with field data, ground roll removed data are required; therefore, the ground roll is suppressed by f-k filtering and used as the ground-truth data. To evaluate the performance of the deep learning models and compare the training results, we utilized quantitative indicators such as the correlation coefficient and structural similarity index measure (SSIM) based on the similarity to the ground-truth data. The DnCNN model exhibited the best performance, and we confirmed that other models could also be applied to suppress the ground roll.

Natural Language Processing Model for Data Visualization Interaction in Chatbot Environment (챗봇 환경에서 데이터 시각화 인터랙션을 위한 자연어처리 모델)

  • Oh, Sang Heon;Hur, Su Jin;Kim, Sung-Hee
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.11
    • /
    • pp.281-290
    • /
    • 2020
  • With the spread of smartphones, services that want to use personalized data are increasing. In particular, healthcare-related services deal with a variety of data, and data visualization techniques are used to effectively show this. As data visualization techniques are used, interactions in visualization are also naturally emphasized. In the PC environment, since the interaction for data visualization is performed with a mouse, various filtering for data is provided. On the other hand, in the case of interaction in a mobile environment, the screen size is small and it is difficult to recognize whether or not the interaction is possible, so that only limited visualization provided by the app can be provided through a button touch method. In order to overcome the limitation of interaction in such a mobile environment, we intend to enable data visualization interactions through conversations with chatbots so that users can check individual data through various visualizations. To do this, it is necessary to convert the user's query into a query and retrieve the result data through the converted query in the database that is storing data periodically. There are many studies currently being done to convert natural language into queries, but research on converting user queries into queries based on visualization has not been done yet. Therefore, in this paper, we will focus on query generation in a situation where a data visualization technique has been determined in advance. Supported interactions are filtering on task x-axis values and comparison between two groups. The test scenario utilized data on the number of steps, and filtering for the x-axis period was shown as a bar graph, and a comparison between the two groups was shown as a line graph. In order to develop a natural language processing model that can receive requested information through visualization, about 15,800 training data were collected through a survey of 1,000 people. As a result of algorithm development and performance evaluation, about 89% accuracy in classification model and 99% accuracy in query generation model was obtained.