• Title/Abstract/Keywords: data pre-processing

Search results: 809 items (processing time: 0.025 s)

High Rate Denial-of-Service Attack Detection System for Cloud Environment Using Flume and Spark

  • Gutierrez, Janitza Punto;Lee, Kilhung
    • Journal of Information Processing Systems
    • /
    • Vol. 17, No. 4
    • /
    • pp.675-689
    • /
    • 2021
  • Cloud computing is being adopted by more and more organizations. However, because cloud computing is virtualized, volatile, scalable, and multi-tenant by nature, performing attack detection in the cloud with conventional processes is a challenging task. This work proposes a solution that collects web server logs with Flume and filters them through Spark Streaming, so that only suspicious data or data related to denial-of-service attacks is kept, which reduces the data stored in the Hadoop Distributed File System for later analysis with the frequent pattern (FP)-Growth algorithm. The proposed system addresses some of the security difficulties of cloud environments by facilitating data collection and reducing detection time, which enables almost real-time attack detection.
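
The filtering idea in this abstract (keep only log records that look like high-rate traffic) can be illustrated with a minimal sketch in plain Python. It is not the authors' Flume/Spark pipeline: the function name `flag_high_rate_ips`, the 10-second window, and the 5-request threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def flag_high_rate_ips(events, window_seconds=10, max_requests=5):
    """Return the set of IPs that exceed max_requests inside a sliding window.

    events: iterable of (timestamp_seconds, ip) tuples sorted by timestamp.
    The window length and threshold are illustrative assumptions.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while window and ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(ip)
    return flagged


# Hypothetical events: one client sends 8 requests in 4 seconds,
# another sends 2 requests spread over 8 seconds.
events = [(i * 0.5, "10.0.0.1") for i in range(8)]
events += [(1.0, "10.0.0.2"), (9.0, "10.0.0.2")]
events.sort()

suspicious = flag_high_rate_ips(events)
print(suspicious)
```

Only records from flagged clients would then need to be stored for later pattern mining, which is the data-reduction effect the abstract describes.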

드론 스트리밍 영상 이미지 분석을 통한 실시간 산불 탐지 시스템 (Forest Fire Detection System using Drone Streaming Images)

  • Yoosin Kim
    • 한국항행학회논문지
    • /
    • Vol. 27, No. 5
    • /
    • pp.685-689
    • /
    • 2023
  • The system proposed in this study aims to detect forest fires in real-time stream data received from a drone camera. The number of wildfires has recently been increasing, and large-scale wildfires are becoming more frequent. To prevent forest fire damage, many experiments using drone cameras and vision analysis are being conducted; however, detecting forest fires from the real-time streaming data of a flying drone poses many challenges, such as network speed, pre-processing, and model performance. This study therefore applied image data processing to capture five good image frames for vision analysis from the whole streaming data and then developed an object detection model based on YOLO_v2. As a result, the classification model reached up to 93% accuracy on forest fire images, and the field test for model verification detected forest fires with about 70% accuracy.

심층신경망을 이용한 소스 코드 원작자 식별 (Source Code Identification Using Deep Neural Network)

  • 임지수
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 8, No. 9
    • /
    • pp.373-378
    • /
    • 2019
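  • Because programming source code is now openly available online, problems of indiscriminate plagiarism and copyright infringement are arising. Source code written repeatedly by the same author can carry a unique fingerprint, owing to the nature of programming. This paper trains a deep neural network on Google Code Jam program sources to distinguish their individual authors. The authors' sources are vectorized with preprocessors such as prediction-based vectors or TF-IDF, a frequency-based approach, and a deep neural network is then trained to identify the original author of each program source. Using these preprocessors, we built a language-independent learning system and compared it with other existing learning methods. Among them, the model combining TF-IDF with a deep neural network was confirmed to perform better than models using other preprocessors or other learning methods.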
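
As a rough illustration of TF-IDF used as a language-independent preprocessor, the sketch below turns source snippets into vectors in plain Python. The tokenizer, the unsmoothed weighting (`tf = count / length`, `idf = log(N / df)`), and the two sample snippets are assumptions; the paper's exact preprocessing is not given in the abstract.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using tf = count/len(doc), idf = log(N/df)."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1
        vectors.append([
            (counts[tok] / total) * math.log(n_docs / doc_freq[tok])
            for tok in vocabulary
        ])
    return vocabulary, vectors


# Two hypothetical snippets standing in for programs by different authors.
samples = [
    "for (int i = 0; i < n; i++) sum += a[i];",
    "while (k < n) { total = total + a[k]; k++; }",
]
vocab, vecs = tfidf_vectors(samples)
```

Tokens shared by every snippet (such as `n`) get zero weight, while author-specific names (such as `sum` or `total`) get positive weight, which is why such vectors can serve as input features for a classifier.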

선체의 태양복사 열변형 해석을 위한 전처리시스템 (A System for Thermal Distortion Analysis of Hull Structures by Solar Radiation)

  • 하윤석;이동훈
    • 대한조선학회논문집
    • /
    • Vol. 53, No. 4
    • /
    • pp.275-281
    • /
    • 2016
  • Accuracy control is one of the most important quality factors in meeting a ship-production schedule. A ship is assembled by welding throughout the whole production process, so it is important to minimize losses from correction work by using engineering techniques such as reverse design, reverse setting, and margins for thermal shrinkage. These efforts are quite effective in the fabrication stages, but not in the erection stages. If a ship block made of common steel is exposed to directional solar radiation, its dimensions change considerably over time according to its thermal expansion coefficient. Measurement work is therefore often done at dawn or in the evening, even when a very accurate device is available. In this study, an FE analysis method is developed to solve this problem. It converts measured data affected by solar thermal distortion into data free of that effect, even when the ship block is measured at an arbitrary time. It uses the time of measurement, the orientation of the block, and satellite weather records. The method is verified by comparing measured data from a ship block with the results of the suggested analysis. Furthermore, a pre-processing system is developed for fast application of the suggested analysis method.

A Design and Implementation of Missing Person Identification System using face Recognition

  • Shin, Jong-Hwan;Park, Chan-Mi;Lee, Heon-Ju;Lee, Seoung-Hyeon;Lee, Jae-Kwang
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 26, No. 2
    • /
    • pp.19-25
    • /
    • 2021
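  • This paper proposes a method for identifying missing persons through vision technology and deep-learning-based face recognition. The original image sent from a mobile device is pre-processed to suit face recognition, and the missing person is then recognized through image data augmentation, which improves recognition accuracy, and CNN-based face training and verification. When hypothetical missing-person images were identified with our implementation, the model trained on both the original data and blurred data performed best. In addition, the model trained with pretrained weights outperformed the model trained without them, but it showed the limitation of high bias and variance.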

Effective Pre-rating Method Based on Users' Dichotomous Preferences and Average Ratings Fusion for Recommender Systems

  • Cheng, Shulin;Wang, Wanyan;Yang, Shan;Cheng, Xiufang
    • Journal of Information Processing Systems
    • /
    • Vol. 17, No. 3
    • /
    • pp.462-472
    • /
    • 2021
  • With an increase in the scale of recommender systems, users' rating data tend to be extremely sparse. Some methods have been utilized to alleviate this problem; nevertheless, it has not been satisfactorily solved yet. Therefore, we propose an effective pre-rating method based on users' dichotomous preferences and average ratings fusion. First, based on a user-item ratings matrix, a new user-item preference matrix was constructed to analyze and model user preferences. The items were then divided into two categories based on a parameterized dynamic threshold. The missing ratings for items that the user was not interested in were directly filled with the lowest user rating; otherwise, fusion ratings were utilized to fill the missing ratings. Further, an optimized parameter λ was introduced to adjust their weights. Finally, we verified our method on a standard dataset. The experimental results show that our method can effectively reduce the prediction error and improve the recommendation quality. As for its application, our method is effective, but not complicated.
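
The filling rule described in this abstract can be sketched as follows. The abstract does not give the fusion formula or the threshold rule, so this sketch assumes the fused rating is `lam * user_average + (1 - lam) * item_average` and takes the interested/not-interested split as a precomputed input; `prefill_ratings` and the sample matrices are hypothetical.

```python
def prefill_ratings(ratings, interested, lam=0.5):
    """Fill missing ratings (None) in a user-item matrix.

    ratings: list of rows, one per user; None marks a missing rating.
             Each user is assumed to have at least one known rating.
    interested: same shape; True if the user is assumed to like the item.
    lam: weight on the user's average in the fused rating (assumed form).
    """
    n_items = len(ratings[0])
    # Average rating of each item over the users who rated it.
    item_avgs = []
    for j in range(n_items):
        col = [row[j] for row in ratings if row[j] is not None]
        item_avgs.append(sum(col) / len(col) if col else None)

    filled = []
    for u, row in enumerate(ratings):
        known = [r for r in row if r is not None]
        user_avg = sum(known) / len(known)
        user_min = min(known)
        new_row = []
        for j, r in enumerate(row):
            if r is not None:
                new_row.append(r)  # keep observed ratings unchanged
            elif not interested[u][j]:
                new_row.append(user_min)  # uninteresting item: lowest user rating
            else:
                # Fall back to the user's average if nobody rated the item.
                item_avg = item_avgs[j] if item_avgs[j] is not None else user_avg
                new_row.append(lam * user_avg + (1 - lam) * item_avg)
        filled.append(new_row)
    return filled


ratings = [
    [5, None, 3],
    [4, 2, None],
]
interested = [
    [True, False, True],
    [True, False, True],
]
filled = prefill_ratings(ratings, interested, lam=0.5)
print(filled)
```

In this toy input, the first user's missing rating is filled with that user's lowest rating (3), and the second user's is filled with the fused value of the user average (3.0) and the item average (3.0).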

한국 남성의 고혈압에 대한 특징 선택 기반 위험 예측 (Feature selection-based Risk Prediction for Hypertension in Korean men)

  • 홍고르출;김미혜
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2021년도 춘계학술발표대회
    • /
    • pp.323-325
    • /
    • 2021
  • In this article, we improve hypertension prediction by applying a feature selection method to the Korean national health data known as the KNHANES database. The study identified a variety of risk factors associated with chronic hypertension. The paper is divided into two modules. The first is a data pre-processing step that applies a factor analysis (FA) based feature selection method to the dataset. The second applies a predictive analysis step to detect and predict hypertension risk. We compare the mean standard error (MSE), F1-score, and area under the ROC curve (AUC) of each classification model. The test results show that the proposed FIFA-OE-NB algorithm achieves an MSE, F1-score, and AUC of 0.259, 0.460, and 64.70%, respectively. These results demonstrate that the proposed FIFA-OE method outperforms other models for hypertension risk prediction.

Investigation of light stimulated mouse brain activation in high magnetic field fMRI using image segmentation methods

  • Kim, Wook;Woo, Sang-Keun;Kang, Joo Hyun;Lim, Sang Moo
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 21, No. 12
    • /
    • pp.11-18
    • /
    • 2016
  • Magnetic resonance imaging (MRI) is widely used in brain research and medical imaging. In particular, functional magnetic resonance imaging (fMRI), a non-invasive technique for imaging brain activation, is used in brain studies. In this study, we investigate brain activation caused by LED light stimulation. To investigate brain activation in a small experimental animal, we used a high-field 9.4T MRI scanner. The experimental animal was a Balb/c mouse, and fMRI was performed with echo planar imaging (EPI). EPI takes less time than other MRI methods; for that reason, however, EPI data have low contrast, which makes image pre-processing difficult and inaccurate. We planned a study protocol known in fMRI research as a block design. The block design has 8 LED light stimulation sessions and 8 rest sessions. Each block consists of 6 EPI images, and acquiring one slice of an EPI image takes 16 seconds. During a light session, we applied LED light stimulation for 1 minute 36 seconds. During a rest session, we applied no light stimulation and kept the light off for 1 minute 36 seconds. These sessions repeat over the whole EPI scan, so the total EPI scan time is almost 26 minutes. After acquiring the EPI data, we analyzed the image data using statistical parametric mapping (SPM) software and performed image pre-processing such as realignment, co-registration, normalization, and smoothing of the EPI data. The pre-processing of fMRI data requires segmentation using this software. However, there are 3 different segmentation methods: Gaussian nonparametric, warped modulate, and tissue probability map. We applied these 3 methods and compared how they change the results of fMRI analysis. The results show that LED light stimulation activated the superior colliculus region of the mouse brain, and the highest activation value was obtained with the tissue probability map segmentation method. This study may help to improve brain activation studies using EPI and SPM analysis.
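
The timing figures in this abstract are mutually consistent, as a short calculation shows. The constant names are illustrative; the numbers come from the abstract.

```python
SECONDS_PER_IMAGE = 16   # acquisition time for one EPI image
IMAGES_PER_BLOCK = 6     # EPI images in each block
STIMULATION_BLOCKS = 8   # LED light sessions
REST_BLOCKS = 8          # light-off sessions

block_seconds = SECONDS_PER_IMAGE * IMAGES_PER_BLOCK
total_seconds = block_seconds * (STIMULATION_BLOCKS + REST_BLOCKS)

block_min, block_sec = divmod(block_seconds, 60)
total_minutes = total_seconds / 60

print(f"one block: {block_min} min {block_sec} s")   # one block: 1 min 36 s
print(f"full scan: {total_minutes:.1f} min")         # full scan: 25.6 min
```

Six 16-second images give the 1 minute 36 seconds per session, and 16 sessions give 25.6 minutes, which matches the "almost 26 minutes" reported.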

Rough Set Theory와 Support Vector Machine 알고리즘을 이용한 RSIDS 설계 (A Design of RSIDS using Rough Set Theory and Support Vector Machine Algorithm)

  • 이병관;정은희
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 17, No. 12
    • /
    • pp.179-185
    • /
    • 2012
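  • In this paper, we design RSIDS (RST and SVM based Intrusion Detection System) using RST (Rough Set Theory) and the SVM (Support Vector Machine) algorithm. RSIDS consists of a PrePro (Preprocessing) module, an RRG (RST based Rule Generation) module, and an SAD (SVM based Attack Detection) module. The PrePro module converts the collected information into the RSIDS data format. The RRG module analyzes attack data to generate attack rules, uses those rules to extract attack information from large volumes of data, and passes the extracted attack information to the SAD module. The SAD module uses the extracted attack information to detect attacks and notifies the administrator. As a result, compared with the existing SVM, RSIDS raised the average attack detection rate from 77.71% to 85.28% and lowered the average FPR from 13.25% to 9.87%. RSIDS can therefore be said to improve on the existing SVM-based attack detection technique.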
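
For readers unfamiliar with the two metrics, the sketch below computes them from confusion-matrix counts and then the size of the reported changes. The abstract does not define its metrics, so the standard definitions (detection rate as TP / (TP + FN), FPR as FP / (FP + TN)) are assumed, and the counts are hypothetical.

```python
def detection_rate(true_positives, false_negatives):
    """Share of actual attacks that were detected (assumed standard definition)."""
    return true_positives / (true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    """Share of normal records wrongly flagged as attacks."""
    return false_positives / (false_positives + true_negatives)


# Hypothetical counts, not taken from the paper.
tp, fn, fp, tn = 850, 150, 100, 900
dr = detection_rate(tp, fn)
fpr = false_positive_rate(fp, tn)

# Changes reported in the abstract, in percentage points.
dr_gain = 85.28 - 77.71
fpr_drop = 13.25 - 9.87
print(f"detection rate {dr:.2%}, FPR {fpr:.2%}")
print(f"reported gain {dr_gain:.2f} points, FPR drop {fpr_drop:.2f} points")
```

The reported figures correspond to a gain of about 7.57 percentage points in detection rate and a drop of about 3.38 points in FPR.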

New Medical Image Fusion Approach with Coding Based on SCD in Wireless Sensor Network

  • Zhang, De-gan;Wang, Xiang;Song, Xiao-dong
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 10, No. 6
    • /
    • pp.2384-2392
    • /
    • 2015
  • The technical development and practical application of big data for health is a hot topic within big-data research, and big-data medical image fusion is one of its key problems. This paper proposes a new fusion approach with coding based on the Spherical Coordinate Domain (SCD) in a Wireless Sensor Network (WSN) for big-data medical images. In this approach, the three high-frequency coefficients of a medical image in the wavelet domain are pre-processed, which reduces the redundancy ratio of big-data medical images. First, the high-frequency coefficients are transformed to the spherical coordinate domain to reduce the correlation within the same scale. Then, a multi-scale model product (MSMP) is used to control the shrinkage function so that small wavelet coefficients and some noise are removed. The high-frequency parts in the spherical coordinate domain are coded with an improved SPIHT algorithm. Finally, the medical images are fused and reconstructed based on their multi-scale edges. Experimental results indicate that the novel approach is effective and very useful for the transmission of big-data medical images, especially in wireless environments.
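
The first step, mapping three high-frequency coefficients to spherical coordinates, can be sketched as below. Treating the three coefficients at one position as a Cartesian triple `(x, y, z)` and using the polar-angle convention shown are assumptions; the abstract does not say which subband maps to which axis.

```python
import math


def to_spherical(x, y, z):
    """Map a coefficient triple to (r, theta, phi): radius, polar angle, azimuth."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return 0.0, 0.0, 0.0
    # Clamp to guard against rounding pushing the ratio outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    return r, theta, phi


def from_spherical(r, theta, phi):
    """Inverse transform back to the Cartesian triple."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


# Hypothetical coefficient triple at one wavelet position.
coeffs = (1.0, 2.0, 2.0)
r, theta, phi = to_spherical(*coeffs)
restored = from_spherical(r, theta, phi)
```

The radius `r` gathers the combined magnitude of the three coefficients, so a shrinkage rule can act on one value per position while the two angles preserve the relative proportions.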