• Title/Summary/Keyword: 시계열 분류 (time series classification)


IoT Malware Detection and Family Classification Using Entropy Time Series Data Extraction and Recurrent Neural Networks (엔트로피 시계열 데이터 추출과 순환 신경망을 이용한 IoT 악성코드 탐지와 패밀리 분류)

  • Kim, Youngho;Lee, Hyunjong;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.11 no.5 / pp.197-202 / 2022
  • IoT (Internet of Things) devices are attacked by malware because of security vulnerabilities such as weak IDs/passwords and unauthenticated firmware updates. However, the diversity of CPU architectures makes it difficult to set up a malware analysis environment and to design features. In this paper, we design time series features from the byte sequence of executable files so that the features are independent of CPU architecture, and analyze them using recurrent neural networks. The proposed feature is a fixed-length time series pattern extracted from the byte sequence by calculating partial entropy and applying linear interpolation. Temporal changes in the extracted feature are analyzed by RNN and LSTM. In the experiments, IoT malware detection achieved high performance, whereas malware family classification performed poorly. A visual comparison of the entropy patterns of each malware family showed that the Tsunami and Gafgyt families have similar patterns, which explains the low classification performance. LSTM is more suitable than RNN for learning temporal changes in the proposed malware features.
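The feature extraction described in this abstract (partial entropy over the byte sequence, then linear interpolation to a fixed length) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the non-overlapping window of 256 bytes, the output length of 64, and all function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def window_entropy(chunk):
    """Shannon entropy (bits per byte, between 0 and 8) of one byte window."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_series(data, window=256):
    """Entropy of consecutive non-overlapping windows (window size is assumed)."""
    return [window_entropy(data[i:i + window]) for i in range(0, len(data), window)]


def resample_linear(series, length):
    """Linearly interpolate a series onto `length` evenly spaced points."""
    if not series:
        return [0.0] * length
    if len(series) == 1 or length == 1:
        return [series[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(series) - 1) / (length - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def entropy_feature(data, window=256, length=64):
    """Fixed-length entropy time series for a byte string of any size."""
    return resample_linear(entropy_series(data, window), length)


# Hypothetical usage: a fixed-length feature from synthetic bytes.
sample = bytes(1000) + bytes(range(256)) * 4
feature = entropy_feature(sample)
```

Because the output length is fixed regardless of file size, such a feature could be fed to an RNN or LSTM as a sequence of equal length for every executable.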

Discrimination between trend and difference stationary processes based on adaptive lasso (Adaptive lasso를 이용하여 추세-정상시계열과 차분-정상시계열을 판별하는 방법에 대한 연구)

  • Na, Okyoung
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.723-738 / 2020
  • In this paper, we study a method to discriminate between trend stationary and difference stationary processes. Since a crucial ingredient of this discrimination is determining whether a unit root exists, a unit root testing strategy can be used. We therefore introduce a discrimination method based on unit root testing and propose an alternative method using the adaptive lasso. Our Monte Carlo simulation experiments show that the adaptive lasso improves the discrimination accuracy when the process is trend stationary, but has lower accuracy than the unit root strategy when the process is difference stationary.

Passive sonar signal classification using attention based gated recurrent unit (어텐션 기반 게이트 순환 유닛을 이용한 수동소나 신호분류)

  • Kibae Lee;Guhn Hyeok Ko;Chong Hyun Lee
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.345-356 / 2023
  • The target signal of a passive sonar shows narrow-band harmonic characteristics with intensity variation within a few seconds and long-term frequency variation due to the Lloyd's mirror effect. We propose a signal classification algorithm based on the Gated Recurrent Unit (GRU) that learns local and global time series features. The proposed algorithm implements a multi-layer network using GRUs and extracts local and global time series features via dilated connections. An attention mechanism is learned to weight the time series features and classify passive sonar signals. In experiments using public underwater acoustic data, the proposed network achieved a classification accuracy of 96.50 %, which is 4.17 % higher than that of the existing skip-connected GRU network.

Enhancing Classification Performance of Temporal Keyword Data by Using Moving Average-based Dynamic Time Warping Method (이동 평균 기반 동적 시간 와핑 기법을 이용한 시계열 키워드 데이터의 분류 성능 개선 방안)

  • Jeong, Do-Heon
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.83-105 / 2019
  • This study proposes an effective method for automatically classifying keywords with similar patterns by calculating the pattern similarity of temporal data. For this purpose, large-scale news data were collected from the Web and time series data composed of 120 time segments were built. To create a training data set for testing the performance of the proposed model, 440 representative keywords were manually classified into 8 types of trend. This study introduces the Dynamic Time Warping (DTW) method, which has been commonly used in time series analytics, and proposes an application model, MA-DTW, based on the Moving Average (MA) method, which explains the tendency of a trend curve well. In automatic classification with a k-Nearest Neighbor (kNN) algorithm, Euclidean Distance (ED) and DTW achieved maximum micro-averaged F1 scores of 48.2% and 66.6% respectively, whereas the proposed model achieved a best micro-averaged F1 score of 74.3%. In all respects of the comprehensive experiments, the proposed model outperformed ED and DTW.
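One way to picture the combination of moving average, DTW, and kNN named in this abstract is the sketch below. The abstract does not say how MA-DTW combines the two methods, so this example simply assumes that each series is smoothed with a moving average before the DTW distance is computed. The window size, the absolute-difference local cost, and all function names are likewise assumptions.

```python
from collections import Counter


def moving_average(series, window=3):
    """Simple moving average; returns len(series) - window + 1 values."""
    if window <= 1 or len(series) < window:
        return list(series)
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def ma_dtw(a, b, window=3):
    """Assumed MA-DTW: DTW distance between moving-average-smoothed series."""
    return dtw_distance(moving_average(a, window), moving_average(b, window))


def knn_predict(query, train, k=1, window=3):
    """Majority label among the k training series closest to the query.

    `train` is a list of (series, label) pairs.
    """
    ranked = sorted(train, key=lambda item: ma_dtw(query, item[0], window))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical usage with two toy trend types.
train = [([0, 1, 2, 3, 4, 5], "rising"), ([5, 4, 3, 2, 1, 0], "falling")]
label = knn_predict([0, 0, 1, 2, 3, 5], train, k=1)
```

Smoothing first reduces the influence of short spikes, so the warping path follows the overall trend curve rather than individual noisy points.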

Selection of a Mother Wavelet Using Wavelet Analysis of Time Series Data (시계열 자료의 웨이블릿 분석을 위한 모 웨이블릿의 선정문제)

  • Lee, Hyunwook;Song, Sunguk;Zhu, Ju Hua;Lee, Munseok;Yoo, Chulsang
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.259-259 / 2019
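  • When analyzing time series data, the data rarely satisfy stationarity. In particular, data from which seasonality has been removed often show periodicities that are difficult to quantify. That is, it is self-evident that a phenomenon occurring in one region affects other meteorological phenomena, but the relationship is very unlikely to be linear. Therefore, the relationship between them must be quantified with indices based on linearity. Various methods are used to address this problem, and this study uses wavelet analysis. Wavelet transforms are composed of a set of special functions and are used to analyze existing wavelet signals. The transform is a variant of the Fourier transform that converts a time series into frequencies using a specific basis function. In the wavelet transform, the basis function is called the mother wavelet, and frequencies are composed by shifting, dilating, and contracting it. Wavelet analysis enables time series analysis by decomposing and recombining the mother wavelet. There are many mother wavelet functions, including Haar, Daubechies, Coiflets, Symlets, Morlet, Mexican Hat, and Meyer, and the result differs depending on the mother wavelet. Previously, the Morlet wavelet was mainly used for frequency analysis. Time series data are broadly divided into white noise, long-term memory, and short-term memory. Arbitrary time series were generated for each type, and the characteristics of the mother wavelets were derived by wavelet analysis of those series. To determine the optimal mother wavelet for time series data, this study applied wavelet analysis to the Southern Oscillation Index (SOI) and the Arctic Oscillation Index (AOI). The analysis was based on how the results vary with the mother wavelet, and it was applied to time series classified by stationarity and persistence in order to determine the optimal mother wavelet. The optimal mother wavelet identified from the arbitrary time series was then applied to real time series such as the AOI and SOI. The study aimed to distinguish the types of time series data and to find the most suitable mother wavelet for the characteristics of the data.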


EEG Classification using Time-series Learning Algorithm (시계열 학습 알고리즘을 이용한 뇌파 자동 분류)

  • Kim, Jong-Hwan;Nam, Sang-Ha;Kim, In-Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.240-243 / 2013
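  • This paper proposes an effective method for the automatic classification of EEG data, based on the SVM and HMM algorithms, for robot-control applications. EEG data are collected with an Emotiv EPOC headset EEG measuring device, and features are extracted from the collected data using the FFT algorithm. We present experimental results for a one-stage classification method using the SVM algorithm and for a two-stage classification method in which the SVM classification results are used as the input sequence to an HMM, a time series learning algorithm.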

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.395-406 / 2017
  • The purpose of this study is to identify patterns of daily electricity demand through clustering and classification. The hourly data were collected by KPX (Korea Power Exchange) between 2008 and 2012. Because electricity demand data are time series data, the time trend was removed before identifying the daily demand patterns. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. The classification analysis was conducted to understand the relationship between the clusters and external factors: day of the week, holiday, and weather. The data were divided into training data and test data. The training data consisted of the external factors and cluster numbers between 2008 and 2011. The test data were the daily external factors in 2012. Decision tree, random forest, support vector machine, and naive Bayes classifiers were used. As a result, Gaussian model-based clustering and random forest showed the best prediction performance when the number of clusters was 8.

Classification of Precipitation Data Based on Smoothed Periodogram (평활된 주기도를 이용한 강수량자료의 군집화)

  • Park, Man-Sik;Kim, Hee-Young
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.547-560 / 2008
  • It is well known that the spectral density function determines the auto-covariance function of stationary time series data and that the smoothed periodogram is a consistent estimator of the spectral density function. Recently, Kim and Park (2007) showed that smoothed-periodogram-based distances perform very well for classification. In this paper, we introduce classification methods based on the smoothed periodogram and apply them to the monthly precipitation measurements obtained from January 1987 through December 2007 at 22 locations in South Korea.
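A distance built on smoothed periodograms, as mentioned in this abstract, might look like the sketch below. The abstract does not give the exact distance or smoothing kernel, so the equal-weight (Daniell-type) moving average, the Euclidean distance between log spectra, the small constant `eps`, and the function names are all assumptions made for illustration. A direct DFT is used to keep the example dependency-free, which is adequate only for short series.

```python
import cmath
import math


def periodogram(x):
    """Periodogram at Fourier frequencies k/n, k = 1..n//2, after mean removal."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    values = []
    for k in range(1, n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(centered))
        values.append(abs(s) ** 2 / n)
    return values


def smooth(values, half_width=1):
    """Equal-weight moving average over neighbors, truncated at the edges."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def spectral_distance(x, y, half_width=1, eps=1e-12):
    """Assumed distance: Euclidean distance between log smoothed periodograms.

    Both series must have the same length; eps avoids log(0).
    """
    px = smooth(periodogram(x), half_width)
    py = smooth(periodogram(y), half_width)
    return math.sqrt(sum((math.log(a + eps) - math.log(b + eps)) ** 2
                         for a, b in zip(px, py)))


# Hypothetical usage: two series with a 12-step cycle and one with a 6-step cycle.
a = [math.sin(2 * math.pi * t / 12) for t in range(48)]
b = [math.cos(2 * math.pi * t / 12) for t in range(48)]
c = [math.sin(2 * math.pi * t / 6) for t in range(48)]
d_same_cycle = spectral_distance(a, b)
d_other_cycle = spectral_distance(a, c)
```

Because the periodogram discards phase, two series with the same cycle length but different timing end up close together, which is the property that makes spectral distances attractive for grouping stations by their seasonal behavior.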

Classification of Transport Vehicle Noise Events in Magnetotelluric Time Series Data in an Urban area Using Random Forest Techniques (Random Forest 기법을 이용한 도심지 MT 시계열 자료의 차량 잡음 분류)

  • Kwon, Hyoung-Seok;Ryu, Kyeongho;Sim, Ickhyeon;Lee, Choon-Ki;Oh, Seokhoon
    • Geophysics and Geophysical Exploration / v.23 no.4 / pp.230-242 / 2020
  • We performed a magnetotelluric (MT) survey to delineate the geological structures below the depth of 20 km in the Gyeongju area, where an earthquake with a magnitude of 5.8 occurred in September 2016. The measured MT data were severely distorted by electrical noise caused by subways, power lines, factories, houses, and farmlands, and by vehicle noise from passing trains and large trucks. Using machine-learning methods, we classified the MT time series data obtained near the railway and highway into two groups according to whether they contained traffic noise. We applied three schemes (stochastic gradient descent, support vector machine, and random forest) to the time series data for the high-speed train noise. We formulated three datasets (Hx, Hy, and Hx & Hy) for the time series data of the large truck noise and applied the random forest method to each dataset. To evaluate the effect of removing the traffic noise, we compared the time series data, amplitude spectra, and apparent resistivity curves before and after the removal. We also examined the frequency range affected by traffic noise and whether artifact noise occurred during the traffic noise removal process as a result of the residual difference.

A Rule-Based Image Classification Method for Analysis of Urban Development in the Capital Area (수도권 도시개발 분석을 위한 규칙기반 영상분류)

  • Lee, Jin-A;Lee, Sung-Soon
    • Spatial Information Research / v.19 no.6 / pp.43-54 / 2011
  • This study proposes a rule-based image classification method for the time series analysis of changes in the land surface of the Seongnam-Yongin area using satellite image data from 2000 to 2009. In order to identify the change patterns during each period, 11 classes were defined according to statistical/mathematical rules. A generalized algorithm was used so that the rules could be applied to an unsupervised classification method that does not require any training sites. The results showed that the urban area of the target region increased by 145% due to housing-site development. The image data from 2009 had a classification accuracy of 98%. To verify the method, the results were compared with land-cover changes obtained through post-classification comparison. Making maximum use of the available data within multiple images and optimizing the classification improved the classification accuracy. The proposed rule-based image classification method is expected to be widely used for the time series analysis of images, to produce thematic maps of urban development, and to monitor urban development and environmental change.