Search | Korea Science

A study on data scaling and feature selection techniques for XGBoost-based intrusion detection model (XGBoost 기반 침입탐지모델을 위한 데이터 스케일링 및 특성선택 기법 연구)

Kim, Young-Won; Lee, Soo-Jin
- Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.251-254 / 2022
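This paper proposes scaling and feature selection techniques to improve the performance of an intrusion detection model based on the XGBoost algorithm. Performing scaling and feature selection in the preprocessing stage of machine learning model development reduces the condition number of the dataset and can thereby improve model performance. Many techniques exist for each step, but previous studies did not compare and analyze the results of applying them; they only listed the results of applying one particular technique and did not present an optimal combination of scaling and feature selection. This paper therefore compares the results of applying various preprocessing techniques and proposes an optimal combination. In addition, whereas previous studies proposed preprocessing techniques applicable only to specific datasets, this paper proposes preprocessing techniques that apply across various datasets, thereby demonstrating the generality and real-world applicability of the proposed approach.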
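The abstract does not name the specific scaling or feature selection techniques compared, so the following is only a minimal sketch of one possible combination: a filter-style selection that ranks features by absolute Pearson correlation with the label, followed by standardization of the selected features. The function names, the toy records, and the choice of both techniques are assumptions for illustration, not the paper's method.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def standardize(column):
    """Rescale one feature column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0:
        return [0.0] * n  # a constant feature carries no information
    return [(x - mean) / std for x in column]

def select_and_scale(features, labels, k):
    """Keep the k features most correlated with the labels, then standardize them."""
    scores = {name: abs(pearson(col, labels)) for name, col in features.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return {name: standardize(features[name]) for name in top}

# Hypothetical toy traffic records (not from the paper); label 1 = attack.
features = {
    "duration": [1.0, 2.0, 1.5, 2.5],
    "bytes_sent": [100.0, 120.0, 900.0, 950.0],
    "protocol_id": [6.0, 6.0, 6.0, 6.0],
}
labels = [0, 0, 1, 1]
prepared = select_and_scale(features, labels, k=2)
```

The resulting columns could then be passed to any classifier; the constant `protocol_id` feature is dropped because its correlation score is zero.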

Preprocessing performance of convolutional neural networks according to characteristic of underwater targets (수중 표적 분류를 위한 합성곱 신경망의 전처리 성능 비교)

Kyung-Min, Park; Dooyoung, Kim
- The Journal of the Acoustical Society of Korea / v.41 no.6 / pp.629-636 / 2022
We present a preprocessing method for an underwater target detection model based on a convolutional neural network. The acoustic characteristics of ships are expressed ambiguously because of the strong signal power at low frequencies. To address this problem, we combine various feature scaling methods with various spectrogram methods. We define a simple convolutional neural network model and train it to measure the performance of each preprocessing combination. Experiments show that combining the log Mel-spectrogram with standardization and robust scaling gives the best classification performance.
https://doi.org/10.7776/ASK.2022.41.6.629
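The two scaling methods named in the abstract above can be sketched as follows. This is a minimal illustration using the usual textbook definitions (standardization with mean and standard deviation, robust scaling with median and interquartile range); the sample values are hypothetical and the paper's exact configuration is not given in the abstract.

```python
import statistics

def standardize(values):
    """Zero mean, unit variance. Assumes the values are not all identical."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR).

    Assumes the IQR is nonzero. Outliers barely move the median or the IQR,
    so the bulk of the data keeps a usable range.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]

# Hypothetical band energies with one dominant low-frequency value.
band_energy = [1.0, 2.0, 3.0, 4.0, 100.0]
z_scores = standardize(band_energy)
robust = robust_scale(band_energy)
```

With the single large value in `band_energy`, standardization squeezes the four small values close together, while robust scaling keeps them evenly spread and leaves only the outlier far from zero.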

A Study on Improvement of Speaker Identification with Time axis Scaling (시간축 스케일링에 의한 화자 식별 개선에 관한 연구)

정형교
- Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.123-126 / 1998
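Conventional speaker recognition systems based on DTW suffer from the main drawback of DTW, an excessive computational load. This paper therefore proposes a method for text-dependent speaker recognition in which, before DTW is performed for individual speakers characterized by their pitch distributions, a preprocessing step based on time-axis scaling first reduces the computation needed at recognition time, and DTW is then performed on the input signal only against the reduced set of reference patterns. In experiments, the proposed method reduced average processing time by 87.5% with almost no loss in recognition rate.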
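For context on why DTW is computationally heavy, here is a minimal sketch of the textbook dynamic time warping distance. It fills a cost table with one cell per pair of frames, so the work grows with the product of the two sequence lengths, which is the cost the preprocessing described above aims to reduce. The absolute-difference local cost and the scalar sequences are assumptions for illustration; the paper's features and its time-axis scaling step are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] and b[:j].
    The table has len(a) * len(b) interior cells, hence O(n * m) time.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# A time-stretched copy of the same contour aligns at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Comparing an input against fewer or shorter reference patterns shrinks the number of table cells directly, which is consistent with the reported reduction in processing time.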

Color Image Scaling Using Oblique Projection (경사 투영을 사용한 컬러 이미지 스케일링)

김준목; 정원용
- Proceedings of the Korea Institute of Convergence Signal Processing / 2000.12a / pp.53-56 / 2000
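In this paper, we apply the oblique projection method to color image scaling and compare its results with those of basic interpolation methods and least squares approximation. The oblique projection method gives results similar to the orthogonal projection method, which provides the minimum approximation error, while allowing freedom in the design of the prefilter; it is a more generalized form of interpolation. Compared with basic interpolation methods, it achieved better PSNR, and its results were similar to those of the least squares approximation method.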
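The comparison above relies on PSNR and on basic interpolation as a baseline. The sketch below shows the standard PSNR formula and a one-dimensional linear interpolation resize for a single row of one color channel. It does not implement oblique projection, and the function names and the 8-bit peak value of 255 are assumptions for illustration.

```python
import math

def resize_linear(row, new_len):
    """Resample one row of pixel values to new_len samples by linear interpolation."""
    if new_len == 1:
        return [float(row[0])]
    scale = (len(row) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(row) - 1)
        frac = pos - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test):
        raise ValueError("sequences must have the same length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)

upscaled = resize_linear([0.0, 10.0], 3)
```

For a color image, one common choice is to compute the mean squared error over all three channels together before applying the formula; a higher PSNR means the scaled image is closer to the reference.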

Comparison of environmental sound classification performance of convolutional neural networks according to audio preprocessing methods (오디오 전처리 방법에 따른 콘벌루션 신경망의 환경음 분류 성능 비교)

Oh, Wongeun
- The Journal of the Acoustical Society of Korea / v.39 no.3 / pp.143-149 / 2020
This paper presents the effect of the feature extraction methods used in audio preprocessing on the classification performance of Convolutional Neural Networks (CNNs). We extract the mel spectrogram, log mel spectrogram, Mel Frequency Cepstral Coefficients (MFCC), and delta MFCC from the UrbanSound8K dataset, which is widely used in environmental sound classification studies. We then scale the data to three distributions. Using the data, we test four CNNs, VGG16, and MobileNetV2 networks to assess performance according to the audio features and scaling. The highest recognition rate is achieved when the unscaled log mel spectrum is used as the audio feature. Although this result may not hold for all audio recognition problems, it is useful for classifying the environmental sounds included in UrbanSound8K.
https://doi.org/10.7776/ASK.2020.39.3.143

Application of Dimensional Expansion and Reduction to Earthquake Catalog for Machine Learning Analysis (기계학습 분석을 위한 차원 확장과 차원 축소가 적용된 지진 카탈로그)

Jang, Jinsu; So, Byung-Dal
- The Journal of Engineering Geology / v.32 no.3 / pp.377-388 / 2022
Recently, several studies have used machine learning to efficiently and accurately analyze seismic data, which are increasing exponentially. In this study, we expand earthquake information such as occurrence time, hypocentral location, and magnitude to produce a dataset for machine learning, and we reduce the dimension of the expanded data to its dominant features through principal component analysis. The dimensionally expanded data comprise statistics of the earthquake information from the Global Centroid Moment Tensor catalog, which contains 36,699 seismic events. We preprocess the data using standard and max-min scaling and extract dominant features from the scaled dataset with principal component analysis. The scaling methods significantly reduced the deviation of feature values caused by different units. Among them, the standard scaling method transforms the median of each feature with a smaller deviation than the other scaling methods. The six principal components extracted from the non-scaled dataset explain 99% of the original data. The sixteen principal components from the datasets to which standardization or max-min scaling was applied reconstruct 98% of the original datasets. These results indicate that more principal components are needed to preserve the original data information when feature values are evenly distributed. We propose a data processing method for efficient and accurate machine learning models that analyze the relationship between seismic data and seismic behavior.
https://doi.org/10.9720/kseg.2022.3.377
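One way to see why the unscaled dataset in the abstract above needs fewer principal components is to look at how the total variance is split across features. The total variance equals the sum of the PCA eigenvalues, and the first principal component explains at least as much variance as the single largest-variance feature. The sketch below compares per-feature variance shares before and after standardization; the two toy columns and their values are hypothetical, not taken from the catalog.

```python
import statistics

def standardize(column):
    """Zero mean, unit variance. Assumes the column is not constant."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]

def variance_shares(columns):
    """Fraction of the total variance contributed by each feature column."""
    variances = [statistics.pvariance(col) for col in columns]
    total = sum(variances)
    return [v / total for v in variances]

# Hypothetical features in very different units.
depth_km = [10.0, 50.0, 300.0, 600.0]
magnitude = [5.0, 5.5, 6.0, 6.5]

raw_shares = variance_shares([depth_km, magnitude])
scaled_shares = variance_shares([standardize(depth_km), standardize(magnitude)])
```

In the raw data the depth column carries almost all of the variance, so a single component would already explain nearly everything. After standardization each feature contributes equally, so more components are needed to reach the same explained fraction.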

Implementation of an Efficient Interpolation for CMOS Image Sensor (CMOS 이미지 센서용 효과적인 인터폴레이션 구현)

Lee, Dong-Hun; Sonh, Seung-Il
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.353-357 / 2005
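In this paper, we perform preprocessing that converts the Bayer data input format obtained from an image input device or camera image sensor into an image that can be viewed on a display device. The incoming Bayer data is first interpolated to obtain the R, G, and B values needed to represent each pixel of a color image. To reduce the amount of computation and the number of registers required, and to improve chip performance, we perform interpolation using 2×2 lines instead of the conventional 3×3 lines. Image scaling and interpolation of the Bayer data input are also performed at the same time. For the implementation, we use input data with an original image size of 640×480, check the image result by preprocessing in software, then apply the optimized algorithm in a hardware design written in VHDL and verify the data using ModelSim 6.0a.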
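The abstract does not give the exact 2×2 scheme, so the following is only a sketch of one simple way a 2×2 neighborhood can yield R, G, and B values while also scaling the image. It assumes an RGGB layout and collapses each 2×2 cell into one output pixel, which halves the width and height. The function name, the layout, and the averaging of the two green samples are all assumptions for illustration.

```python
def demosaic_2x2(bayer):
    """Convert an RGGB Bayer mosaic into RGB pixels, one per 2x2 cell.

    bayer is a list of rows of integer sensor values with even width and
    height. Assumed cell layout:  R G
                                  G B
    """
    height, width = len(bayer), len(bayer[0])
    if height % 2 or width % 2:
        raise ValueError("width and height must be even")
    rgb = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            r = bayer[y][x]
            g = (bayer[y][x + 1] + bayer[y + 1][x]) // 2  # average the two greens
            b = bayer[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb

single_cell = demosaic_2x2([[10, 20], [30, 40]])
vga_out = demosaic_2x2([[0] * 640 for _ in range(480)])
```

Under these assumptions a 640×480 mosaic produces a 320×240 color image, and each output pixel needs only the current and the previous sensor line, which is the kind of saving in line buffers that a 2×2 window offers over a 3×3 one.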

Autoscaling Mechanism based on Execution-times for VNFM in NFV Platforms (NFV 플랫폼에서 VNFM의 실행 시간에 기반한 자동 자원 조정 메커니즘)

Mehmood, Asif; Diaz Rivera, Javier; Khan, Talha Ahmed; Song, Wang-Cheol
- KNOM Review / v.22 no.1 / pp.1-10 / 2019
The process of determining the required number of resources depends on the factors being considered. Autoscaling is one such mechanism; it uses a wide range of factors to make its decisions and is a critical process in NFV. As networks are being shifted onto the cloud following the invention of SDN, better resource managers will be required in the future. To solve this problem, we propose a solution that allows VNFMs to autoscale system resources depending on factors such as the overhead of hyperthreading, the number of requests, and the execution times of the virtual network functions. It is well known that hyperthreaded virtual cores cannot fully match the performance of physical cores. Also, because different types of cores run at different frequencies, the number of cores needed must be calculated accurately and precisely. Platform independence is achieved by proposing another solution in the form of a monitoring microservice, which communicates through APIs. Hence, by using our autoscaling application and a monitoring microservice, we enhance the resource provisioning process to meet the criteria of future networks.
https://doi.org/10.22670/knom.2019.22.1.1

Performance Evaluation of Scaling based Dynamic Time Warping Algorithms for the Detection of Low-rate TCP Attacks (Low-rate TCP 공격 탐지를 위한 스케일링 기반 DTW 알고리즘의 성능 분석)

So, Won-Ho; Shim, Sang-Heon; Yoo, Kyoung-Min; Kim, Young-Chon
- Journal of the Institute of Electronics Engineers of Korea TC / v.44 no.3 s.357 / pp.33-40 / 2007
In this paper, the low-rate TCP attack, one of the shrew attacks, is considered and the scaling based dynamic time warping (S-DTW) algorithm is introduced. Because of its low average traffic rate, the low-rate TCP attack cannot be detected by the detection methods used for earlier flooding DoS/DDoS (Denial of Service/Distributed Denial of Service) attacks. It is, however, a periodic short burst that exploits the homogeneity of the minimum retransmission timeout (RTO) of TCP flows, so pattern matching mechanisms have been proposed to detect it among legitimate input flows. A DTW mechanism, one such detection approach, has been proposed to detect an attack input stream consisting of many legitimate or attack flows, and has shown a depending method as well. This approach, however, has the problem that a legitimate input stream may be caught as an attack stream. In addition, it is difficult to decide a threshold that separates legitimate streams from malicious ones. Thus, the causes of this problem are analyzed through simulation, and scaling by the maximum auto-correlation value is performed before computing the DTW. We also discuss the results of applying various scaling approaches and of using the standard deviation of the monitored input streams.
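The scaling step mentioned above, dividing by the maximum auto-correlation value before DTW, can be sketched as follows. This is a minimal illustration under the assumption that the auto-correlation sequence of the sampled traffic stream is what gets scaled; the function names and the toy burst patterns are hypothetical, and the paper's sampling and template details are not given in the abstract.

```python
def autocorrelation(signal):
    """Unnormalized auto-correlation of a mean-removed signal for lags 0..n-1."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]

def scale_by_max(values):
    """Divide by the largest absolute value so the peak becomes 1."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]

# Two hypothetical periodic burst streams with the same period but
# different burst heights (for example, different link loads).
weak_burst = [0, 0, 5, 0, 0, 5, 0, 0, 5]
strong_burst = [0, 0, 10, 0, 0, 10, 0, 0, 10]

weak_scaled = scale_by_max(autocorrelation(weak_burst))
strong_scaled = scale_by_max(autocorrelation(strong_burst))
```

After scaling, the two streams produce the same sequence, so a DTW comparison against an attack template would depend on the periodic shape rather than on the traffic volume, which is one way such scaling can make a detection threshold easier to choose.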

A Study on the Selection of Types of Social Disasters by Region (시·도별 사회재난 중점유형 선정에 관한 연구)

Lee, Hyo Jin; Yun, Hong Sic; Han, Hak
- Journal of the Society of Disaster Information / v.17 no.2 / pp.206-217 / 2021
Purpose: A recent series of large social disasters has prompted much research on preventing social disasters, as well as natural disasters, and on reducing their damage. This paper aims to select the types of social disasters that local governments should focus on and to create basic data for effective countermeasures and mitigation efforts. Method: Of the 43 disaster types announced by the Ministry of Public Administration and Security, 11 were selected and data on them collected in order to select the main disaster types, and risk types were derived by region using risk maps. To derive the risk maps, each detailed index was rescaled to the range 0-1 and weights were determined using the entropy technique. Result: About 41% of the results were consistent with the major disasters announced by the Ministry of Public Administration and Security; the remaining major types were disasters for which data could not be obtained or that have not occurred in the past 20 years. Conclusion: This study is intended to present the social disaster types each local government should focus on, in support of effective prevention and recovery plans for social disasters.
https://doi.org/10.15683/kosdi.2021.6.30.206
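The two steps named in the Method part above, rescaling each index to 0-1 and weighting by entropy, can be sketched with the commonly used entropy weight method. The indicator names and values below are hypothetical, and the abstract does not confirm that the study used exactly this formulation.

```python
import math

def min_max(column):
    """Rescale values to the range 0-1; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def entropy_weights(columns):
    """Entropy weight method over non-negative indicator columns.

    Each column holds one indicator across n regions (n > 1). Indicators
    whose values are spread unevenly across regions get larger weights.
    """
    n = len(columns[0])
    k = 1.0 / math.log(n)
    diversities = []
    for col in columns:
        total = sum(col)
        if total == 0:
            diversities.append(0.0)  # no information in an all-zero indicator
            continue
        entropy = -k * sum((v / total) * math.log(v / total) for v in col if v > 0)
        diversities.append(1.0 - entropy)
    spread = sum(diversities)
    if spread == 0:
        return [1.0 / len(columns)] * len(columns)  # fall back to equal weights
    return [d / spread for d in diversities]

# Hypothetical indicators for four regions.
fire_count = [12.0, 30.0, 45.0, 90.0]
traffic_deaths = [40.0, 42.0, 41.0, 43.0]

scaled = [min_max(fire_count), min_max(traffic_deaths)]
weights = entropy_weights(scaled)
risk = [sum(w * col[i] for w, col in zip(weights, scaled)) for i in range(4)]
```

Each region's composite `risk` value is a weighted sum of its rescaled indicators and stays within 0-1 because the weights sum to one.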
