Title/Summary/Keyword: 러닝센터

A Study on Dynamic Resource Management Based on K-Means Clustering in Cloud Computing (K-Means Clustering 알고리즘 기반 클라우드 동적 자원 관리 기법에 관한 연구)

  • Kwak, Minki;Yu, Heonchang
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.107-110 / 2021
  • The global public cloud industry is growing explosively every year, and it continues to expand with the spread of contactless culture driven by COVID-19. To provide high-quality IT services to many users with finite infrastructure resources, it is important for cloud providers to minimize the allocation of surplus resources. However, because typical public cloud environments adopt static resource allocation, surplus resources inevitably arise from users' subjective sizing decisions. In this paper, we propose a dynamic cloud resource management scheme that applies the K-Means clustering algorithm, a machine learning technique. Based on K-Means clustering, the resource utilization data of each instance deployed in the cloud is analyzed, and resource optimization is performed on the cluster to which each instance belongs. This minimizes surplus resources across the entire data center while guaranteeing the SLA level and service continuity.
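
As a rough illustration of the approach described above, the sketch below clusters per-instance resource-utilization vectors with K-Means and flags low-utilization clusters as right-sizing candidates. The data shape, the 30% threshold, and all variable names are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch: cluster instances by resource-usage profile, then flag
# clusters whose mean utilization is low as right-sizing candidates.
# Data shapes and the 30% threshold are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical input: one row per instance, columns = mean CPU% and memory%.
usage = np.array([
    [12.0, 20.0], [15.0, 18.0],   # lightly loaded instances
    [70.0, 65.0], [75.0, 60.0],   # heavily loaded instances
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(usage)

for c in range(kmeans.n_clusters):
    members = usage[kmeans.labels_ == c]
    mean_cpu = members[:, 0].mean()
    # Illustrative policy: clusters averaging below 30% CPU are candidates
    # for reclaiming over-allocated resources.
    action = "downsize candidates" if mean_cpu < 30 else "keep allocation"
    print(f"cluster {c}: {len(members)} instances, mean CPU {mean_cpu:.1f}% -> {action}")
```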

A Deep Learning Application for Automated Feature Extraction in Transaction-based Machine Learning (트랜잭션 기반 머신러닝에서 특성 추출 자동화를 위한 딥러닝 응용)

  • Woo, Deock-Chae;Moon, Hyun Sil;Kwon, Suhnbeom;Cho, Yoonho
    • Journal of Information Technology Services / v.18 no.2 / pp.143-159 / 2019
  • Machine learning (ML) fits given data to a mathematical model to derive insights or make predictions. In the age of big data, where the amount of available data grows exponentially with the development of information technology and smart devices, ML achieves high prediction performance through unbiased pattern detection. Feature engineering, which generates the features that explain the problem to be solved, has a great influence on performance in the ML process, and its importance is continuously emphasized. Despite this importance, it is still considered a difficult task, as it requires a thorough understanding of the domain and the source data along with an iterative procedure. We therefore propose methods that apply deep learning to reduce the complexity and difficulty of feature extraction and to improve the performance of ML models. The most common reason for the superior performance of deep learning on complex unstructured data is that, unlike other techniques, it can extract features from the source data itself. To bring this advantage to business problems, we propose deep learning based methods that automatically extract features from transaction data or directly predict and classify target variables. In particular, exploiting the structural similarity between transaction data and text data, we applied techniques that show high performance in text processing, and we verified the suitability of each method according to the characteristics of the transaction data. Our study not only explores the possibility of automated feature extraction but also provides a benchmark model that attains a certain level of performance before a human performs the feature extraction task. It is also expected to provide guidelines for choosing a suitable deep learning model based on the business problem and the data characteristics.
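
To make the transaction-as-text idea concrete, here is a minimal sketch in which a transaction is encoded as a sequence of item IDs and a small recurrent encoder's final hidden state serves as the automatically extracted feature vector. The vocabulary size, dimensions, and toy batch are assumptions, not the authors' architecture.

```python
# Minimal sketch: treat a transaction as a "sentence" of item IDs and learn
# an embedding-based encoder whose final hidden state is the automatically
# extracted feature vector. All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

NUM_ITEMS, EMB_DIM, FEAT_DIM = 1000, 32, 16

class TransactionEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(NUM_ITEMS, EMB_DIM, padding_idx=0)
        self.gru = nn.GRU(EMB_DIM, FEAT_DIM, batch_first=True)

    def forward(self, item_ids):            # item_ids: (batch, seq_len)
        h, _ = self.gru(self.emb(item_ids)) # (batch, seq_len, FEAT_DIM)
        return h[:, -1, :]                  # last state = transaction feature

encoder = TransactionEncoder()
batch = torch.randint(1, NUM_ITEMS, (4, 10))  # 4 toy transactions of 10 items
features = encoder(batch)                     # (4, 16) learned feature vectors
print(features.shape)
```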

Proposed TATI Model for Predicting the Traffic Accident Severity (교통사고 심각 정도 예측을 위한 TATI 모델 제안)

  • Choo, Min-Ji;Park, So-Hyun;Park, Young-Ho
    • KIPS Transactions on Software and Data Engineering / v.10 no.8 / pp.301-310 / 2021
  • The TATI (Traffic Accident Text to RGB Image) model is the methodology proposed in this paper for predicting the severity of traffic accidents. Traffic fatalities are decreasing every year, but Korea still ranks near the bottom among OECD members. Many studies have been conducted to reduce the traffic accident death rate, and among them, studies that predict accident severity to reduce incidence and mortality have steadily continued; recently, research predicting severity with statistical models and deep learning models has been active. In this paper, the traffic accident dataset is converted into color images, and severity is predicted with CNN models. For performance comparison, we trained other models on the same data and compared their predictions with those of the proposed model. Across 10 experiments, we compared the accuracy and error range of four deep learning models. The results show that the proposed model achieved the highest accuracy, 0.85, and the second smallest error range, 0.03, confirming its superior performance.
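
A minimal sketch of the text-to-image conversion step, assuming a simple tiling layout rather than the paper's exact TATI encoding: accident-record features are normalized to [0, 1], tiled into a 3-channel image, and passed to a small CNN with three assumed severity classes.

```python
# Minimal sketch: scale accident-record features, tile them into a small
# 3-channel image, and feed a CNN classifier. Image size, feature count,
# and the network are illustrative assumptions, not the paper's layout.
import torch
import torch.nn as nn

def record_to_image(features, side=8):
    """Map a 1-D feature vector to a (3, side, side) float image."""
    x = torch.as_tensor(features, dtype=torch.float32)
    x = (x - x.min()) / (x.max() - x.min() + 1e-8)   # normalize to [0, 1]
    img = x.repeat((3 * side * side) // x.numel() + 1)[: 3 * side * side]
    return img.reshape(3, side, side)

cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 8 * 8, 3),   # 3 severity classes (assumed)
)

img = record_to_image([0.2, 5.0, 1.0, 3.0, 7.0, 2.0]).unsqueeze(0)
print(cnn(img).shape)   # (1, 3) severity logits
```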

Detecting Similar Designs Using Deep Learning-based Image Feature Extracting Model (딥러닝 기반 이미지 특징 추출 모델을 이용한 유사 디자인 검출에 대한 연구)

  • Lee, Byoung Woo;Lee, Woo Chang;Chae, Seung Wan;Kim, Dong Hyun;Lee, Choong Kwon
    • Smart Media Journal / v.9 no.4 / pp.162-169 / 2020
  • Design is a key factor that determines the competitiveness of products in the textile and fashion industry, so measuring the similarity of a proposed design is very important for preventing unauthorized copying and confirming originality. In this study, a deep learning technique was used to quantify features from images of textile designs, and similarity was measured using Spearman correlation coefficients. To verify that similar samples were actually detected, 300 images were randomly rotated and color-changed, and we checked whether the rotated or color-changed samples appeared among the Top-3 and Top-5 results ranked by similarity. The VGG-16 model recorded significantly higher performance than AlexNet. For rotated images, VGG-16 achieved its highest scores of 64% in the Top-3 and 73.67% in the Top-5; for color-changed images, it reached 86.33% and 90%, respectively.
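
The feature-extraction-plus-Spearman pipeline can be sketched as follows, using a pretrained torchvision VGG-16 and scipy's spearmanr. The file names are placeholders, and the preprocessing is the standard ImageNet transform rather than necessarily the authors' exact setup.

```python
# Minimal sketch: extract a feature vector per design image with pretrained
# VGG-16, then score pairwise similarity with Spearman's rank correlation.
import torch
from torchvision import models, transforms
from scipy.stats import spearmanr
from PIL import Image

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def extract(path):
    x = prep(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        f = vgg.features(x)               # convolutional feature maps
    return torch.flatten(f, 1).squeeze(0).numpy()

a, b = extract("design_a.png"), extract("design_b.png")  # placeholder files
rho, _ = spearmanr(a, b)
print(f"Spearman similarity: {rho:.3f}")
```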

Approach to Improving the Performance of Network Intrusion Detection by Initializing and Updating the Weights of Deep Learning (딥러닝의 가중치 초기화와 갱신에 의한 네트워크 침입탐지의 성능 개선에 대한 접근)

  • Park, Seongchul;Kim, Juntae
    • Journal of the Korea Society for Simulation / v.29 no.4 / pp.73-84 / 2020
  • As the Internet became popular, hacking and attacks on networks and systems emerged, and as the techniques evolved day by day they placed growing risks and burdens on companies and society. To alleviate that risk and burden, hacking and attacks must be detected early and handled appropriately, which in turn requires reliable network intrusion detection. This study applied weight initialization and weight optimization to the KDD'99 dataset to improve the accuracy of network intrusion detection. Regarding weight initialization, experiments showed that initialization methods tied to the structure of weight learning, such as the Xavier and He methods, affect accuracy. Regarding weight optimization, experiments on the network intrusion detection dataset confirmed that the Adam algorithm, which combines the advantages of Momentum, reflecting previous updates, and RMSProp, adapting the learning rate per weight, stands out in terms of accuracy.
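
A minimal sketch of the initialization/optimizer pairing the study examines: an MLP over KDD'99's 41 features with He (kaiming) initialization and Adam. The layer sizes and the 5-class output (normal plus four attack families) are assumptions for illustration.

```python
# Minimal sketch: an intrusion-detection MLP whose weights are explicitly
# He-initialized (suited to ReLU) and trained with Adam. Layer sizes and
# the 5-class output are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(41, 64), nn.ReLU(),   # 41 = KDD'99 feature count
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 5),               # normal + 4 attack families
)

for m in model:
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")  # He init
        nn.init.zeros_(m.bias)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Momentum + RMSProp ideas
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(8, 41), torch.randint(0, 5, (8,))  # toy batch
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```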

Proactive Virtual Network Function Live Migration using Machine Learning (머신러닝을 이용한 선제적 VNF Live Migration)

  • Jeong, Seyeon;Yoo, Jae-Hyoung;Hong, James Won-Ki
    • KNOM Review / v.24 no.1 / pp.1-12 / 2021
  • VM (Virtual Machine) live migration is a server virtualization technique for moving a running VM to another server node while minimizing downtime of the service the VM provides. In cloud data centers, VM live migration is widely used to balance CPU load and network traffic, to reduce electricity consumption by consolidating active VMs onto specific groups of servers, and to provide uninterrupted service during hardware maintenance and software updates on servers. It is also critical to use VM live migration as a prevention or mitigation measure when indications of a possible failure are detected or predicted. In this paper, we propose two VNF live migration methods: one for predictive load balancing and the other as a proactive measure against failure. Both rely on machine learning models that learn from periodically monitored resource usage data and logs from servers and VMs/VNFs. We apply the second method to a vEPC (Virtual Evolved Packet Core) failure scenario as a detailed case study.
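
A minimal sketch of the proactive idea under stated assumptions: a classifier trained on monitoring windows predicts imminent failure, and a positive prediction triggers a live-migration call. The model choice, threshold, and migrate() stub are hypothetical, not the paper's system.

```python
# Minimal sketch: a model trained on periodic resource-usage windows
# predicts imminent failure, and a predicted failure triggers migration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 30))            # 200 windows x 30 monitoring features (toy)
y = (X[:, 0] > 0.8).astype(int)      # toy label: "failure within horizon"

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def migrate(vnf_id):                 # stand-in for the orchestrator call
    print(f"live-migrating VNF {vnf_id} before failure")

window = rng.random((1, 30))
if clf.predict_proba(window)[0, 1] > 0.5:   # assumed decision threshold
    migrate("vEPC-MME-1")                   # hypothetical VNF name
```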

Comparison of Chlorophyll-a Prediction and Analysis of Influential Factors in Yeongsan River Using Machine Learning and Deep Learning (머신러닝과 딥러닝을 이용한 영산강의 Chlorophyll-a 예측 성능 비교 및 변화 요인 분석)

  • Sun-Hee, Shim;Yu-Heun, Kim;Hye Won, Lee;Min, Kim;Jung Hyun, Choi
    • Journal of Korean Society on Water Environment / v.38 no.6 / pp.292-305 / 2022
  • The Yeongsan River, one of the four largest rivers in South Korea, has faced difficulties in water quality management with respect to algal blooms, a menace that has grown especially since the construction of two weirs in the river's mainstream. Prediction and factor analysis of Chlorophyll-a (Chl-a) concentration are therefore needed for effective water quality management. In this study, Chl-a prediction models were developed and evaluated using machine and deep learning methods: Deep Neural Network (DNN), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost). Moreover, correlation analysis and feature importance results were compared to identify the major factors affecting Chl-a concentration. All models showed high prediction performance, with R2 values of 0.9 or higher; in particular, XGBoost showed the highest prediction accuracy of 0.95 on the test data. The feature importance results suggested that ammonia (NH3-N) and phosphate (PO4-P) were major factors common to the three models for managing Chl-a concentration. These results confirm that DNN, RF, and XGBoost are powerful methods for predicting water quality parameters, and that comparing feature importance with correlation analysis provides a more accurate assessment of the important major factors.
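
A minimal sketch of one leg of the comparison, assuming synthetic data in which NH3-N and PO4-P drive the target: an XGBoost regressor is fit, scored with R2, and its feature importances inspected.

```python
# Minimal sketch: fit an XGBoost regressor to water-quality predictors and
# inspect feature importances. The synthetic data is illustrative.
import numpy as np
import xgboost as xgb
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
cols = ["NH3-N", "PO4-P", "temp", "DO", "flow"]
X = rng.random((300, len(cols)))
y = 3 * X[:, 0] + 2 * X[:, 1] + 0.1 * rng.standard_normal(300)  # Chl-a proxy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = xgb.XGBRegressor(n_estimators=200, max_depth=4).fit(X_tr, y_tr)

print("test R2:", round(r2_score(y_te, model.predict(X_te)), 3))
for name, imp in zip(cols, model.feature_importances_):
    print(f"{name}: {imp:.2f}")   # NH3-N and PO4-P should dominate here
```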

Energy harvesting characteristics on curvature based PVDF cantilever energy harvester due to vortex induced vibration (곡면을 가진 외팔보형 PVDF 에너지 하베스터의 와류유기진동으로 인한 에너지 수확 특성)

  • Woo-Jin Song;Jongkil Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.168-177 / 2024
  • When designing an underwater Piezoelectric Energy Harvester (PEH), Vortex Induced Vibration (VIV) is generated along the cantilever by a change in curvature, and the VIV increases the vibration displacement of the curved cantilever PEH, an important factor in increasing the actual power. A polyvinylidene fluoride (PVDF) piezoelectric film was selected as the material of the curved PEH, and the flow velocity was set from 0.10 m/s to 0.50 m/s for radii of curvature of 50 mm, 130 mm, and 210 mm. The change in the strain energy of the PEH due to VIV was observed: the smaller the radius of curvature, the larger the VIV, and more VIV appeared as the flow velocity increased. Rapid shape change due to small curvature was effective in generating VIV, increasing the strain energy, normalized voltage, and average power. To increase the power output of the PEH, the average power is expected to grow as the number of curved PEHs increases and as the curvature steepens.
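
Since the abstract's figure of merit is average power, a small worked example may help: for a voltage trace v(t) measured across a resistive load R, the time-averaged power is mean(v^2)/R. The sinusoidal trace, its frequency, and the 1 MOhm load below are assumptions, not the paper's measurements.

```python
# Minimal sketch: average harvested power from a voltage trace across a
# resistive load, P_avg = mean(v^2) / R_load. Trace and load are assumed.
import numpy as np

R_LOAD = 1.0e6                        # assumed 1 MOhm load resistance
t = np.linspace(0.0, 1.0, 10_000)     # 1 s of samples
v = 0.5 * np.sin(2 * np.pi * 15 * t)  # assumed 15 Hz VIV response, 0.5 V peak

p_avg = np.mean(v**2) / R_LOAD        # time-averaged power in watts
print(f"average power: {p_avg * 1e9:.1f} nW")
```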

Development of a complex failure prediction system using Hierarchical Attention Network (Hierarchical Attention Network를 이용한 복합 장애 발생 예측 시스템 개발)

  • Park, Youngchan;An, Sangjun;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.127-148 / 2020
  • A data center is a physical facility for housing computer systems and related components, and it is an essential foundation for next-generation core industries such as big data, smart factories, wearables, and smart homes. With the growth of cloud computing in particular, proportional expansion of data center infrastructure is inevitable. Monitoring the health of data center facilities is a way to maintain and manage the system and prevent failure. If a failure occurs in some element of the facility, it may affect not only the relevant equipment but also other connected equipment and cause enormous damage; IT facilities in particular fail irregularly because of their interdependence, which makes the cause hard to determine. Previous studies on predicting failure in data centers treated each server as a single, isolated state, without assuming that devices interact. In this study, therefore, data center failures were classified into failures occurring inside the server (Outage A) and failures occurring outside the server (Outage B), with the focus on analyzing complex failures occurring within servers. Server-external failures include power, cooling, and user errors; since such failures can be prevented at the early stages of data center construction, various solutions are already being developed. The causes of failures occurring inside servers, on the other hand, are difficult to determine, and adequate prevention has not yet been achieved, in particular because server failures do not occur singly: a failure on one server may cause failures on other servers or be triggered by them. In other words, while existing studies analyzed failures under the assumption that servers do not affect one another, this study assumes that failures propagate between servers. To define complex failure situations in the data center, failure history data for each piece of equipment in the data center was used. Four major failure types are considered: Network Node Down, Server Down, Windows Activation Services Down, and Database Management System Service Down. The failures occurring on each device were sorted in chronological order, and when a failure on one piece of equipment was followed by a failure on another within 5 minutes, the failures were defined as simultaneous. After constructing sequences of devices that failed at the same time, the 5 devices that most frequently failed together within those sequences were selected, and the cases in which the selected devices failed simultaneously were confirmed through visualization. Since the server resource information collected for failure analysis is a time series with temporal flow, we used Long Short-Term Memory (LSTM), a deep learning algorithm that can predict the next state from the previous state. In addition, unlike the single-server case, the Hierarchical Attention Network deep learning model structure was used to reflect the fact that the level of failure differs across servers; this algorithm increases prediction accuracy by weighting servers according to their impact on the failure. The study began by defining the failure types and selecting the analysis targets.
In the first experiment, the same collected data was treated first as a single-server state and then as a multi-server state, and the two were compared. The second experiment improved the prediction accuracy for the complex-server case by optimizing a threshold for each server. In the first experiment, under the single-server assumption, three of the five servers were predicted not to have failed even though failures actually occurred; under the multi-server assumption, all five servers were correctly predicted to have failed, proving the hypothesis that servers affect one another. The results confirm that prediction performance is superior when multiple servers are assumed rather than a single server. In particular, applying the Hierarchical Attention Network algorithm, on the assumption that each server's influence differs, improved the analysis, and applying a different threshold to each server further improved the prediction accuracy. This study shows that failures whose causes are hard to determine can be predicted from historical data, and it presents a model that can predict failures occurring on servers in data centers. The results are expected to help prevent failures in advance.
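
A minimal sketch of the hierarchical-attention structure described above, under assumed sizes: one LSTM encodes each server's metric sequence, an attention layer weights the per-server states by estimated influence, and the weighted sum feeds a failure classifier. This is not the paper's exact model.

```python
# Minimal sketch: LSTM per server + attention over servers -> failure logits.
# All sizes and the toy batch are illustrative assumptions.
import torch
import torch.nn as nn

N_SERVERS, SEQ_LEN, N_METRICS, HID = 5, 12, 8, 16

class HANFailurePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(N_METRICS, HID, batch_first=True)
        self.attn = nn.Linear(HID, 1)         # scores each server's state
        self.head = nn.Linear(HID, 2)         # failure / no failure

    def forward(self, x):                     # x: (batch, servers, time, metrics)
        b = x.size(0)
        _, (h, _) = self.lstm(x.reshape(b * N_SERVERS, SEQ_LEN, N_METRICS))
        h = h.squeeze(0).reshape(b, N_SERVERS, HID)    # one state per server
        w = torch.softmax(self.attn(h), dim=1)         # server attention weights
        return self.head((w * h).sum(dim=1))           # weighted pooling -> logits

model = HANFailurePredictor()
x = torch.randn(4, N_SERVERS, SEQ_LEN, N_METRICS)
print(model(x).shape)   # (4, 2)
```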

Remote Sensing based Algae Monitoring in Dams using High-resolution Satellite Image and Machine Learning (고해상도 위성영상과 머신러닝을 활용한 녹조 모니터링 기법 연구)

  • Jung, Jiyoung;Jang, Hyeon June;Kim, Sung Hoon;Choi, Young Don;Yi, Hye-Suk;Choi, Sunghwa
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.42-42 / 2022
  • Algae monitoring in watersheds has so far relied heavily on point-based monitoring through on-site water sampling, which makes it difficult to efficiently monitor and respond to algal blooms that spread widely across a water body depending on climate, flow velocity, and water temperature. In addition, because observation data have been limited, previous studies mostly mapped remote sensing data to derived indices closely related to algae, such as NDVI, FGAI, and SEI, rather than to field measurements. Aiming to improve the accuracy and efficiency of algae monitoring, this study first secured more than 7,000 algae observations using measurement equipment, and then examined the effectiveness of applying various machine learning techniques to map high-resolution satellite data to field measurements from the same period. The study area is the Yeongju Dam watershed, located upstream on the Naeseongcheon stream of the Nakdong River. In the data collection stage, 7,291 algae measurements were taken over four campaigns from February to September 2020 for areal in-situ observation, and 13 spectral features were extracted from Sentinel-2 data for the same time and location (Bands 1-12, with Band 8 split into 8 and 8A). Next, the algae_monitoring Python library was built to apply the machine learning analysis. The library is organized into five stages so that it can be applied not only to the Yeongju Dam but to other watersheds: 1) data preparation, splitting training and test sets; 2) model application, with a choice of Random Forest, Gradient Boosting Regression, or XGBoost; 3) performance testing (R2, MSE, MAE, RMSE, NSE, KGE, etc.); 4) visualization of model results; and 5) application of the selected model to convert satellite data into algae values. In this study, Random Forest applied to 14 features (the 12 Sentinel-2 bands plus air temperature and cloud fraction) showed the best overall fit, reaching an NSE (Nash-Sutcliffe Efficiency) of 0.96 on the test set (0.99 on the training set). This confirms that learning from wide-area satellite data and sufficiently abundant field measurements can dramatically improve the efficiency of algae monitoring analysis.
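
A minimal sketch of the pipeline's core step, under assumed synthetic data: a Random Forest maps 14 features (12 Sentinel-2 bands plus air temperature and cloud fraction, as in the study) to measured algae values and is scored with the Nash-Sutcliffe Efficiency.

```python
# Minimal sketch: Random Forest from band reflectances + weather covariates
# to algae measurements, scored with NSE. The synthetic data is assumed.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

rng = np.random.default_rng(0)
X = rng.random((500, 14))            # 12 Sentinel-2 bands + temp + cloud %
y = 40 * X[:, 3] + 10 * X[:, 7] + rng.standard_normal(500)  # algae proxy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("test NSE:", round(nse(y_te, rf.predict(X_te)), 3))
```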
