• Title/Summary/Keyword: Mixed algorithm

Search Result 884, Processing Time 0.021 seconds

Development of a complex failure prediction system using Hierarchical Attention Network (Hierarchical Attention Network를 이용한 복합 장애 발생 예측 시스템 개발)

  • Park, Youngchan;An, Sangjun;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.127-148
    • /
    • 2020
  • The data center is a physical environment facility for accommodating computer systems and related components, and is an essential foundation technology for next-generation core industries such as big data, smart factories, wearables, and smart homes. In particular, with the growth of cloud computing, the proportional expansion of the data center infrastructure is inevitable. Monitoring the health of these data center facilities is a way to maintain and manage the system and prevent failure. If a failure occurs in some elements of the facility, it may affect not only the relevant equipment but also other connected equipment, and may cause enormous damage. In particular, IT facilities are irregular due to interdependence and it is difficult to know the cause. In the previous study predicting failure in data center, failure was predicted by looking at a single server as a single state without assuming that the devices were mixed. Therefore, in this study, data center failures were classified into failures occurring inside the server (Outage A) and failures occurring outside the server (Outage B), and focused on analyzing complex failures occurring within the server. Server external failures include power, cooling, user errors, etc. Since such failures can be prevented in the early stages of data center facility construction, various solutions are being developed. On the other hand, the cause of the failure occurring in the server is difficult to determine, and adequate prevention has not yet been achieved. In particular, this is the reason why server failures do not occur singularly, cause other server failures, or receive something that causes failures from other servers. In other words, while the existing studies assumed that it was a single server that did not affect the servers and analyzed the failure, in this study, the failure occurred on the assumption that it had an effect between servers. In order to define the complex failure situation in the data center, failure history data for each equipment existing in the data center was used. There are four major failures considered in this study: Network Node Down, Server Down, Windows Activation Services Down, and Database Management System Service Down. The failures that occur for each device are sorted in chronological order, and when a failure occurs in a specific equipment, if a failure occurs in a specific equipment within 5 minutes from the time of occurrence, it is defined that the failure occurs simultaneously. After configuring the sequence for the devices that have failed at the same time, 5 devices that frequently occur simultaneously within the configured sequence were selected, and the case where the selected devices failed at the same time was confirmed through visualization. Since the server resource information collected for failure analysis is in units of time series and has flow, we used Long Short-term Memory (LSTM), a deep learning algorithm that can predict the next state through the previous state. In addition, unlike a single server, the Hierarchical Attention Network deep learning model structure was used in consideration of the fact that the level of multiple failures for each server is different. This algorithm is a method of increasing the prediction accuracy by giving weight to the server as the impact on the failure increases. The study began with defining the type of failure and selecting the analysis target. In the first experiment, the same collected data was assumed as a single server state and a multiple server state, and compared and analyzed. The second experiment improved the prediction accuracy in the case of a complex server by optimizing each server threshold. In the first experiment, which assumed each of a single server and multiple servers, in the case of a single server, it was predicted that three of the five servers did not have a failure even though the actual failure occurred. However, assuming multiple servers, all five servers were predicted to have failed. As a result of the experiment, the hypothesis that there is an effect between servers is proven. As a result of this study, it was confirmed that the prediction performance was superior when the multiple servers were assumed than when the single server was assumed. In particular, applying the Hierarchical Attention Network algorithm, assuming that the effects of each server will be different, played a role in improving the analysis effect. In addition, by applying a different threshold for each server, the prediction accuracy could be improved. This study showed that failures that are difficult to determine the cause can be predicted through historical data, and a model that can predict failures occurring in servers in data centers is presented. It is expected that the occurrence of disability can be prevented in advance using the results of this study.

Determinants of Consumer Preference by type of Accommodation: Two Step Cluster Analysis (이단계 군집분석에 의한 농촌관광 편의시설 유형별 소비자 선호 결정요인)

  • Park, Duk-Byeong;Yoon, Yoo-Shik;Lee, Min-Soo
    • Journal of Global Scholars of Marketing Science
    • /
    • v.17 no.3
    • /
    • pp.1-19
    • /
    • 2007
  • 1. Purpose Rural tourism is made by individuals with different characteristics, needs and wants. It is important to have information on the characteristics and preferences of the consumers of the different types of existing rural accommodation. The stud aims to identify the determinants of consumer preference by type of accommodations. 2. Methodology 2.1 Sample Data were collected from 1000 people by telephone survey with three-stage stratified random sampling in seven metropolitan areas in Korea. Respondents were chosen by sampling internal on telephone book published in 2006. We surveyed from four to ten-thirty 0'clock afternoon so as to systematic sampling considering respondents' life cycle. 2.2 Two-step cluster Analysis Our study is accomplished through the use of a two-step cluster method to classify the accommodation in a reduced number of groups, so that each group constitutes a type. This method had been suggested as appropriate in clustering large data sets with mixed attributes. The method is based on a distance measure that enables data with both continuous and categorical attributes to be clustered. This is derived from a probabilistic model in which the distance between two clusters in equivalent to the decrease in log-likelihood function as a result of merging. 2.3 Multinomial Logit Analysis The estimation of a Multionmial Logit model determines the characteristics of tourist who is most likely to opt for each type of accommodation. The Multinomial Logit model constitutes an appropriate framework to explore and explain choice process where the choice set consists of more than two alternatives. Due to its ease and quick estimation of parameters, the Multinomial Logit model has been used for many empirical studies of choice in tourism. 3. Findings The auto-clustering algorithm indicated that a five-cluster solution was the best model, because it minimized the BIC value and the change in them between adjacent numbers of clusters. The accommodation establishments can be classified into five types: Traditional House, Typical Farmhouse, Farmstay house for group Tour, Log Cabin for Family, and Log Cabin for Individuals. Group 1 (Traditional House) includes mainly the large accommodation establishments, i.e. those with ondoll style room providing meals and one shower room on family tourist, of original construction style house. Group 2 (Typical Farmhouse) encompasses accommodation establishments of Ondoll rooms and each bathroom providing meals. It includes, in other words, the tourist accommodations Known as "rural houses." Group 3 (Farmstay House for Group) has accommodation establishments of Ondoll rooms not providing meals and self cooking facilities, large room size over five persons. Group 4 (Log Cabin for Family) includes mainly the popular accommodation establishments, i.e. those with Ondoll style room with on shower room on family tourist, of western styled log house. While the accommodations in this group are not defined as regards type of construction, the group does include all the original Korean style construction, Finally, group 5 (Log Cabin for Individuals)includes those accommodations that are bedroom western styled wooden house with each bathroom. First Multinomial Logit model is estimated including all the explicative variables considered and taking accommodation group 2 as base alternative. The results show that the variables and the estimated values of the parameters for the model giving the probability of each of the five different types of accommodation available in rural tourism village in Korea, according to the socio-economic and trip related characteristics of the individuals. An initial observation of the analysis reveals that none of variables income, the number of journey, distance, and residential style of house is explicative in the choice of rural accommodation. The age and accompany variables are significant for accommodation establishment of group 1. The education and rural residential experience variables are significant for accommodation establishment of groups 4 and 5. The expenditure and marital status variables are significant for accommodation establishment of group 4. The gender and occupation variable are significant for accommodation establishment of group 3. The loyalty variable is significant for accommodation establishment of groups 3 and 4. The study indicates that significant differences exist among the individuals who choose each type of accommodation at a destination. From this investigation is evident that several profiles of tourists can be attracted by a rural destination according to the types of existing accommodations at this destination. Besides, the tourist profiles may be used as the basis for investment policy and promotion for each type of accommodation, making use in each case of the variables that indicate a greater likelihood of influencing the tourist choice of accommodation.

  • PDF

Impact of Lambertian Cloud Top Pressure Error on Ozone Profile Retrieval Using OMI (램버시안 구름 모델의 운정기압 오차가 OMI 오존 프로파일 산출에 미치는 영향)

  • Nam, Hyeonshik;Kim, Jae Hawn;Shin, Daegeun;Baek, Kanghyun
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.3
    • /
    • pp.347-358
    • /
    • 2019
  • Lambertian cloud model (Lambertian Cloud Model) is the simplified cloud model which is used to effectively retrieve the vertical ozone distribution of the atmosphere where the clouds exist. By using the Lambertian cloud model, the optical characteristics of clouds required for radiative transfer simulation are parametrized by Optical Centroid Cloud Pressure (OCCP) and Effective Cloud Fraction (ECF), and the accuracy of each parameter greatly affects the radiation simulation accuracy. However, it is very difficult to generalize the vertical ozone error due to the OCCP error because it varies depending on the radiation environment and algorithm setting. In addition, it is also difficult to analyze the effect of OCCP error because it is mixed with other errors that occur in the vertical ozone calculation process. This study analyzed the ozone retrieval error due to OCCP error using two methods. First, we simulated the impact of OCCP error on ozone retrieval based on Optimal Estimation. Using LIDORT radiation model, the radiation error due to the OCCP error is calculated. In order to convert the radiation error to the ozone calculation error, the radiation error is assigned to the conversion equation of the optimal estimation method. The results show that when the OCCP error occurs by 100 hPa, the total ozone is overestimated by 2.7%. Second, a case analysis is carried out to find the ozone retrieval error due to OCCP error. For the case analysis, the ozone retrieval error is simulated assuming OCCP error and compared with the ozone error in the case of PROFOZ 2005-2006, an OMI ozone profile product. In order to define the ozone error in the case, we assumed an ideal assumption. Considering albedo, and the horizontal change of ozone for satisfying the assumption, the 49 cases are selected. As a result, 27 out of 49 cases(about 55%)showed a correlation of 0.5 or more. This result show that the error of OCCP has a significant influence on the accuracy of ozone profile calculation.

A Comparison between Multiple Satellite AOD Products Using AERONET Sun Photometer Observations in South Korea: Case Study of MODIS,VIIRS, Himawari-8, and Sentinel-3 (우리나라에서 AERONET 태양광도계 자료를 이용한 다종위성 AOD 산출물 비교평가: MODIS, VIIRS, Himawari-8, Sentinel-3의 사례연구)

  • Kim, Seoyeon;Jeong, Yemin;Youn, Youjeong;Cho, Subin;Kang, Jonggu;Kim, Geunah;Lee, Yangwon
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.3
    • /
    • pp.543-557
    • /
    • 2021
  • Because aerosols have different spectral characteristics according to the size and composition of the particle and to the satellite sensors, a comparative analysis of aerosol products from various satellite sensors is required. In South Korea, however, a comprehensive study for the comparison of various official satellite AOD (Aerosol Optical Depth) products for a long period is not easily found. In this paper, we aimed to assess the performance of the AOD products from MODIS (Moderate Resolution Imaging Spectroradiometer), VIIRS (Visible Infrared Imaging Radiometer Suite), Himawari-8, and Sentinel-3 by referring to the AERONET (Aerosol Robotic Network) sun photometer observations for the period between January 2015 and December 2019. Seasonal and geographical characteristics of the accuracy of satellite AOD were also analyzed. The MODIS products, which were accumulated for a long time and optimized by the new MAIAC (Multiangle Implementation of Atmospheric Correction) algorithm, showed the best accuracy (CC=0.836) and were followed by the products from VIIRS and Himawari-8. On the other hand, Sentinel-3 AOD did not appear to have a good quality because it was recently launched and not sufficiently optimized yet, according to ESA (European Space Agency). The AOD of MODIS, VIIRS, and Himawari-8 did not show a significant difference in accuracy according to season and to urban vs. non-urban regions, but the mixed pixel problem was partly found in a few coastal regions. Because AOD is an essential component for atmospheric correction, the result of this study can be a reference to the future work for the atmospheric correction for the Korean CAS (Compact Advanced Satellite) series.