• Title/Summary/Keyword: Classification and regression tree(CART)

Analysis of Environmental Factors of Geomorphology, Hydrology, Water Quality and Shoreline Soil in Reservoirs of Korea (우리나라 저수지에서 지형, 수문, 수질 및 호안 토양 환경요인의 분석)

  • Cho, HyunSuk; Cho, Kang-Hyun
    • Korean Journal of Ecology and Environment / v.46 no.3 / pp.343-359 / 2013
  • To characterize the shoreline environment of Korean reservoirs, the interrelationships among environmental factors of geomorphology, hydrology, water quality and shoreline soil were analyzed, and reservoir types were classified by their environmental characteristics, in 35 reservoirs selected with regard to the purpose of dam operation and annual water-level fluctuation. The geomorphological and hydrological characteristics of the reservoirs were correlated with altitude and reservoir size. The annual range of water-level fluctuation varied widely among reservoirs, from 1 m to 27 m. Most reservoirs were mesotrophic or eutrophic. Soil texture analysis showed high sand content along reservoir shorelines. The range, frequency and duration of water-level fluctuation differed according to the primary function of the reservoir: flood-control reservoirs showed a wide range with low frequency, whereas hydropower reservoirs showed a narrow range with high frequency. According to the CART (classification and regression tree) analysis, the water quality of reservoirs was classified by water depth, range of water-level fluctuation and altitude. The PCA (principal component analysis) showed that reservoir types were classified by reservoir size, water-level fluctuation, water quality, soil texture and soil organic matter. In conclusion, reservoir size, water-level fluctuation, water quality and soil characteristics may be major factors in the environment of reservoir shorelines in Korea.
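
The CART step in this abstract can be illustrated with a minimal sketch of how a classification tree picks a single split: it tries each feature and threshold and keeps the pair with the lowest weighted Gini impurity. Everything below is an assumption made for illustration, not the authors' data or code: the reservoir values and trophic labels are invented, and `gini` and `best_split` are hypothetical helper names.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted_gini) for the best binary split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            # Strict "<" keeps the first split found when scores tie.
            if best is None or score < best[2]:
                best = (name, t, score)
    return best


# Invented example reservoirs: (water depth m, water-level range m, altitude m).
FEATURES = ["depth_m", "range_m", "altitude_m"]
ROWS = [
    (3.0, 1.0, 20.0),
    (4.5, 2.0, 35.0),
    (5.0, 1.5, 150.0),
    (12.0, 10.0, 40.0),
    (20.0, 27.0, 180.0),
    (25.0, 15.0, 60.0),
]
LABELS = ["eutrophic"] * 3 + ["mesotrophic"] * 3

feature, threshold, impurity = best_split(ROWS, LABELS, FEATURES)
print(feature, threshold, impurity)
```

A full CART model would apply `best_split` recursively to each resulting group until a stopping rule is met, and would usually prune the tree afterwards; this sketch shows only the first split.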

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil; Sung, David; Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.21-41 / 2019
  • Thanks to the rapid development of information technologies, the amount of data available on the Internet has grown rapidly. In this era of big data, many studies have attempted to draw insights from data analysis and demonstrate its effects. In the tourism and hospitality industry, many firms and researchers have paid attention to online reviews on social media because of their large influence over customers. Because tourism is an information-intensive industry, the effect of these information networks on social media platforms is more pronounced than that of any other type of media. However, there are limits to the service quality improvements that can be made from opinions on social media platforms. Users express their opinions as text, images, and so on, so the raw review data are unstructured. Moreover, these data sets are too large for people to extract new information and hidden knowledge from them unaided. To use them in business intelligence and analytics applications, suitable big data techniques such as natural language processing and data mining are needed. This study suggests an analytical approach that yields insights directly from these reviews to improve the service quality of hotels. The proposed approach consists of topic mining, to extract the topics contained in the reviews, and decision tree modeling, to explain the relationship between topics and ratings. Topic mining refers to methods for finding groups of words that represent the documents in a collection. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, which is considered the most universal algorithm. However, LDA alone is not enough to find insights that can improve service quality, because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree (CART) method, which is a kind of decision tree technique. Through the CART method, we can find which topics are related to positive or negative ratings of a hotel and visualize the results. This study therefore aims to present an analytical approach for improving hotel service quality from unstructured review data sets. Through experiments on four hotels in Hong Kong, we identify the strengths and weaknesses of each hotel's services and suggest improvements to raise customer satisfaction. From positive reviews in particular, we find what these hotels should maintain in their service quality. For example, the positive reviews of one hotel show that, compared with the other hotels, it has a good location and good room condition. In contrast, from negative reviews we find what the hotels should change in their services. For example, one hotel should improve the soundproofing of its rooms. These results indicate that our approach is useful for finding insights into the service quality of hotels. That is, from a very large volume of review data, our approach can provide practical suggestions that help hotel managers improve their service quality. In the past, studies on improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time-consuming, and their results may be distorted by biased sampling or untrustworthy answers. The proposed approach obtains honest feedback directly from customers' online reviews and draws insights from it through a type of big data analysis, so it can be a useful tool for overcoming the limitations of surveys and interviews. Moreover, our approach can easily obtain service quality information for other hotels or services in the tourism industry, because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach is expected to improve if other structured and unstructured data sources are added.
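
The second stage of this pipeline, relating topics to ratings, can be sketched as a single regression-tree split. The sketch assumes that LDA has already produced a topic share (in percent) for every review. It then looks for the topic and threshold that best separate high from low ratings by minimizing the sum of squared errors, the usual criterion for regression trees. The topic names, shares and ratings are invented for illustration, and `sse` and `best_topic_split` are hypothetical helpers, not the authors' implementation.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_topic_split(topic_shares, ratings, topic_names):
    """Find the topic and share threshold that minimize the total SSE of ratings."""
    best = None
    for j, name in enumerate(topic_names):
        values = sorted({shares[j] for shares in topic_shares})
        # Candidate thresholds are midpoints between consecutive distinct shares.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            below = [r for s, r in zip(topic_shares, ratings) if s[j] <= t]
            above = [r for s, r in zip(topic_shares, ratings) if s[j] > t]
            score = sse(below) + sse(above)
            if best is None or score < best["sse"]:
                best = {
                    "topic": name,
                    "threshold": t,
                    "sse": score,
                    "mean_below": sum(below) / len(below),
                    "mean_above": sum(above) / len(above),
                }
    return best


# Invented per-review topic shares in percent (assumed to come from LDA).
TOPICS = ["location", "room_noise", "staff"]
SHARES = [
    (70, 10, 20),
    (20, 10, 70),
    (50, 20, 30),
    (30, 60, 10),
    (10, 70, 20),
    (20, 50, 30),
]
RATINGS = [5, 5, 4, 2, 1, 2]  # invented star ratings, one per review

best = best_topic_split(SHARES, RATINGS, TOPICS)
print(best["topic"], best["threshold"], best["mean_below"], best["mean_above"])
```

On this toy data the chosen split is on the `room_noise` topic, and reviews above the threshold have the lower mean rating, which is the kind of topic-to-rating link the abstract describes for soundproofing complaints.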