• Title/Summary/Keyword: K-Means Clustering

Selecting Climate Change Scenarios Reflecting Uncertainties (불확실성을 고려한 기후변화 시나리오의 선정)

  • Lee, Jae-Kyoung;Kim, Young-Oh
    • Atmosphere / v.22 no.2 / pp.149-161 / 2012
  • Previous research indicates that, among the uncertainties in climate change studies, the one arising from the choice of climate change scenario is the largest. Projections of future water resources therefore differ significantly depending on the scenario adopted. In principle, all the GCM scenarios offered by the IPCC should be used, but this is impractical when a decision has to be made at the working level. As an alternative, several scenarios should be selected so that they cover as many of the possible cases as they can. Objective criteria for selecting climate change scenarios have not been properly established, and scenarios have been selected either at random or at the researcher's discretion. This research proposes a new scenario selection process that uses only a few principal scenarios, yet retains much of the uncertainty and so approximates the effect of using all the scenarios. Cluster analysis, followed by selection of a representative scenario from each cluster, efficiently reduced the number of climate change scenarios. For the cluster analysis, the K-means clustering method, which takes advantage of the statistical features of the scenarios, was employed. For the selection of a representative scenario in each cluster, candidate selection methods were analyzed and reviewed, and the PDF method was used to select both the best scenarios, which have the closest simulation accuracy, and the principal scenarios suggested by this research.
In selecting the best scenarios, it was shown that a GCM scenario that demonstrated a high level of simulation accuracy in the past does not necessarily demonstrate similarly high accuracy in the future, and various GCM scenarios were selected as the principal scenarios. Second, "maximum entropy", which can quantify the uncertainty of a climate change scenario, was used to quantify and compare the uncertainties of all the scenarios, the best scenarios, and the principal scenarios. The comparison showed that the principal scenarios retain and explain the uncertainties of all the scenarios better than the best scenarios do. The scenario selection process thus demonstrated that the principal scenarios approximate the effect of using all the scenarios and retain as much of the climate change uncertainty as possible, while reducing the number of scenarios. Lastly, the climate change scenarios most suitable for the climate of the Korean peninsula were suggested: from all the scenarios in the 4th IPCC report, the selection process identified principal climate change scenarios that suit the Korean peninsula and maintain most of the uncertainty. Using these scenarios for future water resources projections on the Korean peninsula is therefore expected to yield projections that maintain more than 70~80% of the uncertainty of all the scenarios.
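
The cluster-then-select idea in this abstract can be illustrated with a minimal sketch. It assumes each scenario is summarized by a small numeric feature vector (the values below are made up), uses a plain Lloyd's k-means with a deterministic farthest-first start, and picks the member nearest each centroid as the representative. That nearest-to-centroid rule is a stand-in chosen for illustration; it is not the PDF method the authors describe, and the names `kmeans`, `representatives`, and `scenario_names` are hypothetical.

```python
import math

def init_centroids(points, k):
    """Farthest-first initialization (deterministic; chosen here only for simplicity)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def representatives(points, centroids, labels):
    """Index of the member closest to each centroid (illustrative selection rule)."""
    reps = {}
    for j, c in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            reps[j] = min(members, key=lambda i: math.dist(points[i], c))
    return reps

# Hypothetical scenarios and feature vectors (made-up numbers, not from the paper).
scenario_names = ["S1", "S2", "S3", "S4", "S5", "S6"]
features = [(0.0, 0.0), (0.3, 0.0), (0.1, 0.1), (5.0, 5.1), (5.2, 4.9), (5.1, 5.0)]

centroids, labels = kmeans(features, k=2)
reps = representatives(features, centroids, labels)
principal = [scenario_names[i] for i in reps.values()]
print(labels, principal)
```

With real data the features would need scaling first, and the number of clusters `k` is a modeling choice the sketch does not address.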

The Effects of Sidecar on Index Arbitrage Trading and Non-index Arbitrage Trading:Evidence from the Korean Stock Market (한국주식시장에서 사이드카의 역할과 재설계: 차익거래와 비차익거래에 미치는 효과를 중심으로)

  • Park, Jong-Won;Eom, Yun-Sung;Chang, Uk
    • The Korean Journal of Financial Management / v.24 no.3 / pp.91-131 / 2007
  • This paper examines the effects of the sidecar on index arbitrage trading and non-index arbitrage trading in the Korean stock market. Analyses of return, volatility, and liquidity dynamics show no distinct differences between the index arbitrage group and the non-index arbitrage group around sidecar events. For further analysis, we construct a pseudo-sidecar sample and analyze the effects of actual and pseudo-sidecars on the arbitrage and non-index arbitrage samples. The pseudo-sidecar analysis shows that the differences between the index arbitrage group and the non-index arbitrage group are larger in the pseudo-sidecar sample than in the actual sidecar sample. This means that the former results can be explained by temporary one-sided order clustering before and after the event. The sidecar has little effect on the non-index arbitrage group but a relatively large effect on the arbitrage group. These results imply that the sidecar system of the Korean stock market, which applies to all program trading including arbitrage and non-index arbitrage trading, needs to be redesigned.

Recognition of Flat Type Signboard using Deep Learning (딥러닝을 이용한 판류형 간판의 인식)

  • Kwon, Sang Il;Kim, Eui Myoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.4 / pp.219-231 / 2019
  • Specifications are set for each type of signboard, but the shapes and sizes of the signboards actually installed are not uniform. In addition, because signboard colors are not defined, various colors are used. Signboard recognition might be thought of as similar to the recognition of road signs and license plates, but due to the nature of signboards, they cannot be recognized in the same way. In this study, we propose a methodology that uses the deep learning-based Faster R-CNN algorithm to recognize flat type signboards, which are the main targets among illegal and old signboards, and to automatically extract signboard areas. The process of recognizing flat type signboards in images captured with smartphone cameras is divided into two steps. First, deep learning was used to recognize flat type signboards among images of various signboard types, with an accuracy of about 71%. Next, a boundary recognition algorithm was applied to the flat type signboards, and the boundary was recognized with an accuracy of 85%.

Personalized insurance product based on similarity (유사도를 활용한 맞춤형 보험 추천 시스템)

  • Kim, Joon-Sung;Cho, A-Ra;Oh, Hayong
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.11 / pp.1599-1607 / 2022
  • The data mainly used for the models are personal information, insurance product information, and so on. With these data, we suggest three types of models: a content-based filtering model, a collaborative filtering model, and a classification-based model. The content-based filtering model computes the cosine of the angle between user and item vectors and recommends items based on this cosine similarity; before computing the similarity, however, we divide the data into several groups by their features. Segmentation is performed with the K-means clustering algorithm and a manually operated algorithm. The collaborative filtering model uses the interactions that users have with items. The classification-based model uses decision tree and random forest classifiers to recommend items. According to the results, the content-based filtering model performs best. Since this model recommends items based on demographic and user features, the result indicates that these features are key to offering more appropriate items.
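
As a rough illustration of the content-based step described above, the sketch below ranks items by cosine similarity to a user vector. It assumes the user has already been assigned to a segment and that users and items share one feature space; the feature values, item names, and the `recommend` helper are all hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user_vec, items, top_n=2):
    """Return the names of the top_n items most similar to the user vector."""
    ranked = sorted(
        items.items(),
        key=lambda kv: cosine_similarity(user_vec, kv[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]

# Hypothetical feature vectors for one user and the items in that user's segment.
user = [1.0, 0.0, 1.0]
segment_items = {
    "plan_a": [1.0, 0.0, 1.0],
    "plan_b": [0.0, 1.0, 0.0],
    "plan_c": [1.0, 1.0, 1.0],
}
top = recommend(user, segment_items)
print(top)
```

Restricting the candidate items to the user's segment before ranking is what ties the similarity step to the preceding segmentation.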

Identification of Employee Experience Factors and Their Influence on Job Satisfaction (직원경험 요인 파악 및 직무 만족도에 끼치는 영향력 분석)

  • Juhyeon Lee;So-Hyun Lee;Hee-Woong Kim
    • Information Systems Review / v.25 no.2 / pp.181-203 / 2023
  • As companies compete fiercely to attract outstanding individuals, employee job satisfaction has become increasingly important. Many companies therefore invest in improving job satisfaction by identifying employees' everyday experiences and difficulties. However, due to a lack of understanding of the employee experience, these investments are not paying off. This study examined the relationship between employee experience and job satisfaction using employee reviews and company ratings from Glassdoor, one of the largest employee communities worldwide. We used text mining techniques such as K-means clustering and LDA topic-based sentiment analysis to extract key experience factors by job level, and DistilBERT sentiment analysis to measure the sentiment score of each employee experience factor. The extracted employee experience factors and their sentiment scores were then analyzed quantitatively to examine the relationship between each factor and job satisfaction. The study found a significant difference between the workplace experiences of managers and general employees. In addition, the employee experiences that affect job satisfaction differed between positions; customer relationship and autonomy, for example, did not affect the satisfaction of managers. This study used text mining and a quantitative modeling method based on the theory of work adjustment to find and verify the main factors of employee experience, and thus expands the research literature. The results are also applicable to personnel management strategies for improving employees' job satisfaction and are ultimately expected to improve corporate productivity.

Clustering according to Inpatients' Opinion on Hospital Foodservice and Analyzing Inpatient Response to Foodservice Quality and Revisit Intention by the Cluster: In Case of S Hospital (입원환자의 급식서비스 인식에 따른 고객 군집화 및 군집별 급식서비스 질 평가, 재이용 의도 분석: S병원을 대상으로)

  • Lee, Hae-Young;Chang, Seung-Hee
    • Journal of the Korean Society of Food Science and Nutrition / v.35 no.10 / pp.1491-1497 / 2006
  • The purpose of this study was to analyze the relationships among inpatients' perceptions of foodservice quality, satisfaction, and revisit intention. Questionnaires were hand-delivered to 350 inpatients, and a total of 230 questionnaires were usable (response rate 65.7%). Statistical analysis was performed with SPSS Win 11.0 for descriptive analysis, independent t-test, $\chi^2$ test, and k-means cluster analysis. The results can be summarized as follows. The average score for the overall importance of meal service within medical service was 4.25 out of 5.0, yet the scores for the overall quality and value of meal service were lower than the importance score. Helpfulness to medical treatment (3.48), bringing customer happiness (3.18), overall satisfaction with foodservice (3.66), satisfaction based on expectation before discharge (3.53), and offering foodservice befitting the hospital's reputation (3.40) were measured as expressions of satisfaction. The cluster analysis yielded two clusters, named the affirmative opinion group and the negative opinion group. Expectations for the four factors of foodservice quality did not differ significantly between the two groups, but the affirmative opinion group scored significantly higher than the negative opinion group in perception and satisfaction. Affirmative customers' intention to revisit in the near future was rated high both when considering general medical service (4.04) and when reflecting meal service level (3.84).

Analysis of Utilization Characteristics, Health Behaviors and Health Management Level of Participants in Private Health Examination in a General Hospital (일개 종합병원의 민간 건강검진 수검자의 검진이용 특성, 건강행태 및 건강관리 수준 분석)

  • Kim, Yoo-Mi;Park, Jong-Ho;Kim, Won-Joong
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.1 / pp.301-311 / 2013
  • This study aims to analyze the utilization characteristics, health behaviors, and health management level of private health examination recipients in one general hospital. To this end, we analyzed 150,501 cases of private health examination data over the 11 years from 2001 to 2011 for the 20,696 participants in 2011 at the health examination center of a general hospital in Dae-Jeon. To classify the private health examination groups, cluster analysis was performed using the K-means clustering method with z-score standardization. Logistic regression, decision tree, and neural network analyses were used to build the periodic/non-periodic private health examination classification model. Among new private health examination participants, 1,000 people with a high probability of becoming non-periodic participants were selected as a customer management target group. The results categorized private health examination participants into new, periodic, and non-periodic groups. New participants were more often 30~39 years old than in other age groups and more often suspected of having renal disease. Periodic participants were more often male and more often suspected of having hyperlipidemia. Non-periodic participants were more often smokers, more often sedentary, and more often suspected of having anemia and diabetes mellitus. In the decision tree, the variables related to non-periodic participation were sex, age, residence, exercise, anemia, hyperlipidemia, diabetes mellitus, obesity, and liver disease. In particular, 71.4% of non-periodic participants were female, non-anemic, non-exercising, and suspected of obesity. Operating a customized customer management program for private health examinations will contribute to the efficiency of health examination centers.
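
Z-score standardization before k-means, as mentioned in this abstract, puts variables measured on different scales on an equal footing so that no single variable dominates the Euclidean distance. The sketch below is a minimal version under assumptions: it uses the population standard deviation, maps constant columns to zero, and the two example columns (age and yearly visit count) are hypothetical, not the study's variables.

```python
import statistics

def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical records: (age in years, examinations per year).
records = [[30.0, 1.0], [40.0, 3.0], [50.0, 5.0]]
scaled = zscore_columns(records)
for row in scaled:
    print(row)
```

After scaling, both columns span the same range here, so a distance-based method such as k-means weighs them equally.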

Development of Customer Sentiment Pattern Map for Webtoon Content Recommendation (웹툰 콘텐츠 추천을 위한 소비자 감성 패턴 맵 개발)

  • Lee, Junsik;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.67-88 / 2019
  • Webtoon is a Korean-style digital comics platform that distributes comics content, produced using the characteristic elements of the Internet, in a form that can be consumed online. With the recent rapid growth of the webtoon industry and the exponential increase in the supply of webtoon content, the need for effective webtoon content recommendation is growing. Webtoons are digital content products that combine pictorial, literary, and digital elements. They therefore stimulate consumer sentiment by entertaining readers and leading them to engage and empathize with the situations the webtoons portray. In this context, the sentiment that webtoons evoke in consumers can be expected to serve as an important criterion for consumers' choice of webtoons. However, there is a lack of research on improving webtoon recommendation performance by utilizing consumer sentiment. This study aims to develop consumer sentiment pattern maps that can support effective recommendation of webtoon content, focusing on consumer sentiments that have not been fully discussed previously. Metadata and consumer sentiment data were collected for 200 works serviced on the Korean webtoon platform 'Naver Webtoon'. After excluding works that did not meet the purpose of the analysis, 488 sentiment terms were collected for 127 works. Next, similar or duplicate terms were combined or abstracted following a bottom-up approach. As a result, we built a webtoon-specific sentiment index, reduced to a total of 63 emotive adjectives. By performing exploratory factor analysis on this sentiment index, we derived three important dimensions for classifying webtoon types. The exploratory factor analysis was performed through Principal Component Analysis (PCA) with varimax factor rotation. The three dimensions were named 'Immersion', 'Touch', and 'Irritant', respectively.
Based on these dimensions, K-Means clustering was performed, and all the webtoons were classified into four types, named 'Snack', 'Drama', 'Irritant', and 'Romance'. For each type, we drew webtoon-sentiment 2-mode network graphs and examined the characteristics of the sentiment pattern for that type. In addition, through profiling analysis, we derived meaningful strategic implications for each type of webtoon. First, the 'Snack' cluster is a collection of webtoons that are fast-paced and highly entertaining. Many consumers are interested in these webtoons, but they do not rate them well, and they mostly use simple expressions of sentiment when talking about them. Webtoons belonging to 'Snack' are expected to appeal to modern people who want to consume content easily and quickly during short travel times, such as commutes. Second, webtoons belonging to 'Drama' are expected to evoke realistic, everyday sentiments rather than exaggerated, light comic ones. When consumers talk online about webtoons in the 'Drama' cluster, they express a variety of sentiments. An OSMU (One Source Multi-Use) strategy that extends these webtoons to other content such as movies and TV series is appropriate. Third, the sentiment pattern map of 'Irritant' shows sentiments that discourage customer interest by stimulating discomfort. Webtoons that evoke these sentiments struggle to attract public attention, so artists should pay attention to the sentiments that cause discomfort to consumers when creating webtoons. Finally, webtoons belonging to 'Romance' do not evoke a variety of consumer sentiments, but they are interpreted as touching consumers. They are expected to be consumed as 'healing content' targeted at consumers with high levels of stress or mental fatigue in their lives.
The results of this study are meaningful in that they identify the applicability of consumer sentiment to the recommendation and classification of webtoons, and provide guidelines to help members of the webtoon ecosystem better understand consumers and formulate strategies.

A Study on the Satisfaction of Self-Employed (만족도를 이용한 자영업에 관한 연구)

  • Oh, Yu-Jin
    • The Korean Journal of Applied Statistics / v.22 no.2 / pp.281-296 / 2009
  • This study examines the job and life satisfaction of the self-employed. It uses Korean Labour and Income Panel Study (KLIPS, hereafter) data for 1998 and 2004. We examine the levels of satisfaction and the variables that influence satisfaction in both years, and compare the results to see what changed between the two regimes. We use k-means clustering to divide the self-employed into groups with similar degrees of satisfaction. As a result, we classify the self-employed into three groups (low, medium, and high) in both regimes. The high group is relatively younger and better educated, works fewer days, and has a higher proportion of women than the other groups. Regression analysis provides some evidence that women are more satisfied than men with their jobs, and that the existence of income matters more than the amount of income for life satisfaction. Age, education, satisfaction with the workplace, and health are significant for both types of satisfaction.

Estimation of Drought Rainfall by Regional Frequency Analysis Using L and LH-Moments (II) - On the method of LH-moments - (L 및 LH-모멘트법과 지역빈도분석에 의한 가뭄우량의 추정 (II)- LH-모멘트법을 중심으로 -)

  • Lee, Soon-Hyuk;Yoon, Seong-Soo;Maeng, Sung-Jin;Ryoo, Kyong-Sik;Joo, Ho-Kil;Park, Jin-Seon
    • Journal of The Korean Society of Agricultural Engineers / v.46 no.5 / pp.27-39 / 2004
  • In the first part of this study, Korea, excluding Jeju and Ulreung islands, was divided into five topographically and geographically homogeneous regions by the K-means clustering method. A total of 57 rain gauges were used for the regional frequency analysis with minimum rainfall series for the consecutive durations. The Generalized Extreme Value distribution was confirmed as the optimal one among the applied distributions. Drought rainfalls for the return periods were estimated by at-site and regional frequency analysis using the L-moments method. The design drought rainfalls estimated by the regional frequency analysis were shown to be more appropriate than those from the at-site frequency analysis. In the second part of this study, the LH-moment ratio diagram and the Kolmogorov-Smirnov test were applied to the Gumbel (GUM), Generalized Extreme Value (GEV), Generalized Logistic (GLO) and Generalized Pareto (GPA) distributions to find the optimal probability distribution. Design drought rainfalls were estimated by both at-site and regional frequency analysis using LH-moments and the GEV distribution, which was confirmed as the optimal one among the applied distributions. Design rainfalls were estimated by at-site and regional frequency analysis using LH-moments with both the observed data and data simulated by Monte Carlo techniques. Design drought rainfalls derived by regional frequency analysis using the L1, L2, L3 and L4-moments (LH-moments) method showed higher reliability than those of the at-site frequency analysis in terms of RRMSE (Relative Root-Mean-Square Error), RBIAS (Relative Bias) and RR (Relative Reduction). Relative efficiency was calculated to judge the relative merits of the design drought rainfalls derived by regional frequency analysis using the L-moments and the L1, L2, L3 and L4-moments applied in the first and second reports of this study, respectively.
Consequently, design drought rainfalls derived by regional frequency analysis using L-moments were shown to be more reliable than those using LH-moments. Finally, design drought rainfalls for the five classified homogeneous regions and the various consecutive durations were derived by regional frequency analysis using L-moments, which this study confirmed as the more reliable method. Maps of the design drought rainfalls for the five homogeneous regions and the various consecutive durations were produced using inverse distance weighting and Arc-View, a GIS tool.
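
The inverse distance weighting used for the maps can be sketched as follows. This is a minimal illustration, assuming planar coordinates and a power parameter of 2 (a common default, not a value stated in the abstract); the station coordinates and rainfall values are made up, and the `idw` function is hypothetical rather than the Arc-View routine.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) stations."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            return value  # the target coincides with a station
        w = 1.0 / d ** power  # closer stations get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical rain gauges: (x, y, design drought rainfall in mm).
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(gauges, (1.0, 0.0))
print(midpoint_estimate)
```

A map is obtained by evaluating the same function at every grid cell; a larger `power` makes the surface follow the nearest gauge more closely.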