• Title/Summary/Keyword: k-means clustering algorithm

Search Result 191, Processing Time 0.027 seconds

A Resource Clustering Method Considering Weight of Application Characteristic in Hybrid Cloud Environment (하이브리드 클라우드 환경에서의 응용 특성 가중치를 고려한 자원 군집화 기법)

  • Oh, Yoori;Kim, Yoonhee
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.481-486 / 2017
  • Many scientists want to perform experiments in a cloud environment, and pay-per-use services allow them to pay only for the cloud services they need. However, it is difficult for scientists to select a suitable set of resources because those resources have various characteristics. Classification is therefore needed to support the effective utilization of cloud resources, and a dynamic resource clustering method is needed to reflect the characteristics of the application that scientists want to execute. This paper proposes a resource clustering analysis method that takes into account the characteristics of an application in a hybrid cloud environment. The resource clustering analysis applies a Self-Organizing Map and the K-means algorithm to dynamically cluster similar resources. The experimental results indicate that the proposed method can classify similar resources into clusters by reflecting the application characteristics.

Context-awareness Clustering with Adaptive Learning Algorithm (상황인식 기반 클러스터링의 적응적 자율 학습 분할 알고리즘)

  • Jeon, Il-Kyu;Lee, Kang-whan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.612-614 / 2022
  • This paper proposes a clustering algorithm for mobile nodes that enables more efficient clustering by using context-aware attribute information in adaptive learning. Typically, data are provided to classify the interrelationships among cluster properties. When new properties appear, however, comparative clustering may treat them as contaminated information. To solve this problem, we present a new context-awareness learning-based model that can analyze the clustering attribute parameters from the node properties using accumulated information properties.


K-means clustering analysis and differential protection policy according to 3D NAND flash memory error rate to improve SSD reliability

  • Son, Seung-Woo;Kim, Jae-Ho
    • Journal of the Korea Society of Computer and Information / v.26 no.11 / pp.1-9 / 2021
  • 3D NAND flash memory provides high capacity per unit area by stacking 2D NAND cells, which have a planar structure. However, owing to the nature of the stacking process, the frequency of errors may vary with the layer or physical cell location. This phenomenon becomes more pronounced as the number of write/erase (P/E) operations on the flash memory increases. Most flash-based storage devices such as SSDs use ECC for error correction. Because this method provides a fixed strength of data protection for all flash memory pages, it has limitations in 3D NAND flash memory, where the error rate varies with physical location. Therefore, in this paper, pages and layers with different error rates are classified into clusters through the K-means machine learning algorithm, and a differentiated data protection strength is applied to each cluster. We classify pages and layers based on the number of errors measured after an endurance test, where the error rate varies significantly for each page and layer, and add parity data to stripes for areas vulnerable to errors to provide differentiated data protection strength. We show that this differentiated data protection policy may contribute to improving the reliability and lifespan of 3D NAND flash memory compared with protection techniques that use RAID-like schemes or ECC alone.

Implementation of App System for Personalized Health Information Recommendation (사용자 맞춤형 건강정보 추천 앱 구현)

  • Park, Seong-min;Park, Jeong-soo;Lee, Yoon-kyu;Chae, Woo-Joon;Shin, Moon-sun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.316-318 / 2019
  • Recently, healthy living has become an issue in an aging society, and the number of people interested in continuous health care for a better life is increasing. In this paper, we implemented a personalized recommendation system to provide convenient healthcare management for users. A user's PHR (Personal Health Record) can be stored on the server along with health-related information such as lifestyle, disease, and physical condition. Users can be classified into clusters of similar users according to their PHR profiles in order to provide healthcare content to users with similar profiles. K-Means clustering was applied to generate clusters based on the PHR profile, and the ACDT (Ant Colony Decision Tree) algorithm was used to provide personalized recommendations of the health information stored in the knowledge base. The app system developed in this paper helps users manage their own healthcare by providing information on serious diseases and on lifestyle habits to be improved, according to the clusters classified by PHR profile.


Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles (준 실시간 뉴스 이슈 분석을 위한 계층적·점증적 군집화)

  • Kim, Hoyong;Lee, SeungWoo;Jang, Hong-Jun;Seo, DongMin
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.556-578 / 2020
  • There are many studies on how to analyze issues based on real-time news streams. However, few studies analyze issues hierarchically from news articles, and even the previous research on hierarchical issue analysis suffers from slower clustering as the number of news articles increases. In this paper, we propose hierarchical and incremental clustering for semi real-time issue analysis on news articles. We trained a Siamese neural network-based weighted cosine similarity model, applied this model to the k-means algorithm, which is used to form word clusters, and converted news articles to document vectors by using these word clusters. Finally, we initialized an issue cluster tree from the document vectors, updated this tree whenever new news articles arrived, and analyzed issues in semi real-time. Through experiments and evaluation, we showed an improvement of up to about 0.26 in terms of NMI. In terms of incremental clustering speed, we also showed that the method is about 10 times faster than before.

Optimal Arrangement of Patrol Ships based on k-Means Clustering for Quick Response of Marine Accidents (해양사고 신속대응을 위한 k-평균 군집화 기반 경비함정 최적배치)

  • Yoo, Sang-Lok;Jung, Cho-Young
    • Journal of the Korean Society of Marine Environment & Safety / v.23 no.7 / pp.775-782 / 2017
  • The position of existing patrol ships has been decided according to subjective judgments, not purely by any reasonable or scientific criteria, because of a lack of access to marine accident positions. In this study, the optimal location of patrol ships is quantitatively determined based on historical marine accident data. The study area used included the coastal sea of Pohang in South Korea. In this study, a k-means clustering algorithm was used to derive the location of patrol ships, and then a Voronoi diagram was used to divide the region around each patrol ship. As a result, the average navigation distance for patrol ships was improved by 4.4 nautical miles, and the average arrival time was improved by 13.2 minutes per marine accident. Moreover, if the locations of patrol ships need to be changed flexibly, it will be possible to optimally arrange limited resources using the technique developed in this study to ensure a fast rescue.
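
The two steps named in this abstract, k-means to place the ships and a Voronoi diagram to divide the region, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the accident coordinates are made up, distances are treated as planar Euclidean distances, and the function names `kmeans` and `nearest_station` are hypothetical. Assigning a location to its nearest centroid is the same as finding which Voronoi cell contains it.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    # Deterministic initialization (an assumption): k points spread evenly through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def nearest_station(location, stations):
    """Index of the closest station, i.e. the Voronoi cell containing `location`."""
    return min(range(len(stations)), key=lambda i: math.dist(location, stations[i]))

# Hypothetical accident positions (x, y) in nautical miles on a local planar grid.
accidents = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
             (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
stations, labels = kmeans(accidents, k=2)
print(stations, labels)
print(nearest_station((7.0, 9.0), stations))
```

With real data one would likely use great-circle distances and several random initializations, since Lloyd's algorithm only finds a local optimum.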

A Study on the Clustering Method of Row and Multiplex Housing in Seoul Using K-Means Clustering Algorithm and Hedonic Model (K-Means Clustering 알고리즘과 헤도닉 모형을 활용한 서울시 연립·다세대 군집분류 방법에 관한 연구)

  • Kwon, Soonjae;Kim, Seonghyeon;Tak, Onsik;Jeong, Hyeonhee
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.95-118 / 2017
  • Recently, transactions in row housing and multiplex housing have become active, mainly in downtown areas, and platform services such as Zigbang and Dabang are growing. Row housing and multiplex housing are a blind spot for real estate information, which has become a social problem owing to changes in market size and to information asymmetry caused by changes in demand. In addition, the 5 or 25 districts used by the Seoul Metropolitan Government or the Korean Appraisal Board (hereafter, KAB) were established within administrative boundaries and have been used in existing real estate studies. Because they were zoned for urban planning, they are not a district classification suited to real estate research. Based on existing studies, this study found that the spatial structure of Seoul needs to be reset in order to estimate future housing prices. This study therefore attempted to classify areas without spatial heterogeneity by reflecting the property price characteristics of row housing and multiplex housing. In other words, simple division by the existing administrative districts has led to inefficiency. This study thus aims to cluster Seoul into new areas for more efficient real estate analysis. The hedonic model was applied to real transaction price data for row housing and multiplex housing, and the K-Means Clustering algorithm was used to cluster the spatial structure of Seoul. The study used real transaction price data for row housing and multiplex housing in Seoul from January 2014 to December 2016 and the official land values of 2016, provided by the Ministry of Land, Infrastructure and Transport (hereafter, MOLIT). Data preprocessing consisted of the following steps: removal of underground transactions, price standardization per area, and removal of real transaction cases (above 5 and below -5).
Through data preprocessing, the data were reduced from 132,707 cases to 126,759 cases. The R program was used as the data analysis tool. After data preprocessing, the data model was constructed. First, K-means clustering was performed. In addition, a regression analysis was conducted using the hedonic model, and a cosine similarity analysis was conducted. Based on the constructed data model, we clustered on the basis of longitude and latitude within Seoul and conducted a comparative analysis with the existing areas. The results indicated that the goodness of fit of the model was above 75% and that the variables used for the hedonic model were significant. In other words, the existing 5 or 25 administrative districts were re-divided into 16 districts. This study thus derived a clustering method for row housing and multiplex housing in Seoul using the K-Means Clustering algorithm and the hedonic model by reflecting property price characteristics. Moreover, it presents academic and practical implications, its limitations, and the direction of future research. One academic implication is that areas were clustered by reflecting property price characteristics in order to improve on the areas used by the Seoul Metropolitan Government, KAB, and existing real estate research. Another is that, whereas existing real estate research has focused mainly on apartments, this study proposes a method of classifying areas in Seoul using public information of Government 3.0 (i.e., real data of MOLIT). One practical implication is that the results can be used as basic data for real estate research on row housing and multiplex housing. Another is that the study is expected to stimulate research on row housing and multiplex housing and to increase the accuracy of models of actual transactions.
Future research should conduct various analyses to overcome the limitations of the threshold, and deeper research is needed.

A Study on Research Paper Classification Using Keyword Clustering (키워드 군집화를 이용한 연구 논문 분류에 관한 연구)

  • Lee, Yun-Soo;Pheaktra, They;Lee, JongHyuk;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering / v.7 no.12 / pp.477-484 / 2018
  • Due to the advancement of computer and information technologies, numerous papers have been published. As new research fields continue to be created, users have a lot of trouble finding and categorizing papers of interest. To alleviate this difficulty, this paper presents a method of grouping similar papers and clustering them. The presented method extracts primary keywords from the abstract of each paper by using TF-IDF. Applying the K-means clustering algorithm to the extracted TF-IDF values, our method clusters papers with similar content together. To demonstrate the practicality of the proposed method, we use paper data from the FGCS journal as actual data. Based on these data, we derive the number of clusters using the Elbow scheme and evaluate clustering performance using the Silhouette scheme.
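
Two of the building blocks named in this abstract, TF-IDF keyword weighting and the silhouette measure, can be sketched as follows. This is an illustration under stated assumptions and not the authors' code: TF-IDF is taken as tf = count / document length and idf = log(N / df) (other variants exist), the silhouette uses Euclidean distance, the four "abstracts" are invented, and the function names are hypothetical. The elbow step for choosing the number of clusters is not shown.

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF per document, with tf = count / doc length and idf = log(N / df)."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    return [{t: (c / len(tokens)) * math.log(n / df[t])
             for t, c in Counter(tokens).items()}
            for tokens in tokenized]

def top_keywords(weight, m=2):
    """The m highest-weighted terms of one document (ties broken alphabetically)."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:m]]

def silhouette(vectors, labels):
    """Mean silhouette coefficient; assumes at least two distinct labels."""
    def mean_dist(i, cluster):
        others = [j for j, lab in enumerate(labels) if lab == cluster and j != i]
        if not others:
            return 0.0
        return sum(math.dist(vectors[i], vectors[j]) for j in others) / len(others)

    scores = []
    for i, lab in enumerate(labels):
        if labels.count(lab) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = mean_dist(i, lab)                                   # cohesion
        b = min(mean_dist(i, o) for o in set(labels) if o != lab)  # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Invented toy abstracts, for illustration only.
docs = ["cloud cloud cloud resource scheduling",
        "cloud storage resource",
        "protein folding simulation",
        "protein structure folding"]
weights = tfidf(docs)
vocab = sorted({t for w in weights for t in w})
vectors = [[w.get(t, 0.0) for t in vocab] for w in weights]

print(top_keywords(weights[0], 1))
good = silhouette(vectors, [0, 0, 1, 1])  # topically coherent grouping
bad = silhouette(vectors, [0, 1, 0, 1])   # mixed grouping
print(good, bad)
```

A higher silhouette value indicates that documents sit closer to their own cluster than to the nearest other cluster, which is why the coherent grouping scores above the mixed one here.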

Recommand Movie Based on Scenario in Movie Characters' Social Networks (영화 등장인물의 사회관계망에서 시나리오를 기반으로 하는 영화 추천 기법)

  • Heo, Joo-Seong;Kim, Tae-Hyeong;Seo, Jang-Won;Lee, Ye-Young;Han, Youn-Hee
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1134-1137 / 2015
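  • Addressing the question "How can movies be recommended based on movie scenarios?", this paper computed a three-dimensional data set from three traditional social network analysis indices, namely the average path length, the average clustering coefficient, and the density of the graph, and then used the k-means clustering algorithm on this data set to recommend movies for each value of k. The resulting recommendations differed from those of other typical recommendation methods.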
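
The three indices that form each movie's data point can be computed as in the sketch below. It assumes a connected, undirected, unweighted character network and uses the standard textbook definitions, which may differ from the definitions used in the paper. The toy network and the function name `graph_features` are hypothetical.

```python
from collections import deque
from itertools import combinations

def graph_features(adj):
    """Return (average path length, average clustering coefficient, density)
    for a connected, undirected graph given as {node: set_of_neighbors}."""
    nodes = list(adj)
    n = len(nodes)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    density = 2 * edges / (n * (n - 1))

    # Average shortest-path length via breadth-first search from every node.
    total = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    avg_path = total / (n * (n - 1))

    # Local clustering coefficient: fraction of a node's neighbor pairs that are linked.
    coeffs = []
    for u in nodes:
        k = len(adj[u])
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(adj[u], 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    avg_clustering = sum(coeffs) / n
    return avg_path, avg_clustering, density

# Hypothetical character network: A, B, C all interact; D interacts only with C.
network = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
avg_path, avg_clustering, density = graph_features(network)
print(avg_path, avg_clustering, density)
```

Each movie would contribute one such triple, and the triples would then be passed to k-means as points in three-dimensional space.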

Performance Evaluation of Nonhomogeneity Detector According to Various Normalization Methods in Nonhomogeneous Clutter Environment (불균일한 클러터 환경 안에서 Nonhomogeneity Detector의 다양한 정규화 방법에 따른 성능 평가)

  • Ryu, Jang-Hee;Jeong, Ji-Chai
    • Journal of the Institute of Convergence Signal Processing / v.10 no.1 / pp.72-79 / 2009
  • This paper describes the performance evaluation of an NHD (nonhomogeneity detector) for STAP (space-time adaptive processing) airborne radar according to various normalization methods in a nonhomogeneous clutter environment. In practice, the clutter can be characterized as a randomly varying signal, because it sometimes includes signals of very large magnitude, such as impulsive signals, due to the system environment. The received interference signals are composed of homogeneous and nonhomogeneous data. In this situation, an NHD is needed to maintain STAP performance. Normalization using the NHD result is an effective method for removing the nonhomogeneous data. Optimum normalization can be performed with a representative value that reflects the characteristics of the given data, so we propose using the K-means clustering algorithm. The characteristics of randomly varying data due to nonhomogeneous clutter can be captured by the number of clusters, and the representative value for selecting the homogeneous data is then determined from the clustering result. To reflect the characteristics of the nonstationary interference data, we also investigate an algorithm for calculating the proper number of clusters. Through simulations, we verified that the K-means clustering algorithm yields better normalization and target detection performance than the previously introduced normalization methods.
