Search | Korea Science

Mining Clusters of Sequence Data using Sequence Element-based Similarity Measure (시퀀스 요소 기반의 유사도를 이용한 시퀀스 데이터 클러스터링)

오승준;김재련
- Proceedings of the Korea Intelligent Information System Society Conference
- /
- 2004.11a
- /
- pp.221-229
- /
- 2004
Recently, there has been enormous growth in the amount of commercial and scientific data, such as protein sequences, retail transactions, and web logs. Such datasets consist of sequence data that have an inherent sequential nature. However, only a few of the existing clustering algorithms consider sequentiality. This study presents a method for clustering such sequence datasets. The similarity between sequences must be determined before the sequences can be clustered. This study proposes a new similarity measure that computes the similarity between two sequences using sequence elements. Two clustering algorithms based on this measure are proposed: a hierarchical clustering algorithm and a scalable clustering algorithm that uses sampling and a k-nearest neighbor method. Using a splice dataset and synthetic datasets, we show that the quality of clusters generated by the proposed algorithms is better than that of clusters produced by traditional clustering algorithms.
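As a rough illustration of the hierarchical variant described above, the following Python sketch clusters sequences by pairwise similarity. The abstract does not define the proposed measure, so the `similarity` function here is a stand-in (Jaccard similarity over adjacent element pairs), not the authors' measure; the average-linkage merge rule, the function names, and the sample sequences are likewise assumptions made for illustration.

```python
from itertools import combinations


def bigram_set(seq):
    """Return the set of adjacent element pairs, which keeps local order."""
    return {(a, b) for a, b in zip(seq, seq[1:])}


def similarity(s, t):
    """Stand-in measure: Jaccard similarity of the adjacent-pair sets."""
    a, b = bigram_set(s), bigram_set(t)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hierarchical_clusters(seqs, k):
    """Average-linkage agglomerative clustering; merge until k clusters remain."""
    clusters = [[i] for i in range(len(seqs))]

    def linkage(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(similarity(seqs[i], seqs[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > k:
        # Merge the two clusters with the highest average similarity.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return [sorted(c) for c in clusters]


# Hypothetical toy sequences, not taken from the splice dataset.
sequences = ["ACGTAC", "ACGTTC", "GGCCGG", "GGCCGA"]
print(hierarchical_clusters(sequences, 2))
```

Any order-aware similarity can be swapped in for `similarity` without changing the merge loop, which is the main reason the two pieces are kept separate in this sketch.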

Advance Neuro-Fuzzy Modeling Using a New Clustering Algorithm (새로운 클러스터링 알고리듬을 적용한 향상된 뉴로-퍼지 모델링)

김승석;김성수;유정웅
- The Transactions of the Korean Institute of Electrical Engineers D
- /
- v.53 no.7
- /
- pp.536-543
- /
- 2004
In this paper, we propose a new method of modeling a neuro-fuzzy system using a hybrid clustering algorithm. The initial parameters and the number of clusters are chosen optimally and simultaneously with respect to the regression process, which is a unique characteristic of the proposed system. The proposed algorithm improves the overall performance of the neuro-fuzzy system by adaptively choosing a proper number of clusters according to the characteristics of the given data. Clustering is performed by deciding on the number of classes, which yields the convergence property of the system. Experiments demonstrate the superiority of the proposed neuro-fuzzy system, especially in the parameter optimization process and in learning speed.

The Clustering Method Of Central Control System In New Distribution Automation System (배전자동화시스템 중앙제어장치 이중화 적용방안)

Cho, Nam-Hun;Ha, Bok-Nam;Lee, Jung-Ho;Lim, Seong-Il
- Proceedings of the KIEE Conference
- /
- 1999.07c
- /
- pp.1120-1122
- /
- 1999
This paper introduces clustering for the Central Control System in the New Distribution Automation System. Clustering has three primary benefits: improved availability, easier manageability, and more cost-effective scalability. Availability: Clustering can automatically detect the failure of an application or server and quickly restart it on a surviving server. Clients experience only a momentary pause in service. Manageability: Clustering lets administrators quickly inspect the status of all cluster resources and easily move workloads to different servers within a cluster. Scalability: Applications can use the clustering services through the MSCS Application Programming Interface (API) to perform dynamic load balancing and scale across multiple servers within a cluster.

An Survey on the Power System Modeling using a Clustering Algorithm (클러스터링 기법을 적용한 전력시스템 모델링에 관한 사례 조사)

Park, Young-Soo;Kim, Jin-Ho
- Proceedings of the KIEE Conference
- /
- 2006.07a
- /
- pp.410-411
- /
- 2006
This paper surveys power system modeling using clustering algorithms. In electricity markets, clustering is an efficient tool for modeling the power system. Electricity markets can also be classified into several groups that show similar patterns, and the fundamental characteristics of power systems are widely applicable to other technical problems in power systems, such as generation scheduling, power flow analysis, and short-term load forecasting. Several studies have modeled power systems using a clustering algorithm. We survey in particular the clustering methods that these studies use to model the power system.

A Study on the Robust Content-Based Musical Genre Classification System Using Multi-Feature Clustering (Multi-Feature Clustering을 이용한 강인한 내용 기반 음악 장르 분류 시스템에 관한 연구)

Yoon Won-Jung;Lee Kang-Kyu;Park Kyu-Sik
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.42 no.3 s.303
- /
- pp.115-120
- /
- 2005
In this paper, we propose a new robust content-based musical genre classification algorithm using a multi-feature clustering (MFC) method. In contrast to previous works, this paper focuses on two practical issues: the system's dependency on different input query patterns (or portions) and on input query lengths, both of which cause serious uncertainty in system performance. To solve these problems, a new approach called multi-feature clustering (MFC), based on k-means clustering, is proposed. To verify the performance of the proposed method, several excerpts of variable duration were extracted from every other position in a queried music file. The effectiveness of the system with and without MFC is compared in terms of classification accuracy. The results demonstrate that MFC significantly improves the stability of musical genre classification and yields a higher accuracy rate.

Clustering Algorithm for Sequences of Categorical Values (범주형 값들이 순서를 가지고 있는 데이터들의 클러스터링 기법)

Oh Seung Joon;Kim Jae Yearn
- Proceedings of the Society of Korea Industrial and System Engineering Conference
- /
- 2002.05a
- /
- pp.125-132
- /
- 2002
We study clustering algorithms for sequences of categorical values. Clustering is a data mining problem that has received significant attention from the database community. Traditional clustering algorithms deal with numerical or categorical data points. However, many important databases store sequences of categorical data. In this paper, we introduce a new similarity measure and develop a hierarchical clustering algorithm. An experimental section shows the performance of the proposed approach.

The transmission Network clustering using a fuzzy entropy function (퍼지 엔트로피 함수를 이용한 송전 네트워크 클러스터링)

Jang, Se-Hwan;Kim, Jin-Ho;Lee, Sang-Hyuk;Park, Jun-Ho
- Proceedings of the KIEE Conference
- /
- 2006.11a
- /
- pp.225-227
- /
- 2006
Transmission network clustering using a fuzzy entropy function is proposed in this paper. A similarity measure can be defined through a fuzzy entropy. Every node in the transmission network system has its own values indicating the physical characteristics of that system, and the similarity measure in this paper is defined through the system-wide characteristic values at each node. However, to tackle the geometric mis-clustering problem, that is, to avoid clustering geometrically distant locations that have similar measures, locational information is incorporated into the proposed similarity measure. A new regional clustering measure for the transmission network system is proposed and proved. The proposed measure is verified on the IEEE 39-bus system.

Comparisons on Clustering Methods: Use of LMS Log Variables on Academic Courses

Jo, Il-Hyun;PARK, Yeonjeong;SONG, Jongwoo
- Educational Technology International
- /
- v.18 no.2
- /
- pp.159-191
- /
- 2017
Academic analytics guides university decision-makers in assigning limited resources more effectively. In particular, clustering diverse academic courses by their usage patterns and levels on a Learning Management System (LMS) helps in understanding instructors' pedagogical approaches and the level of technology integration. Further, the clustering results can help decide the proper range and level of financial and technical support. However, among the diverse analytic methodologies available, clustering analysis methods often provide different results. The purpose of this study is to present implications from using three different clustering methods: Gaussian Mixture Model, K-Means clustering, and Hierarchical clustering. As a case, we clustered academic courses in higher education based on their LMS usage levels and patterns using those three techniques. In this study, 2,639 courses opened during the fall 2013 semester at a large private university in South Korea were analyzed with 13 observation variables that represent the characteristics of academic courses. The results show the strengths and weaknesses of each clustering method and suggest that academic leaders and university staff should examine LMS usage levels and patterns more closely and take an integrated approach with different analytic methods in their strategic decisions on LMS development.
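One simple way to quantify how much two clustering methods disagree on the same items is the Rand index, the fraction of item pairs that both labelings treat the same way (together in both, or apart in both). The sketch below is offered only as an illustration of that idea; the abstract does not say which agreement measure, if any, the study used, and the label lists are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        # A pair agrees when both labelings group it, or both separate it.
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical cluster labels for six courses from two different methods.
kmeans_labels = [0, 0, 1, 1, 2, 2]
hier_labels = [1, 1, 0, 0, 0, 2]
print(rand_index(kmeans_labels, hier_labels))
```

Because the index compares pairs rather than label values, it is unaffected by how each method happens to number its clusters.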
https://doi.org/10.23095/ETI.2017.18.2.159

A Study of optimized clustering method based on SOM for CRM

Jong T. Rhee;Lee, Joon.
- Proceedings of the Korea Intelligent Information System Society Conference
- /
- 2001.01a
- /
- pp.464-469
- /
- 2001
Customer Relationship Management (CRM) is an advanced marketing support system that analyzes customers' transaction data and classifies or targets customer groups to effectively increase market share and profit. Many engines have been developed to implement this function, and those for classification and clustering are considered the core ones. In this study, an improved clustering method based on Self-Organizing Maps (SOM) is proposed. The proposed method finds the optimal number of clusters so that the effectiveness of clustering is increased. It considers all the data types existing in CRM data warehouses. In particular, an adaptive algorithm applies the concepts of degeneration and fusion to find the optimal number of clusters. The feasibility and efficiency of the proposed method are demonstrated through simulation with simplified customer data.

Hybrid Simulated Annealing for Data Clustering (데이터 클러스터링을 위한 혼합 시뮬레이티드 어닐링)

Kim, Sung-Soo;Baek, Jun-Young;Kang, Beom-Soo
- Journal of Korean Society of Industrial and Systems Engineering
- /
- v.40 no.2
- /
- pp.92-98
- /
- 2017
Data clustering determines groups of patterns in a dataset using a similarity measure and is one of the most important and difficult techniques in data mining. Clustering can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm, which is popular and efficient, is sensitive to initialization and can become stuck in a local optimum because it is a hill-climbing clustering method. It is also not computationally feasible in practice, especially for large datasets and large numbers of clusters. Therefore, a robust and efficient clustering algorithm is needed to find the global optimum (not a local optimum), especially now that much data is collected from many IoT (Internet of Things) devices. The objective of this paper is to propose a new Hybrid Simulated Annealing (HSA) method that combines simulated annealing with K-means for non-hierarchical clustering of big data. Simulated annealing (SA) is useful for diversified search in a large search space, and K-means is useful for converged search in a predetermined search space. The proposed method can balance intensification and diversification to find the global optimal solution in big data clustering. The performance of HSA is validated by experiment and analysis on the Iris, Wine, Glass, and Vowel datasets from the UCI machine learning repository, in comparison with previous studies. The proposed KSAK (K-means+SA+K-means) and SAK (SA+K-means) perform better than KSA (K-means+SA), SA, and K-means in our simulations. Our method significantly improves accuracy and efficiency in finding the global optimal data clustering solution for complex, real-time, and costly data mining processes.
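The SAK ordering named above (simulated annealing first, then K-means) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the paper's HSA: the abstract does not give the neighborhood move, cooling schedule, or objective, so the single-point relabeling move, the geometric cooling with `t0` and `cooling`, the sum-of-squared-errors objective, and the toy 2-D points are all hypothetical choices.

```python
import math
import random


def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        total += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
    return total


def anneal(points, k, rng, steps=2000, t0=1.0, cooling=0.995):
    """Diversified search: relabel one random point per step (assumed move)."""
    labels = [rng.randrange(k) for _ in points]
    cost = sse(points, labels, k)
    best, best_cost = labels[:], cost
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(points))
        old = labels[i]
        labels[i] = rng.randrange(k)
        new_cost = sse(points, labels, k)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = labels[:], cost
        else:
            labels[i] = old
        t *= cooling
    return best


def kmeans_refine(points, labels, k, max_iter=100):
    """Converged search: standard Lloyd iterations from the given labels."""
    labels = labels[:]
    for _ in range(max_iter):
        centroids = {}
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # empty clusters simply get no centroid
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
        new = [
            min(centroids,
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        if new == labels:
            break
        labels = new
    return labels


# Hypothetical toy data: three small, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (0, 10), (1, 10), (0, 11)]
rng = random.Random(0)
sa_labels = anneal(points, 3, rng)
final_labels = kmeans_refine(points, sa_labels, 3)
print(final_labels, sse(points, final_labels, 3))
```

In this arrangement the annealing stage supplies the starting partition and the K-means stage only polishes it, so the refinement step never increases the objective that annealing reached.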
https://doi.org/10.11627/jkise.2017.40.2.092
